138 research outputs found
swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture
The proliferation of deep learning frameworks and hardware platforms demands
an efficient compiler that can hide the diversity in both software
and hardware in order to provide application portability. Among the existing
deep learning compilers, TVM is well known for its efficiency in code
generation and optimization across diverse hardware devices. Meanwhile,
the Sunway many-core processor has emerged as a competitive candidate owing to
its attractive computational power in both scientific and deep learning
applications. This paper combines the trends in these two directions.
Specifically, we propose swTVM, which extends the original TVM to support
ahead-of-time compilation for architectures that require cross-compilation, such as
Sunway. In addition, we leverage architecture features during
compilation, such as core groups for massive parallelism, DMA for high-bandwidth
memory transfer, and local device memory for data locality, in order to generate
efficient code for deep learning applications on Sunway. The experimental
results show the ability of swTVM to automatically generate code for various
deep neural network models on Sunway. The code automatically generated by swTVM
for AlexNet and VGG-19 achieves average speedups of 6.71x and 2.45x
over hand-optimized OpenACC implementations on convolution and fully
connected layers, respectively. This work is the first attempt from the compiler
perspective to bridge the gap between deep learning and high-performance
architectures, particularly with productivity and efficiency in mind. We would
like to open source the implementation so that more people can embrace the
power of deep learning compilers and the Sunway many-core processor.
Intelligent-Unrolling: Exploiting Regular Patterns in Irregular Applications
Modern optimizing compilers are able to exploit memory access or computation
patterns to generate vectorized code. However, such patterns in irregular
applications are unknown until runtime because they depend on the input. Thus,
neither the compiler's static optimization nor profile-guided optimization based on
specific inputs can predict the patterns for arbitrary inputs, which leads
to suboptimal code generation. To address this challenge, we develop
Intelligent-Unroll, a framework to automatically optimize irregular
applications with vectorization. Intelligent-Unroll allows users to describe
the computation task using a code seed, with the memory access and
computation patterns represented in a feature table and an
information-code tree, and generates highly efficient code.
Furthermore, Intelligent-Unroll employs several novel optimization techniques
to optimize reduction operations and gather/scatter instructions. We evaluate
Intelligent-Unroll with sparse matrix-vector multiplication (SpMV) and graph
applications. Experimental results show that Intelligent-Unroll is able to
generate more efficient vectorized code than the state-of-the-art
implementations.
Visualizing the proteome of Escherichia coli: an efficient and versatile method for labeling chromosomal coding DNA sequences (CDSs) with fluorescent protein genes
To investigate the feasibility of conducting a genomic-scale protein labeling and localization study in Escherichia coli, a representative subset of 23 coding DNA sequences (CDSs) was selected for chromosomal tagging with one or more fluorescent protein genes (EGFP, EYFP, mRFP1, DsRed2). We used λ-Red recombination to precisely and efficiently position PCR-generated DNA targeting cassettes containing a fluorescent protein gene and an antibiotic resistance marker, at the C-termini of the CDSs of interest, creating in-frame fusions under the control of their native promoters. We incorporated cre/loxP and flpe/frt technology to enable multiple rounds of chromosomal tagging events to be performed sequentially with minimal disruption to the target locus, thus allowing sets of proteins to be co-localized within the cell. The visualization of labeled proteins in live E. coli cells using fluorescence microscopy revealed a striking variety of distributions including: membrane and nucleoid association, polar foci and diffuse cytoplasmic localization. Fifty of the fifty-two independent targeting experiments performed were successful, and 21 of the 23 selected CDSs could be fluorescently visualized. Our results show that E. coli has an organized and dynamic proteome, and demonstrate that this approach is applicable for tagging and (co-) localizing CDSs on a genome-wide scale.
An animal model of SARS produced by infection of Macaca mulatta with SARS coronavirus.
A new SARS animal model was established by inoculating SARS coronavirus (SARS-CoV) into rhesus macaques (Macaca mulatta) through the nasal cavity. Pathological pulmonary changes were successively detected on days 5-60 after virus inoculation. All eight animals showed a transient fever 2-3 days after inoculation. Immunological, molecular biological, and pathological studies support the establishment of this SARS animal model. Firstly, SARS-CoV-specific IgGs were detected in the sera of macaques from 11 to 60 days after inoculation. Secondly, SARS-CoV RNA could be detected in pharyngeal swab samples using nested RT-PCR in all infected animals from 5 days after virus inoculation. Finally, histopathological changes of interstitial pneumonia were found in the lungs during the 60 days after viral inoculation: these changes were less marked at later time points, indicating that an active healing process together with resolution of an acute inflammatory response was taking place in these animals. This animal model should provide insight into the mechanisms of SARS-CoV-related pulmonary disease and greatly facilitate the development of vaccines and therapeutics against SARS.
Visual characterization of associative quasitrivial nondecreasing operations on finite chains
In this paper we provide a visual characterization of associative quasitrivial
nondecreasing operations on finite chains. We also provide a characterization
of bisymmetric quasitrivial nondecreasing binary operations on finite chains.
Finally, we estimate the number of functions belonging to the previous classes.
Comment: 25 pages, 18 figures
Prognostic value of the FUT family in acute myeloid leukemia
In recent years, genetic abnormalities have increasingly been viewed as prognostic markers in acute myeloid leukemia (AML). Fucosylation, catalyzed by fucosyltransferases (FUTs), is a post-translational modification that widely exists in cancer cells. However, the expression and clinical implications of the FUT family (FUT1-11) in AML have not been investigated. From The Cancer Genome Atlas database, a total of 155 AML patients with complete clinical characteristics and FUT1-11 expression data were included in our study. In patients who received chemotherapy alone, high expression levels of FUT3, FUT6, and FUT7 had adverse effects on event-free survival (EFS) and overall survival (OS) (all P < 0.05), whereas high FUT4 expression had favorable effects on EFS and OS (all P < 0.01). However, in the allogeneic hematopoietic stem cell transplantation (allo-HSCT) group, we only found a significant difference in EFS between the high and low FUT3 expression subgroups (P = 0.047), while other FUT members had no effect on survival. Multivariate analysis confirmed that high FUT4 expression was an independent favorable prognostic factor for both EFS (HR = 0.423, P = 0.001) and OS (HR = 0.398, P < 0.001), whereas high FUT6 expression was an independent risk factor for both EFS (HR = 1.871, P = 0.017) and OS (HR = 1.729, P = 0.028) in patients who received chemotherapy alone. Moreover, we found that patients with low FUT4 and high FUT6 expression had the shortest EFS and OS (P < 0.05). Our study suggests that high expression of FUT3/6/7 predicts poor prognosis and high FUT4 expression indicates good prognosis in AML; FUT6 and FUT4 have the best prognosticating profile among them, but their effects could be neutralized by allo-HSCT.
Integrated analysis of single-cell RNA-seq and bulk RNA-seq reveals RNA N6-methyladenosine modification associated with prognosis and drug resistance in acute myeloid leukemia
Introduction: Acute myeloid leukemia (AML) is a type of blood cancer that is identified by the unrestricted growth of immature myeloid cells within the bone marrow. Despite therapeutic advances, AML prognosis remains highly variable, and there is a lack of biomarkers for customizing treatment. RNA N6-methyladenosine (m6A) modification is a reversible and dynamic process that plays a critical role in cancer progression and drug resistance. Methods: To investigate the m6A modification patterns in AML and their potential clinical significance, we used the AUCell method to describe the m6A modification activity of cells in AML patients based on 23 m6A modification enzymes and further integrated with bulk RNA-seq data. Results: We found that m6A modification was more effective in leukemic cells than in immune cells and induced significant changes in gene expression in leukemic cells rather than immune cells. Furthermore, network analysis revealed a correlation between transcription factor activation and the m6A modification status in leukemia cells, while active m6A-modified immune cells exhibited a higher interaction density in their gene regulatory networks. Hierarchical clustering based on m6A-related genes identified three distinct AML subtypes. The immune dysregulation subtype, characterized by RUNX1 mutation and KMT2A copy number variation, was associated with a worse prognosis and exhibited a specific gene expression pattern with high expression level of IGF2BP3 and FMR1, and low expression level of ELAVL1 and YTHDF2. Notably, patients with the immune dysregulation subtype were sensitive to immunotherapy and chemotherapy. Discussion: Collectively, our findings suggest that m6A modification could be a potential therapeutic target for AML, and the identified subtypes could guide personalized therapy.