    Advancing transcriptome platforms

    Over the last decade, remarkable technological innovations have emerged that allow the direct or indirect determination of the transcriptome at unprecedented scale and speed. Studies using these methods have already altered our view of the extent and complexity of transcript profiling, which has advanced from a one-gene-at-a-time approach to a holistic view of the genome. Here, we outline the major technical advances in transcriptome characterization, including the most widely used hybridization-based platform, the well-accepted tag-based sequencing platform, and the recently developed RNA-Seq (RNA sequencing) based platform. Importantly, these next-generation technologies are revolutionizing assessment of the entire transcriptome through the recent RNA-Seq technology.

    Impact of Gene Annotation on RNA-seq Data Analysis

    RNA-seq has become increasingly popular in transcriptome profiling. One of the major challenges in RNA-seq data analysis is the accurate mapping of junction reads to their genomic origins. To detect splice sites in short reads, many RNA-seq aligners use a reference transcriptome to inform the placement of junction reads. However, no systematic evaluation has been performed to assess or quantify the benefits of incorporating a reference transcriptome when mapping RNA-seq reads. Meanwhile, there exist multiple human genome annotation databases, including RefGene (RefSeq Gene), Ensembl, and the UCSC annotation database. The impact of the choice of annotation on gene expression estimates remains insufficiently investigated.

    Optimization Techniques For Next-Generation Sequencing Data Analysis

    High-throughput RNA sequencing (RNA-Seq) is a popular, cost-efficient technology with many medical and biological applications. This technology, however, presents a number of computational challenges in reconstructing full-length transcripts and accurately estimating their abundances across all cell types. Our contributions include (1) transcript and gene expression level estimation methods, (2) methods for genome-guided and annotation-guided transcriptome reconstruction, and (3) de novo assembly and annotation of real data sets. Transcript expression level estimation, also referred to as transcriptome quantification, tackles the problem of estimating the expression level of each transcript. Transcriptome quantification analysis is crucial for determining similar transcripts and for unraveling gene functions and transcription regulation mechanisms. We propose a novel simulated regression based method for transcriptome frequency estimation from RNA-Seq reads. Transcriptome reconstruction refers to the problem of reconstructing the transcript sequences from the RNA-Seq data. We present genome-guided and annotation-guided transcriptome reconstruction methods. Empirical results on both synthetic and real RNA-seq datasets show that the proposed methods improve transcriptome quantification and reconstruction accuracy compared to current state-of-the-art methods. We further present the assembly and annotation of the Bugula neritina transcriptome (a marine colonial animal) and the Tallapoosa darter genome (a species-rich radiation freshwater fish).

    RNA-seq transcriptional profiling of peripheral blood leukocytes from cattle infected with Mycobacterium bovis

    Bovine tuberculosis, caused by infection with Mycobacterium bovis, is a major endemic disease affecting cattle populations worldwide, despite the implementation of stringent surveillance and control programs in many countries. The development of high-throughput functional genomics technologies, including gene expression microarrays and RNA-sequencing (RNA-seq), has enabled detailed analysis of the host transcriptome response to M. bovis infection, particularly at the macrophage and peripheral blood level. In the present study, we have analyzed the peripheral blood leukocyte (PBL) transcriptome of eight naturally M. bovis-infected and eight age- and sex-matched non-infected control Holstein-Friesian animals using RNA-seq. In addition, we compared gene expression profiles generated using RNA-seq with those previously generated using the high-density Affymetrix® GeneChip® Bovine Genome Array platform from the same PBL-extracted RNA. A total of 3,250 differentially expressed (DE) annotated genes were detected in the M. bovis-infected samples relative to the controls (adjusted P-value ≤ 0.05), with the number of genes displaying decreased relative expression (1,671) exceeding the number with increased relative expression (1,579). Ingenuity® Systems Pathway Analysis (IPA) of all DE genes revealed enrichment for genes with immune function. Notably, transcriptional suppression was observed among several of the top-ranking canonical pathways, including Leukocyte Extravasation Signaling. Comparative platform analysis demonstrated that RNA-seq detected a larger number of annotated DE genes (3,250) than the microarray (1,398), of which 917 genes were common to both technologies and displayed the same direction of expression. Finally, we show that RNA-seq had an increased dynamic range compared to the microarray for estimating differential gene expression.
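The platform comparison in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the variable names are illustrative, and the percentages are derived here rather than reported by the authors.

```python
# Counts quoted in the abstract (variable names are illustrative).
rnaseq_de = 3250              # DE genes detected by RNA-seq
downregulated = 1671          # decreased relative expression
upregulated = 1579            # increased relative expression
microarray_de = 1398          # DE genes detected by the microarray
shared_same_direction = 917   # common to both, same direction of change

# The two directional counts should add up to the RNA-seq total.
assert downregulated + upregulated == rnaseq_de

# Share of each platform's DE genes that the other platform confirms.
share_of_rnaseq = shared_same_direction / rnaseq_de
share_of_microarray = shared_same_direction / microarray_de

print(f"Shared genes as a share of RNA-seq DE genes:    {share_of_rnaseq:.1%}")
print(f"Shared genes as a share of microarray DE genes: {share_of_microarray:.1%}")
```

Under these counts, roughly two thirds of the microarray DE genes are recovered by RNA-seq with the same direction of change, while they make up under a third of the RNA-seq DE list.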

    High-throughput transcriptomics

    High-throughput transcriptomics has revolutionised the field of transcriptome research by offering a cost-effective and powerful screening tool. Standard bulk RNA sequencing (RNA-Seq) enables characterisation of the average expression profiles for individual samples and facilitates identification of the molecular functions associated with genes differentially expressed across conditions. RNA-Seq can also be applied to disentangle splicing variants and discover novel transcripts, thus contributing to a comprehensive understanding of the transcriptome landscape. A closely related technique, single-cell RNA-Seq, has enabled the study of cell-type-specific gene expression in hundreds to thousands of cells, aiding the exploration of cell heterogeneity. Nowadays, bulk RNA-Seq and single-cell RNA-Seq serve as complementary tools to advance and accelerate the development of transcriptome-based resources. This Collection illustrates how the current global research community makes use of these techniques to address a broad range of questions in the life sciences. It demonstrates the usefulness and popularity of high-throughput transcriptomics and presents best practices and potential issues for the benefit of future end-users.

    Polymorphism identification and improved genome annotation of Brassica rapa through Deep RNA sequencing.

    The mapping and functional analysis of quantitative traits in Brassica rapa can be greatly improved with the availability of physically positioned, gene-based genetic markers and accurate genome annotation. In this study, deep transcriptome RNA sequencing (RNA-Seq) of Brassica rapa was undertaken with two objectives: SNP detection and improved transcriptome annotation. We performed SNP detection on two varieties that are parents of a mapping population to aid in the development of a marker system for this population and the subsequent development of a high-resolution genetic map. An improved Brassica rapa transcriptome was constructed to detect novel transcripts and to improve the current genome annotation. This is useful for accurate estimation of mRNA abundance and detection of expression QTL (eQTLs) in mapping populations. Deep RNA-Seq of two Brassica rapa genotypes, R500 (var. trilocularis, Yellow Sarson) and IMB211 (a rapid cycling variety), using eight different tissues (root, internode, leaf, petiole, apical meristem, floral meristem, silique, and seedling) grown across three different environments (growth chamber, greenhouse, and field) and under two different treatments (simulated sun and simulated shade) generated 2.3 billion high-quality Illumina reads. A total of 330,995 SNPs were identified in transcribed regions between the two genotypes, with an average frequency of one SNP in every 200 bases. The transcriptome of Brassica rapa reassembled from the deep RNA-Seq data identified 44,239 protein-coding genes. Compared with current gene models of B. rapa, we detected 3,537 novel transcripts, 23,754 gene models had structural modifications, and 3,655 annotated proteins changed. Gaps in the current genome assembly of B. rapa are highlighted by our identification of 780 unmapped transcripts. All the SNPs, annotations, and predicted transcripts can be viewed at http://phytonetworks.ucdavis.edu/
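As a back-of-envelope reading of the SNP figures in this abstract, the sketch below multiplies the SNP count by the stated average spacing. The resulting span is an implied quantity computed here, not a value reported by the authors, and it assumes the quoted "one SNP in every 200 bases" is an average over the transcribed sequence that was compared.

```python
# Figures quoted in the abstract (variable names are illustrative).
snp_count = 330_995     # SNPs identified between R500 and IMB211
bases_per_snp = 200     # stated average: one SNP in every 200 bases

# Implied length of transcribed sequence over which SNPs were called.
implied_bases = snp_count * bases_per_snp

print(f"Implied transcribed span: {implied_bases:,} bases "
      f"(~{implied_bases / 1e6:.1f} Mb)")
```

Because the 1-in-200 figure is rounded, the implied span of about 66 Mb should be read as an order-of-magnitude estimate only.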

    MSIQ: Joint Modeling of Multiple RNA-seq Samples for Accurate Isoform Quantification

    Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging due to the information loss in sequencing experiments. A recent accumulation of multiple RNA-seq data sets from the same tissue or cell type provides new opportunities to improve the accuracy of isoform quantification. However, existing statistical or computational methods for multiple RNA-seq samples either pool the samples into one sample or assign equal weights to the samples when estimating isoform abundance. These methods ignore the possible heterogeneity in the quality of different samples and could result in biased and non-robust estimates. In this article, we develop a method, which we call "joint modeling of multiple RNA-seq samples for accurate isoform quantification" (MSIQ), for more accurate and robust isoform quantification by integrating multiple RNA-seq samples under a Bayesian framework. Our method aims to (1) identify a consistent group of samples with homogeneous quality and (2) improve isoform quantification accuracy by jointly modeling multiple RNA-seq samples while allowing for higher weights on the consistent group. We show that MSIQ provides a consistent estimator of isoform abundance, and we demonstrate the accuracy and effectiveness of MSIQ compared with alternative methods through simulation studies on D. melanogaster genes. We justify MSIQ's advantages over existing approaches via application studies on real RNA-seq data from human embryonic stem cells, brain tissues, and the HepG2 immortalized cell line.