
    MSIQ: Joint Modeling of Multiple RNA-seq Samples for Accurate Isoform Quantification

    Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging due to the information loss in sequencing experiments. A recent accumulation of multiple RNA-seq data sets from the same tissue or cell type provides new opportunities to improve the accuracy of isoform quantification. However, existing statistical or computational methods for multiple RNA-seq samples either pool the samples into one sample or assign equal weights to the samples when estimating isoform abundance. These methods ignore the possible heterogeneity in the quality of different samples and could result in biased and non-robust estimates. In this article, we develop a method, which we call "joint modeling of multiple RNA-seq samples for accurate isoform quantification" (MSIQ), for more accurate and robust isoform quantification by integrating multiple RNA-seq samples under a Bayesian framework. Our method aims to (1) identify a consistent group of samples with homogeneous quality and (2) improve isoform quantification accuracy by jointly modeling multiple RNA-seq samples while allowing for higher weights on the consistent group. We show that MSIQ provides a consistent estimator of isoform abundance, and we demonstrate the accuracy and effectiveness of MSIQ compared with alternative methods through simulation studies on D. melanogaster genes. We justify MSIQ's advantages over existing approaches via application studies on real RNA-seq data from human embryonic stem cells, brain tissues, and the HepG2 immortalized cell line.
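
The contrast the abstract draws between equal weighting and up-weighting a consistent group of samples can be illustrated with a plain weighted average. This is only a minimal sketch and not MSIQ's Bayesian model: the function name, the example proportions, and the weights are all hypothetical, and the weights are supplied by hand here rather than inferred from the data.

```python
def weighted_isoform_proportions(sample_props, weights):
    """Weighted average of per-sample isoform proportions.

    sample_props: one list of isoform proportions per sample (each sums to 1).
    weights: one non-negative weight per sample (hypothetical quality scores,
             chosen by hand; MSIQ itself infers sample consistency).
    """
    if len(sample_props) != len(weights):
        raise ValueError("need exactly one weight per sample")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_isoforms = len(sample_props[0])
    return [
        sum(w * props[j] for props, w in zip(sample_props, weights)) / total
        for j in range(n_isoforms)
    ]


# Two samples that agree with each other and one that does not (made-up values).
samples = [[0.70, 0.30], [0.68, 0.32], [0.30, 0.70]]
equal = weighted_isoform_proportions(samples, [1.0, 1.0, 1.0])
down_weighted = weighted_isoform_proportions(samples, [1.0, 1.0, 0.2])
print(equal, down_weighted)
```

With equal weights the discordant third sample pulls the first isoform's estimate down to about 0.56; giving it a smaller weight keeps the estimate closer to the two agreeing samples.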

    Differential expression analysis for sequence count data

    *Motivation:* High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

*Results:* We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. 

*Availability:* A free open-source R software package, _DESeq_, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
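
To make the mean-variance link concrete, here is a minimal Python sketch of one common negative binomial parameterization, in which the variance is the mean plus a dispersion term, mu + alpha * mu^2. This is an assumption for illustration: the abstract states that variance and mean are linked by local regression, which this sketch does not implement, and the function names and replicate counts are hypothetical.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance of a negative binomial with mean mu and dispersion alpha,
    under the (assumed) quadratic form mu + alpha * mu**2."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate (s^2 - m) / m^2, floored at 0.

    A floor of 0 corresponds to the Poisson case, where variance equals mean.
    """
    m = mean(counts)
    s2 = variance(counts)  # sample variance (n - 1 denominator)
    if m == 0:
        return 0.0
    return max((s2 - m) / m ** 2, 0.0)


# Made-up read counts for one gene across four replicates.
replicates = [12, 18, 9, 21]
alpha = moment_dispersion(replicates)
print(alpha, nb_variance(mean(replicates), alpha))
```

With only a handful of replicates, a per-gene estimate like this is very noisy, which is the situation the abstract describes as requiring an error model that shares information across the dynamic range.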

    Transcriptome analysis of cortical tissue reveals shared sets of downregulated genes in autism and schizophrenia.

    Autism (AUT), schizophrenia (SCZ) and bipolar disorder (BPD) are three highly heritable neuropsychiatric conditions. Clinical similarities and genetic overlap between the three disorders have been reported; however, the causes and the downstream effects of this overlap remain elusive. By analyzing transcriptomic RNA-sequencing data generated from post-mortem cortical brain tissues from AUT, SCZ, BPD and control subjects, we have begun to characterize the extent of gene expression overlap between these disorders. We report that the AUT and SCZ transcriptomes are significantly correlated (P<0.001), whereas the other two cross-disorder comparisons (AUT-BPD and SCZ-BPD) are not. Between AUT and SCZ, we find that the genes differentially expressed across disorders are involved in neurotransmission and synapse regulation. Despite the lack of global transcriptomic overlap across all three disorders, we highlight two genes, IQSEC3 and COPS7A, which are significantly downregulated compared with controls across all three disorders, suggesting either shared etiology or compensatory changes across these neuropsychiatric conditions. Finally, we tested for enrichment of genes differentially expressed across disorders in genetic association signals in AUT, SCZ or BPD, reporting lack of signal in any of the previously published genome-wide association studies (GWAS). Together, these studies highlight the importance of examining gene expression from the primary tissue involved in neuropsychiatric conditions: the cortical brain. We identify a shared role for altered neurotransmission and synapse regulation in AUT and SCZ, in addition to two genes that may more generally contribute to neurodevelopmental and neuropsychiatric conditions.

    Detecting differential usage of exons from RNA-Seq data

    RNA-Seq is a powerful tool for the study of alternative splicing and other forms of alternative isoform expression. Understanding the regulation of these processes requires comparisons between treatments, tissues or conditions. For the analysis of such experiments, we present _DEXSeq_, a statistical method to test for differential exon usage in RNA-Seq data. _DEXSeq_ employs generalized linear models and offers good detection power and reliable control of false discoveries by taking biological variation into account. An implementation is available as an R/Bioconductor package.

    Exaggerated CpH methylation in the autism-affected brain.

    Background: The etiology of autism, a complex, heritable, neurodevelopmental disorder, remains largely unexplained. Given the unexplained risk and recent evidence supporting a role for epigenetic mechanisms in the development of autism, we explored the role of CpG and CpH (H = A, C, or T) methylation within the autism-affected cortical brain tissue. Methods: Reduced representation bisulfite sequencing (RRBS) was completed, and analysis was carried out in 63 post-mortem cortical brain samples (Brodmann area 19) from 29 autism-affected and 34 control individuals. Analyses to identify single sites that were differentially methylated and to identify any global methylation alterations at either CpG or CpH sites throughout the genome were carried out. Results: We report that while no individual site or region of methylation was significantly associated with autism after multi-test correction, methylated CpH dinucleotides were markedly enriched in autism-affected brains (~2-fold enrichment at p < 0.05 cutoff, p = 0.002). Conclusions: These results further implicate epigenetic alterations in pathobiological mechanisms that underlie autism.

    Statistical modeling of RNA structure profiling experiments enables parsimonious reconstruction of structure landscapes.

    RNA plays key regulatory roles in diverse cellular processes, where its functionality often derives from folding into and converting between structures. Many RNAs further rely on co-existence of alternative structures, which govern their response to cellular signals. However, characterizing heterogeneous landscapes is difficult, both experimentally and computationally. Recently, structure profiling experiments have emerged as powerful and affordable structure characterization methods, which improve computational structure prediction. To date, efforts have centered on predicting one optimal structure, with much less progress made on multiple-structure prediction. Here, we report a probabilistic modeling approach that predicts a parsimonious set of co-existing structures and estimates their abundances from structure profiling data. We demonstrate robust landscape reconstruction and quantitative insights into structural dynamics by analyzing numerous data sets. This work establishes a framework for data-directed characterization of structure landscapes to aid experimentalists in performing structure-function studies.