Statistical modeling of isoform splicing dynamics from RNA-seq time series data
Isoform quantification is an important goal of RNA-seq experiments, yet it
remains problematic for genes with low expression or several isoforms. These
difficulties may in principle be ameliorated by exploiting correlated
experimental designs, such as time series or dosage response experiments. Time
series RNA-seq experiments, in particular, are becoming increasingly popular,
yet there are no methods that explicitly leverage the experimental design to
improve isoform quantification. Here we present DICEseq, the first isoform
quantification method tailored to correlated RNA-seq experiments. DICEseq
explicitly models the correlations between different RNA-seq experiments to
aid the quantification of isoforms across experiments. Numerical experiments on
simulated data sets show that DICEseq yields more accurate results than
state-of-the-art methods, an advantage that can become considerable at low
coverage levels. On real data sets, our results show that DICEseq provides
substantially more reproducible and robust quantifications, increasing the
correlation of estimates from replicate data sets by up to 10% on genes with
low or moderate expression levels (bottom third of all genes). Furthermore,
DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and
depth of sequencing, frequently an important choice when planning experiments.
Our results have strong implications for the design of RNA-seq experiments,
and offer a novel tool for improved analysis of such data sets. Python code is
freely available at http://diceseq.sf.net
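The abstract above does not state how DICEseq models correlations between experiments, so the following is only a minimal sketch of the general idea of sharing information across time points. It assumes a zero-mean Gaussian process with a squared-exponential kernel placed over logit-transformed isoform ratios. The function names, the kernel choice, and all parameter values are hypothetical and are not taken from the DICEseq code.

```python
import numpy as np

def rbf_kernel(t1, t2, length_scale=2.0, variance=1.0):
    """Squared-exponential covariance between two sets of time points."""
    d = np.subtract.outer(np.asarray(t1, float), np.asarray(t2, float))
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_smooth(times, y, noise_var, length_scale=2.0, variance=1.0):
    """Posterior mean of a zero-mean GP at the observed time points.

    times     : time points of the experiments
    y         : noisy per-time-point estimates (e.g. logit isoform ratios)
    noise_var : observation variance, scalar or one value per time point
                (assumed larger where sequencing coverage is lower)
    """
    times = np.asarray(times, float)
    y = np.asarray(y, float)
    K = rbf_kernel(times, times, length_scale, variance)
    noise = np.diag(np.broadcast_to(np.asarray(noise_var, float), y.shape))
    # Solve (K + noise) alpha = y rather than forming an explicit inverse.
    alpha = np.linalg.solve(K + noise, y)
    return K @ alpha

if __name__ == "__main__":
    # Hypothetical isoform ratios estimated independently at six time points.
    ratios = np.array([0.55, 0.70, 0.60, 0.75, 0.68, 0.72])
    logit = np.log(ratios / (1.0 - ratios))
    smoothed_logit = gp_smooth([0, 1, 2, 3, 4, 5], logit, noise_var=0.5)
    smoothed = 1.0 / (1.0 + np.exp(-smoothed_logit))  # back to the ratio scale
    print(smoothed)
```

Each smoothed estimate borrows strength from neighbouring time points, and the amount of borrowing grows with the assumed observation noise. This is one simple way in which a correlated design could stabilise estimates at low coverage.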
Bayesian Modeling Approaches for Temporal Dynamics in RNA-seq Data
Differential expression analysis has played a central role over the last decades in addressing a variety of biological questions, in particular in characterizing abnormal patterns of cellular and molecular function. More recently, the identification of differentially expressed genes and isoforms has focused increasingly on temporal dynamics across a series of time points. Bayesian strategies have been applied successfully, from both methodological and analytical perspectives, to various high-throughput data platforms: for instance, differential expression analysis and network modules in transcriptome data, peak callers for ChIP-seq data, target prediction in microRNA data, and meta-methods that span platforms. In this chapter, we discuss how our methodological work based on Bayesian models addresses important questions that arise in the architecture of temporal dynamics in RNA-seq data.
Modeling and analysis of RNA-seq data: a review from a statistical perspective
Background: Since the invention of next-generation RNA sequencing (RNA-seq)
technologies, they have become a powerful tool to study the presence and
quantity of RNA molecules in biological samples and have revolutionized
transcriptomic studies. The analysis of RNA-seq data at four different levels
(samples, genes, transcripts, and exons) involves multiple statistical and
computational questions, some of which remain challenging to date.
Results: We review RNA-seq analysis tools at the sample, gene, transcript,
and exon levels from a statistical perspective. We also highlight the
biological and statistical questions of greatest practical relevance.
Conclusion: The development of statistical and computational methods for
analyzing RNA-seq data has made significant advances in the past decade.
However, methods developed to answer the same biological question often rely on
diverse statistical models and exhibit different performance under different
scenarios. This review discusses and compares multiple commonly used
statistical models regarding their assumptions, in the hope of helping users
select appropriate methods as needed, as well as assisting developers in
future method development.
Statistical Methods For Whole Transcriptome Sequencing: From Bulk Tissue To Single Cells
RNA-Sequencing (RNA-Seq) has enabled detailed unbiased profiling of whole transcriptomes with incredible throughput. Recent technological breakthroughs have pushed back the frontiers of RNA expression measurement to single-cell level (scRNA-Seq). With both bulk and single-cell RNA-Seq analyses, modeling of the noise structure embedded in the data is crucial for drawing correct inference. In this dissertation, I developed a series of statistical methods to account for the technical variations specific to RNA-Seq experiments in the context of isoform- or gene-level differential expression analyses. In the first part of my dissertation, I developed MetaDiff (https://github.com/jiach/MetaDiff), a random-effects meta-regression model that allows the incorporation of uncertainty in isoform expression estimation in isoform differential expression analysis. This framework was further extended to detect splicing quantitative trait loci with RNA-Seq data. In the second part of my dissertation, I developed TASC (Toolkit for Analysis of Single-Cell data; https://github.com/scrna-seq/TASC), a hierarchical mixture model, to explicitly adjust for cell-to-cell technical differences in scRNA-Seq analysis using an empirical Bayes approach. This framework can be adapted to perform differential gene expression analysis. In the third part of my dissertation, I developed TASC-B, a method extended from TASC to model transcriptional bursting-induced zero-inflation. This model can identify and test for the difference in the level of transcriptional bursting. Compared to existing methods, these new tools that I developed have been shown to better control the false discovery rate in situations where technical noise cannot be ignored. They also display superior power in both our simulation studies and real world applications.
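To illustrate what incorporating the uncertainty of expression estimates can mean in a random-effects setting, here is a minimal sketch of random-effects pooling with the DerSimonian-Laird estimator. This is a simpler relative of meta-regression (it has no covariates) and is not MetaDiff's model. The function name and the example numbers are hypothetical.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Pool per-sample estimates that each carry their own variance.

    effects   : per-sample point estimates (e.g. log isoform expression)
    variances : estimation variance of each point estimate
    Returns (pooled estimate, its standard error, between-sample variance).
    """
    effects = np.asarray(effects, float)
    variances = np.asarray(variances, float)
    w = 1.0 / variances                       # inverse-variance weights
    fixed_mean = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed_mean) ** 2)   # Cochran's Q statistic
    k = effects.size
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)        # between-sample variance, truncated at 0
    w_star = 1.0 / (variances + tau2)         # weights including tau2
    pooled = np.sum(w_star * effects) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, se, tau2

if __name__ == "__main__":
    # Hypothetical estimates: the third sample is very uncertain,
    # so it receives little weight in the pooled value.
    print(dersimonian_laird([2.1, 1.8, 4.0], [0.05, 0.08, 2.0]))
```

Estimates with large variance are down-weighted instead of being treated as exact, which is the basic mechanism by which uncertainty in isoform quantification can be carried into a downstream test.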
Poly(A) Tail Regulation in the Nucleus
RNA metabolism comprises several steps, from transcription of the RNA through translation to RNA decay. Poly(A) tails are found at the end of most messenger RNAs, protect the RNA from degradation, and stimulate translation. Deadenylation of poly(A) tails is the rate-limiting step of RNA decay. So far, RNA decay has mostly been studied in the context of cytoplasmic processes; whether and how RNA deadenylation and decay occur in the nucleus has remained unclear.
A new method for genome-wide determination of poly(A) tail length, named FLAM-Seq, was therefore developed. FLAM-Seq was used to analyze RNA from cell lines, organoids, and C. elegans. A significant correlation between 3’-UTR length and poly(A) length was found, as well as, for many genes, an association between alternative 3’-UTR isoforms and poly(A) length.
Analysis of the poly(A) tails of unspliced RNA molecules showed that these tails were more than 200 nt long. The analysis was validated by inhibiting splicing. RNA labeling methods, which allow RNA processing to be resolved over time, pointed to deadenylation of poly(A) tails within a few minutes of their synthesis. Analysis of subcellular fractions showed that this initial deadenylation is a nuclear process. The process is gene-specific, and the poly(A) tails of certain transcript types, such as nuclear long non-coding RNAs, were not deadenylated.
To identify enzymes that catalyze deadenylation in the nucleus, several methods were used, including RNA-degrading Cas systems, siRNAs, and shRNA cell lines. Despite efficient reduction of the RNA expression of the corresponding enzyme complexes, no molecular phenotypes affecting poly(A) length in the nucleus could be identified.
RNA metabolism involves several steps, from transcription to translation and decay of messenger RNAs (mRNAs). Most mRNAs have a poly(A) tail attached to their 3’-end, which protects them from degradation and stimulates translation. Removal of the poly(A) tail is the rate-limiting step in RNA decay, controlling stability and translation. It is not yet clear whether and to what extent RNA deadenylation occurs in the mammalian nucleus.
A novel method for genome-wide determination of poly(A) tail length, termed FLAM-Seq, was developed to overcome current challenges in sequencing mRNAs, enabling genome-wide analysis of complete RNAs, including their poly(A) tail sequence. FLAM-Seq analysis of different model systems uncovered a strong correlation between poly(A) tail and 3’-UTR length or alternative polyadenylation. In addition, cytosine nucleotides were significantly enriched in poly(A) tails. Analyzing poly(A) tails of unspliced RNAs from FLAM-Seq data revealed the genome-wide synthesis of poly(A) tails with a length of more than 200 nt. This could be validated by splicing inhibition experiments, which uncovered potential links between the completion of splicing and poly(A) tail shortening. Measuring RNA deadenylation kinetics using metabolic labeling experiments hinted at a rapid shortening of tails within minutes. The analysis of subcellular fractions obtained from HeLa cells and a mouse brain showed that initial deadenylation is a nuclear process. Nuclear deadenylation is gene-specific, and poly(A) tails of lncRNAs retained in the nucleus were not shortened. To identify enzymes responsible for nuclear deadenylation, RNA-targeting Cas systems, siRNAs and shRNA cell lines were used to target different deadenylase complexes. Despite efficient mRNA knockdown, subcellular analysis of poly(A) tail length did not yield molecular phenotypes of altered nuclear poly(A) tail length.
Regulation of splicing networks in neurodevelopment
Alternative splicing of pre-mRNA is a critical mechanism for enabling genetic diversity, and is a carefully regulated process in neuronal differentiation. RNA binding proteins (RBPs) are developmentally expressed and physically interact with RNA to drive specific splicing changes. This work tests the hypothesis that RBP-RNA interactions are critical for regulating timed and coordinated alternative splicing changes during neurodevelopment and that these splicing changes are in turn part of major regulatory mechanisms that underlie morphological and functional maturation of neurons. I describe our efforts to identify functional RBP-RNA interactions, including the identification of previously unobserved splicing events, and explore the combinatorial roles of multiple brain-specific RBPs during development. Using integrative modeling that combines multiple sources of data, we find hundreds of regulated splicing events for each of RBFOX, NOVA, PTBP, and MBNL. In the neurodevelopmental context, we find that the proteins control different sets of exons, with RBFOX, NOVA, and PTBP regulating early splicing changes and MBNL largely regulating later splicing changes. These findings additionally led to the observation that CNS and sensory neurons express a variety of different RBP programs, with many sensory neurons expressing a less mature splicing pattern than CNS neurons. We also establish a foundation for further exploration of neurodevelopmental splicing, by investigating the regulation of previously unobserved splicing events
A survey of best practices for RNA-seq data analysis.
RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics.
This is the final published version. It first appeared at http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0881-8