Finite mixtures of matrix-variate Poisson-log normal distributions for three-way count data
Three-way data structures, characterized by three entities, the units, the
variables and the occasions, are frequent in biological studies. In RNA
sequencing, three-way data structures are obtained when high-throughput
transcriptome sequencing data are collected for n genes across p conditions at
r occasions. Matrix-variate distributions offer a natural way to model
three-way data and mixtures of matrix-variate distributions can be used to
cluster three-way data. Clustering of gene expression data is carried out as
a means of discovering gene co-expression networks. In this work, a mixture of
matrix-variate Poisson-log normal distributions is proposed for clustering read
counts from RNA sequencing. By considering the matrix-variate structure, full
information on the conditions and occasions of the RNA sequencing dataset is
simultaneously considered, and the number of covariance parameters to be
estimated is reduced. A Markov chain Monte Carlo expectation-maximization
algorithm is used for parameter estimation and information criteria are used
for model selection. The models are applied to both real and simulated data,
giving favourable clustering results.
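The claimed reduction in covariance parameters can be illustrated with a short count. The sketch below is not taken from the paper: it assumes each gene's r x p matrix of latent values has a Kronecker-structured covariance (one r x r and one p x p symmetric matrix), and compares that with an unstructured covariance over the vectorized rp values. The function names are hypothetical.

```python
def unstructured_cov_params(r: int, p: int) -> int:
    """Free parameters in a full symmetric (r*p) x (r*p) covariance matrix."""
    d = r * p
    return d * (d + 1) // 2


def matrix_variate_cov_params(r: int, p: int) -> int:
    """Parameters in one r x r plus one p x p symmetric covariance matrix.

    Assumption: a Kronecker-structured (matrix-variate) covariance. The two
    matrices are only identified up to a scalar factor, so one of these
    parameters is typically fixed by a constraint.
    """
    return r * (r + 1) // 2 + p * (p + 1) // 2


if __name__ == "__main__":
    r, p = 3, 4  # illustrative values: 3 occasions, 4 conditions
    print("unstructured:", unstructured_cov_params(r, p))
    print("matrix-variate:", matrix_variate_cov_params(r, p))
```

With these illustrative values the unstructured count grows with the square of rp, while the matrix-variate count grows only with the squares of r and p separately.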
Biological significance of RNA-seq and single-cell genomic research in woody plants
RNA-seq and single-cell genomic research have emerged as an important research area in recent years because of their ability to examine the genetic information of any number of single cells in all living organisms. The knowledge gained from RNA-seq and single-cell genomic research will have a great impact on many aspects of plant biology. In this review, we summarize and discuss the biological significance of RNA-seq and single-cell genomic research in plants, including single-cell DNA sequencing, RNA-seq and single-cell RNA sequencing in woody plants, methods of RNA-seq and single-cell RNA sequencing, single-cell RNA sequencing for studying plant development, and single-cell RNA sequencing for elucidating cell type composition. We focus on RNA-seq and single-cell RNA sequencing in woody plants, the understanding of plant development through single-cell RNA sequencing, and the elucidation of cell type composition via single-cell RNA sequencing. The information presented in this review will help to increase our understanding of plant genomic research through the power of plant single-cell RNA sequencing analysis.
Analyzing Gene Expression Profiles of a Virus and its Host During Infection
In recent years, RNA sequencing has become an important part of gene expression analysis. RNA sequencing applications are used to study many aspects of RNA structure, expression, and translation. As the technology develops, RNA sequencing is used to learn more about the biology of RNA and about what RNA does under different conditions, such as when a host is under attack from a virus. New RNA sequencing technologies allow researchers to learn more about what happens to the host when it is attacked by an outside source, such as a virus. RNA sequencing also shows the researcher how the attacker successfully attacks the host and how the host responds. In this study, RNA sequencing is used to understand how a bacterial virus attacks its bacterial host.
Simulating multiple faceted variability in single cell RNA sequencing.
The abundance of new computational methods for processing and interpreting transcriptomes at a single cell level raises the need for in silico platforms for evaluation and validation. Here, we present SymSim, a simulator that explicitly models the processes that give rise to data observed in single cell RNA-Seq experiments. The components of the SymSim pipeline pertain to the three primary sources of variation in single cell RNA-Seq data: noise intrinsic to the process of transcription, extrinsic variation indicative of different cell states (both discrete and continuous), and technical variation due to low sensitivity and measurement noise and bias. We demonstrate how SymSim can be used for benchmarking methods for clustering, differential expression and trajectory inference, and for examining the effects of various parameters on their performance. We also show how SymSim can be used to evaluate the number of cells required to detect a rare population under various scenarios
Quantifying alternative splicing from paired-end RNA-sequencing data
RNA-sequencing has revolutionized biomedical research and, in particular, our
ability to study gene alternative splicing. The problem has important
implications for human health, as alternative splicing may be involved in
malfunctions at the cellular level and multiple diseases. However, the
high-dimensional nature of the data and the existence of experimental biases
pose serious data analysis challenges. We find that the standard data summaries
used to study alternative splicing are severely limited, as they ignore a
substantial amount of valuable information. Current data analysis methods are
based on such summaries and are hence suboptimal. Further, they have limited
flexibility in accounting for technical biases. We propose novel data summaries
and a Bayesian modeling framework that overcome these limitations and determine
biases in a nonparametric, highly flexible manner. These summaries adapt
naturally to the rapid improvements in sequencing technology. We provide
efficient point estimates and uncertainty assessments. The approach allows the
study of alternative splicing patterns for individual samples and can also be the
basis for downstream analyses. We found a severalfold improvement in estimation
mean square error compared with popular approaches in simulations, and substantially
higher consistency between replicates in experimental data. Our findings
indicate the need for adjusting the routine summarization and analysis of
alternative splicing RNA-seq studies. We provide a software implementation in
the R package casper.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS687 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org). With correction
Sashimi plots: Quantitative visualization of RNA sequencing read alignments
We introduce Sashimi plots, a quantitative multi-sample visualization of mRNA
sequencing reads aligned to gene annotations. Sashimi plots are made using
alignments (stored in the SAM/BAM format) and gene model annotations (in GFF
format), which can be custom-made by the user or obtained from databases such
as Ensembl or UCSC. We describe two implementations of Sashimi plots: (1) a
stand-alone command line implementation aimed at making customizable
publication quality figures, and (2) an implementation built into the
Integrated Genome Viewer (IGV) browser, which enables rapid and dynamic
creation of Sashimi plots for any genomic region of interest, suitable for
exploratory analysis of alternatively spliced regions of the transcriptome.
Isoform expression estimates output by the MISO program can be optionally
plotted along with Sashimi plots. Sashimi plots can be used to quickly screen
differentially spliced exons along genomic regions of interest and can be used
in publication quality figures. The Sashimi plot software and documentation are
available from: http://genes.mit.edu/burgelab/miso/docs/sashimi.html
Comment: 2 figures
Application of whole genome and RNA sequencing to investigate the genomic landscape of common variable immunodeficiency disorders.
Common Variable Immunodeficiency Disorders (CVIDs) are the most prevalent cause of primary antibody failure. CVIDs are highly variable and a genetic cause has been identified in <5% of patients. Here, we performed whole genome sequencing (WGS) of 34 CVID patients (94% sporadic) and combined it with transcriptomic profiling (RNA-sequencing of B cells) from three patients and three healthy controls. We identified variants in CVID disease genes TNFRSF13B, TNFRSF13C, LRBA and NLRP12 and enrichment of variants in known and novel disease pathways. The pathways identified include B-cell receptor signalling, non-homologous end-joining, regulation of apoptosis, T cell regulation and ICOS signalling. Our data confirm the polygenic nature of CVID and suggest individual-specific aetiologies in many cases. Together our data show that WGS in combination with RNA-sequencing allows for a better understanding of CVIDs and the identification of novel disease-associated pathways.
Complete Genome Sequence of a Divergent Human Rhinovirus C Isolate from an Infant with Severe Community-Acquired Pneumonia in Colorado, USA.
Here, we report the genome sequence of a divergent human rhinovirus C isolate identified from an infant with a severe community-acquired respiratory infection. RNA sequencing performed on an Illumina platform identified reads aligning to human rhinovirus species, which were de novo assembled to produce a coding-complete genome sequence
Nanopore direct RNA sequencing maps the complexity of Arabidopsis mRNA processing and m6A modification
Understanding genome organization and gene regulation requires insight into RNA transcription, processing and modification. We adapted nanopore direct RNA sequencing to examine RNA from a wild-type accession of the model plant Arabidopsis thaliana and a mutant defective in mRNA methylation (m6A). Here we show that m6A can be mapped in full-length mRNAs transcriptome-wide and reveal the combinatorial diversity of cap-associated transcription start sites, splicing events, poly(A) site choice and poly(A) tail length. Loss of m6A from 3’ untranslated regions is associated with decreased relative transcript abundance and defective RNA 3’ end formation. A functional consequence of disrupted m6A is a lengthening of the circadian period. We conclude that nanopore direct RNA sequencing can reveal the complexity of mRNA processing and modification in full-length single molecule reads. These findings can refine Arabidopsis genome annotation. Further, applying this approach to less well-studied species could transform our understanding of what their genomes encode.