Genomics analysis on the responses of E. coli cells to varying environmental conditions
The natural living environments of E. coli cells are diverse, ranging from
mammalian gastrointestinal tracts to soil. Each environment might require
distinct metabolic pathways and transporter systems, and long-term evolution
has established elaborate regulatory systems that let E. coli cells adapt quickly to
changing conditions. Sensing outside stresses and then adopting a different
phenotype enables them to take advantage of any available nutrients and defend
against hostile environments. Many regulatory mechanisms have been identified
by genetic, biochemical and molecular biology methods, and our study aims to
build a systematic view of the response of the whole genome to four different
environmental conditions. We used correlation tests, including Pearson’s and
Spearman’s tests, with multiple testing adjustments to identify feature genes that
are significantly induced or repressed across treatment levels. The feature genes
identified were partially supported by previous literature, and some of the novel
genes not reported in any previous study may point to a research blind spot.
Additionally, we compared the correlation tests with machine
learning algorithms and discussed the advantages and drawbacks of each
method.
Statistic
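The screening approach described in this abstract (correlation tests across treatment levels, followed by a multiple testing adjustment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the gene names and expression values are hypothetical toy data, p-values come from a permutation test rather than whatever analytic tests the study used, and Benjamini-Hochberg is assumed as the adjustment because the abstract does not name one.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient (inputs must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_pvalue(x, y, stat, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a correlation statistic."""
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(stat(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical toy data: one expression value per treatment level.
levels = [0, 1, 2, 3, 4, 5, 6, 7]
expression = {
    "gene_up": [1.0, 1.4, 2.1, 2.2, 3.0, 3.9, 4.1, 5.2],
    "gene_down": [5.0, 4.6, 4.1, 3.0, 2.8, 2.1, 1.2, 0.9],
    "gene_flat": [2.0, 2.3, 1.9, 2.2, 2.0, 2.4, 1.8, 2.1],
}

genes = list(expression)
raw = [permutation_pvalue(levels, expression[g], spearman) for g in genes]
for gene, p, q in zip(genes, raw, benjamini_hochberg(raw)):
    rho = spearman(levels, expression[gene])
    print(f"{gene}: rho={rho:+.2f} p={p:.4f} adjusted={q:.4f}")
```

Swapping `spearman` for `pearson` in the call to `permutation_pvalue` gives the linear rather than the rank-based version of the same screen.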
Deep generative modeling for single-cell transcriptomics.
Single-cell transcriptome measurements can reveal unexplored biological diversity, but they suffer from technical noise and bias that must be modeled to account for the resulting uncertainty in downstream analyses. Here we introduce single-cell variational inference (scVI), a ready-to-use scalable framework for the probabilistic representation and analysis of gene expression in single cells (https://github.com/YosefLab/scVI). scVI uses stochastic optimization and deep neural networks to aggregate information across similar cells and genes and to approximate the distributions that underlie observed expression values, while accounting for batch effects and limited sensitivity. We used scVI for a range of fundamental analysis tasks including batch correction, visualization, clustering, and differential expression, and achieved high accuracy for each task.
Statistical Methods For Whole Transcriptome Sequencing: From Bulk Tissue To Single Cells
RNA-Sequencing (RNA-Seq) has enabled detailed unbiased profiling of whole transcriptomes with incredible throughput. Recent technological breakthroughs have pushed back the frontiers of RNA expression measurement to the single-cell level (scRNA-Seq). With both bulk and single-cell RNA-Seq analyses, modeling the noise structure embedded in the data is crucial for drawing correct inferences. In this dissertation, I developed a series of statistical methods to account for the technical variations specific to RNA-Seq experiments in the context of isoform- or gene-level differential expression analyses. In the first part of my dissertation, I developed MetaDiff (https://github.com/jiach/MetaDiff), a random-effects meta-regression model that allows the incorporation of uncertainty in isoform expression estimation in isoform differential expression analysis. This framework was further extended to detect splicing quantitative trait loci with RNA-Seq data. In the second part of my dissertation, I developed TASC (Toolkit for Analysis of Single-Cell data; https://github.com/scrna-seq/TASC), a hierarchical mixture model, to explicitly adjust for cell-to-cell technical differences in scRNA-Seq analysis using an empirical Bayes approach. This framework can be adapted to perform differential gene expression analysis. In the third part of my dissertation, I developed TASC-B, a method extended from TASC to model transcriptional bursting-induced zero-inflation. This model can identify and test for differences in the level of transcriptional bursting. Compared to existing methods, the new tools that I developed have been shown to better control the false discovery rate in situations where technical noise cannot be ignored. They also display superior power in both our simulation studies and real-world applications.
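To make the term zero-inflation concrete, the sketch below evaluates the log-likelihood of a generic zero-inflated Poisson model, in which a count is a structural zero with probability `pi` and otherwise follows a Poisson distribution with mean `lam`. This is only an illustration of the general idea: the abstract does not specify the TASC-B likelihood, and the function names, parameter names, and toy counts here are assumptions made for the example.

```python
import math

def zip_log_pmf(k, lam, pi):
    """Log-probability of count k under a zero-inflated Poisson.

    Requires lam > 0 and 0 <= pi < 1. With probability pi the count is a
    structural zero; otherwise it is drawn from Poisson(lam).
    """
    if k == 0:
        # A zero can arise from either mixture component.
        return math.log(pi + (1 - pi) * math.exp(-lam))
    log_poisson = -lam + k * math.log(lam) - math.lgamma(k + 1)
    return math.log(1 - pi) + log_poisson

def zip_log_likelihood(counts, lam, pi):
    """Sum of per-observation log-probabilities for independent counts."""
    return sum(zip_log_pmf(k, lam, pi) for k in counts)

# Hypothetical toy counts for one gene across cells, with many zeros.
counts = [0, 0, 0, 5, 0, 7, 4, 0, 6, 0]

# Compare a plain Poisson fit (pi = 0) with a zero-inflated alternative.
plain = zip_log_likelihood(counts, lam=2.2, pi=0.0)
inflated = zip_log_likelihood(counts, lam=5.5, pi=0.6)
print(f"Poisson log-likelihood: {plain:.2f}")
print(f"Zero-inflated log-likelihood: {inflated:.2f}")
```

Setting `pi` to 0 recovers the ordinary Poisson model, which makes it straightforward to compare fits with and without the extra zero component.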