
6,184 research outputs found

GaGa: A parsimonious and flexible model for differential expression analysis

Author: Rossell David
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

Hierarchical models are a powerful tool for high-throughput data with a small to moderate number of replicates, as they allow sharing information across units of information, for example, genes. We propose two such models and show their increased sensitivity in microarray differential expression applications. We build on the gamma-gamma hierarchical model introduced by Kendziorski et al. [Statist. Med. 22 (2003) 3899–3914] and Newton et al. [Biostatistics 5 (2004) 155–176], by addressing important limitations that may have hampered its performance and its more widespread use. The models parsimoniously describe the expression of thousands of genes with a small number of hyper-parameters. This makes them easy to interpret and analytically tractable. The first model is a simple extension that improves the fit substantially with almost no increase in complexity. We propose a second extension that uses a mixture of gamma distributions to further improve the fit, at the expense of increased computational burden. We derive several approximations that significantly reduce the computational cost. We find that our models outperform the original formulation of the model, as well as some other popular methods for differential expression analysis. The improved performance is especially noticeable for the small sample sizes commonly encountered in high-throughput experiments. Our methods are implemented in the freely available Bioconductor gaga package. Comment: Published at http://dx.doi.org/10.1214/09-AOAS244 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
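The way a gamma-gamma hierarchy shares information across genes can be illustrated with a conjugate update. The sketch below is not the gaga package's implementation. It assumes a simplified setting in which each observation is Gamma(shape, rate = theta) with a known shape, and the gene-level rate theta has a Gamma(prior_shape, prior_rate) prior shared by all genes. The function names and the numbers in the usage lines are hypothetical.

```python
def posterior_rate_params(x, shape, prior_shape, prior_rate):
    """Conjugate update for theta, where x_j ~ Gamma(shape, rate=theta)
    and theta ~ Gamma(prior_shape, rate=prior_rate).

    Returns the (shape, rate) of the gamma posterior of theta.
    """
    post_shape = prior_shape + shape * len(x)
    post_rate = prior_rate + sum(x)
    return post_shape, post_rate


def shrunken_mean(x, shape, prior_shape, prior_rate):
    """Posterior mean of the gene-level expected expression, shape / theta."""
    post_shape, post_rate = posterior_rate_params(x, shape, prior_shape, prior_rate)
    # For theta ~ Gamma(A, rate=B), E[1 / theta] = B / (A - 1), valid for A > 1.
    return shape * post_rate / (post_shape - 1)


# Hypothetical gene with two replicates; the prior implies a mean expression of 8.
replicates = [10.0, 14.0]
estimate = shrunken_mean(replicates, shape=2, prior_shape=3, prior_rate=8.0)
sample_mean = sum(replicates) / len(replicates)
print(estimate, sample_mean)
```

With only two replicates, the estimate falls between the prior mean and the sample mean. This is the sense in which the shared hyper-parameters stabilize per-gene estimates when replicates are few.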

arXiv.org e-Print Archive

CiteSeerX

Crossref

Warwick Research Archives Portal Repository

Deep generative modeling for single-cell transcriptomics.

Author: Romain Lopez
Jeffrey Regier
Michael B. Cole
Michael I. Jordan
Nir Yosef
Publication venue: eScholarship, University of California
Publication date: 01/12/2018

Single-cell transcriptome measurements can reveal unexplored biological diversity, but they suffer from technical noise and bias that must be modeled to account for the resulting uncertainty in downstream analyses. Here we introduce single-cell variational inference (scVI), a ready-to-use scalable framework for the probabilistic representation and analysis of gene expression in single cells (https://github.com/YosefLab/scVI). scVI uses stochastic optimization and deep neural networks to aggregate information across similar cells and genes and to approximate the distributions that underlie observed expression values, while accounting for batch effects and limited sensitivity. We used scVI for a range of fundamental analysis tasks including batch correction, visualization, clustering, and differential expression, and achieved high accuracy for each task.

Crossref

eScholarship - University of California

Gamma-based clustering via ordered means with application to gene-expression analysis

Author: Chung Lisa M.
Newton Michael A.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 09/11/2012

Discrete mixture models provide a well-known basis for effective clustering algorithms, although technical challenges have limited their scope. In the context of gene-expression data analysis, a model is presented that mixes over a finite catalog of structures, each one representing equality and inequality constraints among latent expected values. Computations depend on the probability that independent gamma-distributed variables attain each of their possible orderings. Each ordering event is equivalent to an event in independent negative-binomial random variables, and this finding guides a dynamic-programming calculation. The structuring of mixture-model components according to constraints among latent means leads to strict concavity of the mixture log likelihood. In addition to its beneficial numerical properties, the clustering method shows promising results in an empirical study. Comment: Published at http://dx.doi.org/10.1214/10-AOS805 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
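The link between gamma orderings and negative-binomial events can be illustrated in the simplest case of two variables. The sketch below is not the authors' dynamic-programming algorithm. It assumes integer shape parameters and uses a standard Poisson-process argument: if X is the a-th arrival of one process and Y the b-th arrival of another, then X < Y exactly when fewer than b arrivals of the second process precede the a-th arrival of the first, and that count is negative binomial. The function names `prob_gamma_less` and `monte_carlo_estimate` are hypothetical.

```python
import math
import random


def prob_gamma_less(a, rate_x, b, rate_y):
    """P(X < Y) for independent X ~ Gamma(a, rate_x) and Y ~ Gamma(b, rate_y),
    with integer shapes a and b, written as a negative-binomial CDF."""
    p = rate_x / (rate_x + rate_y)  # chance that the next merged arrival is of type X
    # K = number of type-Y arrivals before the a-th type-X arrival ~ NegBin(a, p);
    # the ordering event {X < Y} equals {K <= b - 1}.
    return sum(
        math.comb(k + a - 1, k) * p**a * (1 - p) ** k
        for k in range(b)
    )


def monte_carlo_estimate(a, rate_x, b, rate_y, n=100_000, seed=0):
    """Simulation check of the same probability using gamma draws."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    hits = sum(
        rng.gammavariate(a, 1.0 / rate_x) < rng.gammavariate(b, 1.0 / rate_y)
        for _ in range(n)
    )
    return hits / n


exact = prob_gamma_less(3, 2.0, 2, 1.0)
approx = monte_carlo_estimate(3, 2.0, 2, 1.0)
print(exact, approx)
```

The closed-form sum and the simulated frequency should agree up to Monte Carlo error. Orderings of more than two variables require the more elaborate calculation described in the abstract.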

arXiv.org e-Print Archive

Crossref