Modeling Persistent Trends in Distributions

Author: Gifford David
Jaakkola Tommi
Mueller Jonas
Publication venue: 'Informa UK Limited'
Publication date: 24/05/2017

We present a nonparametric framework to model a short sequence of probability distributions that vary both due to underlying effects of sequential progression and confounding noise. To distinguish between these two types of variation and estimate the sequential-progression effects, our approach leverages an assumption that these effects follow a persistent trend. This work is motivated by the recent rise of single-cell RNA-sequencing experiments over a brief time course, which aim to identify genes relevant to the progression of a particular biological process across diverse cell populations. While classical statistical tools focus on scalar-response regression or order-agnostic differences between distributions, it is desirable in this setting to consider both the full distributions as well as the structure imposed by their ordering. We introduce a new regression model for ordinal covariates where responses are univariate distributions and the underlying relationship reflects consistent changes in the distributions over increasing levels of the covariate. This concept is formalized as a "trend" in distributions, which we define as an evolution that is linear under the Wasserstein metric. Implemented via a fast alternating projections algorithm, our method exhibits numerous strengths in simulations and analyses of single-cell gene expression data. Comment: To appear in: Journal of the American Statistical Association
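
As a rough illustration of what "linear under the Wasserstein metric" can mean for univariate distributions, the sketch below assumes the 2-Wasserstein distance and equal-size empirical samples, in which case the distance reduces to a comparison of sorted samples (empirical quantiles). The function names `wasserstein2` and `quantile_interpolate` are hypothetical, and this is not the paper's estimator or its alternating projections algorithm; it only shows that moving quantiles linearly between two distributions yields distances proportional to the step taken.

```python
import math

def wasserstein2(xs, ys):
    """2-Wasserstein distance between two equal-size 1-D empirical samples.

    Assumption of this sketch: both samples have the same length, so the
    optimal coupling simply pairs the sorted values.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(xs), sorted(ys)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))

def quantile_interpolate(xs, ys, t):
    """Move each empirical quantile a fraction t of the way from xs to ys."""
    a, b = sorted(xs), sorted(ys)
    return [(1 - t) * u + t * v for u, v in zip(a, b)]

# Hypothetical samples at the first and last level of an ordinal covariate.
start = [0.0, 1.0, 2.0, 3.0]
end = [2.0, 4.0, 6.0, 8.0]

# A distribution halfway along the quantile-linear path.
mid = quantile_interpolate(start, end, 0.5)

d_full = wasserstein2(start, end)
d_half = wasserstein2(start, mid)
print(d_half / d_full)  # ratio of distances along the path
```

Under these assumptions the halfway distribution sits at half the total distance from the start, which is the sense in which the path is "linear".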

arXiv.org e-Print Archive

DSpace@MIT

FigShare

Stochastic Variational Inference

Author: Chong Wang
David M. Blei
John Paisley
Matthew D. Hoffman
Tommi Jaakkola
Publication date: 01/01/2013

We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
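
The core update can be sketched on a toy problem. The example below assumes a Beta-Bernoulli model with no local latent variables, which is far simpler than the topic models in the abstract and is chosen only for brevity. At each step it samples one data point, forms the global parameter that would be optimal if that point were replicated across the whole data set, and blends it into the current estimate with a decaying step size. The function name `svi_beta_bernoulli` and the settings `tau`, `kappa`, `steps`, and `seed` are illustrative assumptions, not values taken from the paper.

```python
import random

def svi_beta_bernoulli(data, prior=(1.0, 1.0), steps=20000,
                       tau=1.0, kappa=0.9, seed=0):
    """Stochastic updates for the Beta posterior of a Bernoulli rate.

    Toy setting (an assumption of this sketch): the model has only a
    global parameter, so each noisy estimate is available in closed form.
    """
    rng = random.Random(seed)
    n = len(data)
    a, b = prior  # current variational Beta parameters
    for t in range(1, steps + 1):
        x = data[rng.randrange(n)]        # sample one observation
        a_hat = prior[0] + n * x          # estimate as if x were seen n times
        b_hat = prior[1] + n * (1 - x)
        rho = (t + tau) ** (-kappa)       # decaying step size in (0, 1)
        a = (1 - rho) * a + rho * a_hat   # weighted average of old and new
        b = (1 - rho) * b + rho * b_hat
    return a, b

# Hypothetical data set: 300 successes out of 1000 trials.
data = [1] * 300 + [0] * 700
a, b = svi_beta_bernoulli(data)

# Exact conjugate posterior for comparison.
exact_a = 1.0 + sum(data)
exact_b = 1.0 + len(data) - sum(data)

svi_mean = a / (a + b)
exact_mean = exact_a / (exact_a + exact_b)
print(svi_mean, exact_mean)
```

Each step touches a single observation, so the cost per update does not grow with the size of the data set; the estimate is noisy, and the decaying step size averages that noise out over time.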

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Tree block coordinate descent for MAP in graphical models

Author: Jaakkola Tommi S.
Sontag David Alexander
Publication venue: Journal of Machine Learning Research
Publication date: 01/01/2009

Abstract URL: http://jmlr.csail.mit.edu/proceedings/papers/v5/sontag09a.html

A number of linear programming relaxations have been proposed for finding most likely settings of the variables (MAP) in large probabilistic models. The relaxations are often succinctly expressed in the dual and reduce to different types of reparameterizations of the original model. The dual objectives are typically solved by performing local block coordinate descent steps. In this work, we show how to perform block coordinate descent on spanning trees of the graphical model. We also show how all of the earlier dual algorithms are related to each other, giving transformations from one type of reparameterization to another while maintaining monotonicity relative to a common objective function. Finally, we quantify when the MAP solution can and cannot be decoded directly from the dual LP relaxation.
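
For background on why tree-structured subproblems are attractive: when a model is itself a tree, the MAP assignment can be found exactly by max-product dynamic programming. The sketch below does this for a chain, the simplest tree, and checks the result against brute-force enumeration. It is not the paper's dual block coordinate descent algorithm; the function names `chain_map` and `brute_force_map` and the integer scores are hypothetical and chosen for illustration.

```python
from itertools import product

def chain_map(unary, pairwise):
    """Exact MAP on a chain by max-product dynamic programming.

    unary[i][s] is the score of node i in state s; pairwise[i][s][t] is the
    score of node i in state s together with node i + 1 in state t.
    The assignment maximizing the total score is returned with its score.
    """
    best = list(unary[0])  # best[s]: best score of a prefix ending in state s
    back = []              # back-pointers for recovering the assignment
    for i in range(1, len(unary)):
        new_best, ptr = [], []
        for t in range(len(unary[i])):
            cands = [best[s] + pairwise[i - 1][s][t] for s in range(len(best))]
            s_star = max(range(len(cands)), key=cands.__getitem__)
            new_best.append(cands[s_star] + unary[i][t])
            ptr.append(s_star)
        best = new_best
        back.append(ptr)
    state = max(range(len(best)), key=best.__getitem__)
    score = best[state]
    assignment = [state]
    for ptr in reversed(back):  # walk the back-pointers to the first node
        state = ptr[state]
        assignment.append(state)
    assignment.reverse()
    return assignment, score

def brute_force_map(unary, pairwise):
    """Enumerate every assignment; only feasible for tiny models."""
    def total(x):
        return (sum(unary[i][x[i]] for i in range(len(x)))
                + sum(pairwise[i][x[i]][x[i + 1]] for i in range(len(x) - 1)))
    best_x = max(product(*(range(len(u)) for u in unary)), key=total)
    return list(best_x), total(best_x)

# Hypothetical three-node binary chain with attractive pairwise scores.
unary = [[0, 2], [1, 0], [0, 1]]
pairwise = [[[2, 0], [0, 2]], [[2, 0], [0, 2]]]

dp_assignment, dp_score = chain_map(unary, pairwise)
bf_assignment, bf_score = brute_force_map(unary, pairwise)
print(dp_assignment, dp_score)
```

The dynamic program visits each edge once, whereas enumeration grows exponentially with the number of nodes.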

CiteSeerX

DSpace@MIT