Fast MCMC sampling for Markov jump processes and extensions
Markov jump processes (MJPs), also known as continuous-time Markov chains, are
a simple and important class of continuous-time dynamical systems. In this
paper, we tackle
the problem of simulating from the posterior distribution over paths in these
models, given partial and noisy observations. Our approach is an auxiliary
variable Gibbs sampler, and is based on the idea of uniformization. This sets
up a Markov chain over paths by alternately sampling a finite set of virtual
jump times given the current path and then sampling a new path given the set of
extant and virtual jump times using a standard hidden Markov model forward
filtering-backward sampling algorithm. Our method is exact and does not involve
approximations like time-discretization. We demonstrate how our sampler extends
naturally to MJP-based models like Markov-modulated Poisson processes and
continuous-time Bayesian networks and show significant computational benefits
over state-of-the-art MCMC samplers for these models.
Comment: Accepted at the Journal of Machine Learning Research (JMLR)
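The alternating scheme described above can be illustrated in a few lines. Below is a minimal sketch of the second step, the hidden Markov model forward filtering-backward sampling pass over a fixed grid of candidate jump times; the two-state rate matrix Q, the uniformization rate Omega, and the assumption that observation log-likelihoods have already been binned onto the grid are all illustrative choices for this sketch, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-state chain: rate matrix Q (illustrative) and a uniformization
# rate Omega >= max_i |Q_ii|.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
Omega = 3.0
B = np.eye(2) + Q / Omega  # transition matrix of the uniformized chain

def ffbs(grid, log_obs_lik, pi0, B):
    """Forward filtering-backward sampling over a fixed grid of candidate
    jump times. log_obs_lik[k, s] is the log-likelihood, under state s, of
    the observations binned between grid points k and k+1 (an assumption
    made for this sketch)."""
    K, S = len(grid), B.shape[0]
    alpha = np.zeros((K, S))
    alpha[0] = np.log(pi0) + log_obs_lik[0]
    for k in range(1, K):
        # log-sum-exp forward recursion
        m = alpha[k - 1].max()
        alpha[k] = np.log(np.exp(alpha[k - 1] - m) @ B) + m + log_obs_lik[k]
    # backward sampling: P(s_k | s_{k+1}) proportional to alpha_k(s) * B[s, s_{k+1}]
    states = np.zeros(K, dtype=int)
    w = np.exp(alpha[-1] - alpha[-1].max())
    states[-1] = rng.choice(S, p=w / w.sum())
    for k in range(K - 2, -1, -1):
        w = np.exp(alpha[k] - alpha[k].max()) * B[:, states[k + 1]]
        states[k] = rng.choice(S, p=w / w.sum())
    return states
```

In the full sampler this pass alternates with resampling the virtual jump times given the current path, which is what makes the overall Markov chain target the exact posterior over paths.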
Bayesian nonparametric models for ranked data
We develop a Bayesian nonparametric extension of the popular Plackett-Luce
choice model that can handle an infinite number of choice items. Our framework
is based on the theory of random atomic measures, with the prior specified by a
gamma process. We derive a posterior characterization and a simple and
effective Gibbs sampler for posterior simulation. We develop a time-varying
extension of our model, and apply it to the New York Times lists of weekly
bestselling books.
Comment: NIPS - Neural Information Processing Systems (2012)
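As background for the choice model being extended, the finite-dimensional Plackett-Luce model can be sketched as follows; the "exponential race" view used here is closely related to the gamma-process formulation, but the weights and item counts are illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_plackett_luce(weights, rng):
    """Draw a full ranking: each item i gets an Exp(weights[i]) arrival
    time and items are ranked by arrival order, which is equivalent to
    repeatedly picking among the remaining items with probability
    proportional to weight."""
    arrivals = rng.exponential(1.0 / np.asarray(weights, dtype=float))
    return np.argsort(arrivals)

def plackett_luce_loglik(ranking, weights):
    """Log-probability of an observed ranking under the finite
    Plackett-Luce model."""
    w = np.asarray(weights, dtype=float)[list(ranking)]
    return sum(np.log(w[k]) - np.log(w[k:].sum()) for k in range(len(w)))
```

The nonparametric extension replaces the finite weight vector with the atoms of a gamma process, so the number of choice items need not be fixed in advance.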
A nonparametric HMM for genetic imputation and coalescent inference
Genetic sequence data are well described by hidden Markov models (HMMs) in
which latent states correspond to clusters of similar mutation patterns. Theory
from statistical genetics suggests that these HMMs are nonhomogeneous (their
transition probabilities vary along the chromosome) and have large support for
self transitions. We develop a new nonparametric model of genetic sequence
data, based on the hierarchical Dirichlet process, which supports these self
transitions and nonhomogeneity. Our model provides a parameterization of the
genetic process that is more parsimonious than other more general nonparametric
models which have previously been applied to population genetics. We provide
truncation-free MCMC inference for our model using a new auxiliary sampling
scheme for Bayesian nonparametric HMMs. In a series of experiments on male X
chromosome data from the Thousand Genomes Project and also on data simulated
from a population bottleneck we show the benefits of our model over the popular
finite model fastPHASE, which can itself be seen as a parametric truncation of
our model. We find that the number of HMM states found by our model is
correlated with the time to the most recent common ancestor in population
bottlenecks. This work demonstrates the flexibility of Bayesian nonparametrics
applied to large and complex genetic data.
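The combination of hierarchical Dirichlet process transitions with a bias toward self transitions can be sketched with a truncated stick-breaking construction, in the style of "sticky" HDP-HMMs; the truncation level, hyperparameters, and the exact form of the self-transition mass kappa are illustrative assumptions and do not reproduce the authors' parameterization.

```python
import numpy as np

rng = np.random.default_rng(2)

def hdp_transition_rows(K, gamma, alpha, kappa, rng):
    """Truncated stick-breaking sketch of an HDP-HMM transition matrix
    with extra self-transition mass kappa.  Global weights beta follow a
    (truncated) GEM(gamma) distribution; row j is drawn from
    Dirichlet(alpha * beta + kappa * e_j)."""
    v = rng.beta(1.0, gamma, size=K)
    beta = v * np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])
    beta /= beta.sum()  # renormalize the truncated sticks
    rows = np.empty((K, K))
    for j in range(K):
        conc = alpha * beta.copy()
        conc[j] += kappa  # bias toward self transitions
        rows[j] = rng.dirichlet(conc)
    return rows
```

Making the rows depend on position along the chromosome would give the nonhomogeneity the abstract describes; this sketch shows only the shared-atom structure and the self-transition bias.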
A hybrid sampler for Poisson-Kingman mixture models
This paper introduces a new Markov chain Monte Carlo scheme for posterior
sampling in Bayesian nonparametric mixture models with priors that belong to
the general Poisson-Kingman class. We present a novel compact way of
representing the infinite-dimensional component of the model that, while
keeping this component explicitly represented, requires less memory and
storage than previous MCMC schemes. We describe comparative simulation results
demonstrating the efficacy of the proposed MCMC algorithm against existing
marginal and conditional MCMC samplers.
An Exact Auxiliary Variable Gibbs Sampler for a Class of Diffusions
Stochastic differential equations (SDEs) or diffusions are continuous-valued
continuous-time stochastic processes widely used in the applied and
mathematical sciences. Simulating paths from these processes is usually an
intractable problem, and typically involves time-discretization approximations.
We propose an exact Markov chain Monte Carlo sampling algorithm that involves
no such time-discretization error. Our sampler is applicable to the problem of
prior simulation from an SDE, posterior simulation conditioned on noisy
observations, as well as parameter inference given noisy observations. Our work
recasts an existing rejection sampling algorithm for a class of diffusions as a
latent variable model, and then derives an auxiliary variable Gibbs sampling
algorithm that targets the associated joint distribution. At a high level, the
resulting algorithm involves two steps: simulating a random grid of times from
an inhomogeneous Poisson process, and updating the SDE trajectory conditioned
on this grid. Our work allows the wide range of Monte Carlo sampling
algorithms developed in the Gaussian process literature to be brought to bear
on applications involving diffusions. We study our method on synthetic and real
datasets, where we demonstrate superior performance over competing methods.
Comment: 37 pages, 13 figures
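The first of the two steps above, simulating a random grid of times from an inhomogeneous Poisson process, is commonly done by thinning a homogeneous process at a dominating rate. A minimal sketch, with an illustrative intensity and bound (the paper's intensities arise from the SDE itself and are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(3)

def thin_poisson(rate, rate_bound, t_max, rng):
    """Sample event times on [0, t_max] from an inhomogeneous Poisson
    process with intensity rate(t) <= rate_bound, by thinning: propose
    candidate times from a homogeneous Poisson(rate_bound) process and
    keep each with probability rate(t) / rate_bound."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / rate_bound)
        if t > t_max:
            return np.array(events)
        if rng.random() < rate(t) / rate_bound:
            events.append(t)
```

Given such a grid, the second step updates the SDE trajectory conditioned on the grid points, which is where the connection to Gaussian process samplers enters.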
Rediscovery of Good-Turing estimators via Bayesian nonparametrics
The problem of estimating discovery probabilities originated in the context
of statistical ecology, and in recent years it has become popular due to its
frequent appearance in challenging applications arising in genetics,
bioinformatics, linguistics, design of experiments, machine learning, etc. A
full range of statistical approaches, parametric and nonparametric as well as
frequentist and Bayesian, has been proposed for estimating discovery
probabilities. In this paper we investigate the relationships between the
celebrated Good-Turing approach, which is a frequentist nonparametric approach
developed in the 1940s, and a Bayesian nonparametric approach recently
introduced in the literature. Specifically, under the assumption of a
two-parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric
estimators of discovery probabilities are asymptotically equivalent, for a
large sample size, to suitably smoothed Good-Turing estimators. As a by-product
of this result, we introduce and investigate a methodology for deriving exact
and asymptotic credible intervals to be associated with the Bayesian
nonparametric estimators of discovery probabilities. The proposed methodology
is illustrated through a comprehensive simulation study and the analysis of
Expressed Sequence Tags data generated by sequencing a benchmark complementary
DNA library.
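Both families of estimators being compared have simple closed forms in their basic versions. Below is a sketch of the classical (unsmoothed) Good-Turing discovery estimator and the two-parameter Poisson-Dirichlet posterior predictive probability of a new species; the smoothing and credible intervals studied in the paper are not reproduced, and the sample and hyperparameters are illustrative.

```python
from collections import Counter

def good_turing_discovery(sample):
    """Classical Good-Turing estimate of the discovery probability: the
    chance the next draw is a previously unseen species is approximately
    n1 / n, where n1 counts species observed exactly once."""
    n1 = sum(1 for c in Counter(sample).values() if c == 1)
    return n1 / len(sample)

def pyp_discovery(sample, sigma, theta):
    """Posterior predictive probability of a new species under a
    two-parameter Poisson-Dirichlet (Pitman-Yor) prior:
    (theta + sigma * k) / (theta + n), with k distinct species in n draws."""
    n, k = len(sample), len(set(sample))
    return (theta + sigma * k) / (theta + n)
```

The paper's result is that, for large samples, the Bayesian estimator behaves like a suitably smoothed version of the frequentist one, which these two formulas make plausible at a glance.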