3,256 research outputs found
Particle Learning for General Mixtures
This paper develops particle learning (PL) methods for the estimation of general mixture models. The approach is distinguished from alternative particle filtering methods in two major ways. First, each iteration begins by resampling particles according to posterior predictive probability, leading to a more efficient set for propagation. Second, each particle tracks only the "essential state vector", leading to reduced-dimensional inference. In addition, we describe how the approach applies to more general mixture models of current interest in the literature; it is hoped that this will inspire a greater number of researchers to adopt sequential Monte Carlo methods for fitting their sophisticated mixture-based models. Finally, we show that PL leads to straightforward tools for marginal likelihood calculation and posterior cluster allocation.
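As a rough illustration of the resample-propagate pattern sketched above, the following Python fragment shows one PL iteration for a generic mixture. It is a minimal sketch, not the authors' implementation; the predictive and propagate callables are hypothetical placeholders that a concrete model would supply.

    import numpy as np

    def pl_step(particles, y, predictive, propagate, rng):
        """One particle-learning iteration (schematic).

        particles  -- list of essential state vectors (e.g. sufficient statistics)
        y          -- the new observation
        predictive -- callable p(y | particle): posterior predictive density
        propagate  -- callable drawing the updated essential state given (particle, y)
        """
        # Resample first: weights proportional to the posterior predictive of y
        w = np.array([predictive(p, y) for p in particles])
        w /= w.sum()
        idx = rng.choice(len(particles), size=len(particles), p=w)
        # Then propagate each resampled particle using the new observation
        return [propagate(particles[i], y) for i in idx]

The resample-then-propagate order is the point: particles that explain the new observation well are duplicated before propagation, which is what the abstract means by a "more efficient set for propagation".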
Colouring and breaking sticks: random distributions and heterogeneous clustering
We begin by reviewing some probabilistic results about the Dirichlet Process
and its close relatives, focussing on their implications for statistical
modelling and analysis. We then introduce a class of simple mixture models in
which clusters are of different `colours', with statistical characteristics
that are constant within colours, but different between colours. Thus cluster
identities are exchangeable only within colours. The basic form of our model is
a variant on the familiar Dirichlet process, and we find that much of the
standard modelling and computational machinery associated with the Dirichlet
process may be readily adapted to our generalisation. The methodology is
illustrated with an application to the partially-parametric clustering of gene
expression profiles.

Comment: 26 pages, 3 figures. Chapter 13 of "Probability and Mathematical Genetics: Papers in Honour of Sir John Kingman" (Editors N.H. Bingham and C.M. Goldie), Cambridge University Press, 2010.
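For readers less familiar with the background, the stick-breaking representation of the Dirichlet process on which this kind of construction builds is (standard material, not specific to this paper):

    G = \sum_{k=1}^{\infty} w_k \delta_{\theta_k}, \qquad
    w_k = v_k \prod_{j<k} (1 - v_j), \qquad
    v_k \sim \mathrm{Beta}(1, \alpha), \quad \theta_k \sim G_0,

so each atom's weight is a piece broken off the remaining stick. The paper's variant then attaches colour labels to the clusters, so that cluster identities are exchangeable only within colours.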
Bayesian Nonparametric Calibration and Combination of Predictive Distributions
We introduce a Bayesian approach to predictive density calibration and
combination that accounts for parameter uncertainty and model set
incompleteness through the use of random calibration functionals and random
combination weights. Building on the work of Ranjan and Gneiting (2010) and Gneiting and Ranjan (2013), we use infinite beta mixtures for the
calibration. The proposed Bayesian nonparametric approach takes advantage of
the flexibility of Dirichlet process mixtures to achieve any continuous
deformation of linearly combined predictive distributions. The inference
procedure is based on Gibbs sampling and allows accounting for uncertainty in
the number of mixture components, mixture weights, and calibration parameters.
The weak posterior consistency of the Bayesian nonparametric calibration is established under suitable conditions on the unknown true density. We study the
methodology in simulation examples with fat tails and multimodal densities and
apply it to density forecasts of daily S&P returns and daily maximum wind speed
at the Frankfurt airport.

Comment: arXiv admin note: text overlap with arXiv:1305.2026 by other authors.
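Schematically, and in the spirit of the beta-calibration approach of Ranjan and Gneiting cited above, the combined and calibrated predictive CDF takes a form along the lines of (our notation, not the paper's):

    H(y) = \sum_{k=1}^{\infty} \pi_k \, B_{a_k, b_k}\Big( \sum_{j=1}^{M} w_j F_j(y) \Big),

where the F_j are the individual predictive CDFs, the w_j are combination weights, B_{a,b} denotes the Beta(a, b) CDF, and the infinite beta mixture (\pi_k, a_k, b_k) carries a Dirichlet process mixture prior, which is what yields an arbitrary continuous deformation of the linear combination.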
Introduction to finite mixtures
Mixture models have been around for over 150 years, as an intuitively simple
and practical tool for enriching the collection of probability distributions
available for modelling data. In this chapter we describe the basic ideas of
the subject, present several alternative representations and perspectives on
these models, and discuss some of the elements of inference about the unknowns
in the models. Our focus is on the simplest set-up, of finite mixture models,
but we discuss also how various simplifying assumptions can be relaxed to
generate the rich landscape of modelling and inference ideas traversed in the
rest of this book.

Comment: 14 pages, 7 figures. A chapter prepared for the forthcoming Handbook of Mixture Analysis. V2 corrects a small but important typographical error and makes other minor edits; V3 makes further minor corrections and updates following review; V4 corrects algorithmic details in sec 4.1 and 4.2, and removes typos.
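The basic object of the chapter is the finite mixture density, in standard notation:

    f(y) = \sum_{k=1}^{K} \eta_k \, f_k(y \mid \theta_k), \qquad
    \eta_k \ge 0, \quad \sum_{k=1}^{K} \eta_k = 1,

with K components, mixture weights \eta_k, and component densities f_k; inference concerns the weights, the component parameters \theta_k, and often K itself.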
Dynamic density estimation with diffusive Dirichlet mixtures
We introduce a new class of nonparametric prior distributions on the space of
continuously varying densities, induced by Dirichlet process mixtures which
diffuse in time. These select time-indexed random functions without jumps,
whose sections are continuous or discrete distributions depending on the choice
of kernel. The construction exploits the widely used stick-breaking
representation of the Dirichlet process and induces the time dependence by
replacing the stick-breaking components with one-dimensional Wright-Fisher
diffusions. The resulting model combines appealing properties inherited from the Wright-Fisher diffusions and the Dirichlet mixture structure with great flexibility and tractability for posterior computation. The construction
can be easily extended to multi-parameter GEM marginal states, which include,
for example, the Pitman--Yor process. A full inferential strategy is detailed
and illustrated on simulated and real data.

Comment: Published at http://dx.doi.org/10.3150/14-BEJ681 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
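In outline, the construction replaces the static stick-breaking fractions of the Dirichlet process with diffusing ones, so that at each time t (schematic notation, ours):

    G_t = \sum_{k=1}^{\infty} w_k(t) \, \delta_{\theta_k}, \qquad
    w_k(t) = v_k(t) \prod_{j<k} (1 - v_j(t)),

where each v_k(t) evolves as a one-dimensional Wright-Fisher diffusion whose stationary law matches the usual Beta stick-breaking distribution, so that each section G_t is marginally of Dirichlet (or, more generally, GEM) type while varying continuously in t.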
Bayesian nonparametric estimation and consistency of mixed multinomial logit choice models
This paper develops nonparametric estimation for discrete choice models based
on the mixed multinomial logit (MMNL) model. It has been shown that MMNL models
encompass all discrete choice models derived under the assumption of random
utility maximization, subject to the identification of an unknown mixing distribution. Noting the mixture-model description of the MMNL, we employ a Bayesian nonparametric approach, using nonparametric priors on the unknown mixing distribution, to estimate choice probabilities. We provide important
theoretical support for the use of the proposed methodology by investigating
consistency of the posterior distribution for a general nonparametric prior on
the mixing distribution. Consistency is defined according to an L1-type
distance on the space of choice probabilities and is achieved by extending to a
regression model framework a recent approach to strong consistency based on the
summability of square roots of prior probabilities. Moving to estimation,
slightly different techniques for non-panel and panel data models are
discussed. For practical implementation, we describe efficient and relatively
easy-to-use blocked Gibbs sampling procedures. These procedures are based on
approximations of the random probability measure by classes of finite
stick-breaking processes. A simulation study is also performed to investigate
the performance of the proposed methods.

Comment: Published at http://dx.doi.org/10.3150/09-BEJ233 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
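To give a flavour of the finite stick-breaking approximation on which such blocked Gibbs samplers rest, here is a minimal Python sketch of a truncated stick-breaking draw; it is generic, not the paper's sampler.

    import numpy as np

    def truncated_stick_breaking(alpha, N, base_draw, rng):
        """Finite stick-breaking approximation to a Dirichlet process.

        alpha     -- concentration parameter
        N         -- truncation level (number of atoms kept)
        base_draw -- callable returning one draw from the base measure G0
        """
        v = rng.beta(1.0, alpha, size=N)
        v[-1] = 1.0  # close the stick so the N weights sum to one
        w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
        atoms = [base_draw() for _ in range(N)]
        return w, atoms

In a blocked Gibbs sampler, this finite representation lets the weights, the atoms, and the latent cluster allocations be updated as conditionally conjugate blocks, which is what makes the procedures relatively easy to use.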
msBP: An R package to perform Bayesian nonparametric inference using multiscale Bernstein polynomials mixtures
msBP is an R package that implements a new method to perform Bayesian multiscale nonparametric inference introduced by Canale and Dunson (2016). The method, based on mixtures of multiscale beta dictionary densities, overcomes the drawbacks of Pólya trees and inherits many of the advantages of Dirichlet process mixture models. The key idea is that an infinitely deep binary tree is introduced, with a beta dictionary density assigned to each node of the tree. Using a multiscale stick-breaking characterization, stochastically decreasing weights are assigned to each node. The result is an infinite mixture model. The package msBP implements a series of basic functions to deal with this family of priors, such as random density and number generation, creation and manipulation of binary tree objects, and generic functions to plot and print the results. In addition, it implements the Gibbs samplers for posterior computation to perform multiscale density estimation and multiscale testing of group differences described in Canale and Dunson (2016).
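A rough sketch of the multiscale stick-breaking idea is given below: mass enters at the root of a binary tree, each node keeps a Beta-distributed fraction of what reaches it, and the remainder is split between its two children. This is our simplification for illustration; see Canale and Dunson (2016) and the package documentation for the exact construction and parametrization.

    import numpy as np

    def multiscale_weights(a, b, depth, rng):
        """Stick-breaking weights over the nodes of a binary tree (schematic).

        Each node stops with probability S ~ Beta(1, a); surviving mass
        descends right with probability R ~ Beta(b, b). Returns a dict
        mapping (scale, node) to its weight, down to the given depth.
        """
        weights = {}
        mass = {(0, 0): 1.0}  # all mass enters at the root
        for _ in range(depth):
            next_mass = {}
            for (s, h), m in mass.items():
                S = rng.beta(1.0, a)  # stop at this node
                R = rng.beta(b, b)    # go right rather than left
                weights[(s, h)] = m * S
                next_mass[(s + 1, 2 * h)] = m * (1.0 - S) * (1.0 - R)
                next_mass[(s + 1, 2 * h + 1)] = m * (1.0 - S) * R
            mass = next_mass
        return weights

Weights decay stochastically with depth because each level multiplies in another (1 - S) factor, which is what makes the resulting infinite mixture well defined.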
Sparse covariance estimation in heterogeneous samples
Standard Gaussian graphical models (GGMs) implicitly assume that the
conditional independence among variables is common to all observations in the
sample. However, in practice, observations are usually collected from heterogeneous populations where this assumption is not satisfied, leading in
turn to nonlinear relationships among variables. To tackle these problems we
explore mixtures of GGMs; in particular, we consider both infinite mixture
models of GGMs and infinite hidden Markov models with GGM emission
distributions. Such models allow us to divide a heterogeneous population into
homogeneous groups, with each cluster having its own conditional independence
structure. The main advantage of considering infinite mixtures is that they
allow us to easily estimate the number of subpopulations in the
sample. As an illustration, we study the trends in exchange rate fluctuations
in the pre-Euro era. This example demonstrates that the models are very flexible while providing extremely interesting insights into real-life applications.
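In schematic terms, such a mixture of GGMs models each observation as (our notation):

    p(x) = \sum_{k=1}^{\infty} \pi_k \, \mathrm{N}(x \mid \mu_k, \Omega_k^{-1}),

where each component has its own sparse precision matrix \Omega_k, whose zero pattern encodes that cluster's conditional independence graph, and the weights \pi_k are given a Dirichlet-process-type prior so that the number of occupied clusters is inferred from the data; the hidden Markov variant additionally lets the cluster assignment evolve over time.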