1,643 research outputs found
On the Bayesian analysis of species sampling mixture models for density estimation
The mixture of normals model has been extensively applied to density estimation problems.
This paper proposes an alternative parameterisation that naturally leads to new forms
of prior distribution. The parameters can be interpreted as the location, scale and smoothness
of the density. Priors on these parameters are often easier to specify. Alternatively, improper
and default choices lead to automatic Bayesian density estimation. The ideas are extended to
multivariate density estimation
The Discrete Infinite Logistic Normal Distribution
We present the discrete infinite logistic normal distribution (DILN), a
Bayesian nonparametric prior for mixed membership models. DILN is a
generalization of the hierarchical Dirichlet process (HDP) that models
correlation structure between the weights of the atoms at the group level. We
derive a representation of DILN as a normalized collection of gamma-distributed
random variables, and study its statistical properties. We consider
applications to topic modeling and derive a variational inference algorithm for
approximate posterior inference. We study the empirical performance of the DILN
topic model on four corpora, comparing performance with the HDP and the
correlated topic model (CTM). To deal with large-scale data sets, we also
develop an online inference algorithm for DILN and compare with online HDP and
online LDA on the Nature magazine, which contains approximately 350,000
articles.Comment: This paper will appear in Bayesian Analysis. A shorter version of
this paper appeared at AISTATS 2011, Fort Lauderdale, FL, US
Model-based approach for household clustering with mixed scale variables
The Ministry of Social Development in Mexico is in charge of creating and assigning social programmes targeting specific needs in the population for the improvement of the quality of life. To better target the social programmes, the Ministry is aimed to find clusters of households with the same needs based on demographic characteristics as well as poverty conditions of the household. Available data consists of continuous, ordinal, and nominal variables, all of which come from a non-i.i.d complex design survey sample. We propose a Bayesian nonparametric mixture model that jointly models a set of latent variables, as in an underlying variable response approach, associated to the observed mixed scale data and accommodates for the different sampling probabilities. The performance of the model is assessed via simulated data. A full analysis of socio-economic conditions in households in the Mexican State of Mexico is presented
Identifying Mixtures of Mixtures Using Bayesian Estimation
The use of a finite mixture of normal distributions in model-based clustering
allows to capture non-Gaussian data clusters. However, identifying the clusters
from the normal components is challenging and in general either achieved by
imposing constraints on the model or by using post-processing procedures.
Within the Bayesian framework we propose a different approach based on sparse
finite mixtures to achieve identifiability. We specify a hierarchical prior
where the hyperparameters are carefully selected such that they are reflective
of the cluster structure aimed at. In addition this prior allows to estimate
the model using standard MCMC sampling methods. In combination with a
post-processing approach which resolves the label switching issue and results
in an identified model, our approach allows to simultaneously (1) determine the
number of clusters, (2) flexibly approximate the cluster distributions in a
semi-parametric way using finite mixtures of normals and (3) identify
cluster-specific parameters and classify observations. The proposed approach is
illustrated in two simulation studies and on benchmark data sets.Comment: 49 page
On nonparametric estimation of a mixing density via the predictive recursion algorithm
Nonparametric estimation of a mixing density based on observations from the
corresponding mixture is a challenging statistical problem. This paper surveys
the literature on a fast, recursive estimator based on the predictive recursion
algorithm. After introducing the algorithm and giving a few examples, I
summarize the available asymptotic convergence theory, describe an important
semiparametric extension, and highlight two interesting applications. I
conclude with a discussion of several recent developments in this area and some
open problems.Comment: 22 pages, 5 figures. Comments welcome at
https://www.researchers.one/article/2018-12-
- âŚ