
1,643 research outputs found

On the Bayesian analysis of species sampling mixture models for density estimation

Author: Griffin Jim E.
Publication venue: University of Warwick. Centre for Research in Statistical Methodology
Publication date: 01/01/2006

The mixture of normals model has been extensively applied to density estimation problems. This paper proposes an alternative parameterisation that naturally leads to new forms of prior distribution. The parameters can be interpreted as the location, scale and smoothness of the density. Priors on these parameters are often easier to specify. Alternatively, improper and default choices lead to automatic Bayesian density estimation. The ideas are extended to multivariate density estimation.
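For orientation, the density that a mixture of normals defines can be sketched as below. This uses the standard weight/mean/standard-deviation parameterisation, not the alternative location/scale/smoothness parameterisation the paper proposes; the function name `normal_mixture_pdf` and the example parameter values are illustrative assumptions.

```python
import numpy as np

def normal_mixture_pdf(x, weights, means, sds):
    """Evaluate f(x) = sum_k w_k * N(x | mu_k, sd_k^2) at each point of x.

    Standard parameterisation only (an assumption for illustration);
    the weights are expected to be non-negative and sum to 1.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]  # shape (n, 1)
    weights = np.asarray(weights, dtype=float)               # shape (K,)
    means = np.asarray(means, dtype=float)                   # shape (K,)
    sds = np.asarray(sds, dtype=float)                       # shape (K,)
    # Component densities, one column per mixture component: shape (n, K).
    comp = np.exp(-0.5 * ((x - means) / sds) ** 2) / (sds * np.sqrt(2.0 * np.pi))
    # Weighted sum over components gives the mixture density: shape (n,).
    return comp @ weights

# Hypothetical two-component example.
grid = np.linspace(-20.0, 20.0, 4001)
pdf = normal_mixture_pdf(grid, weights=[0.3, 0.7], means=[-2.0, 1.0], sds=[0.5, 1.5])
```

In a Bayesian analysis the weights, means and standard deviations would be given priors rather than fixed as here.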

Warwick Research Archives Portal Repository

The Discrete Infinite Logistic Normal Distribution

Author: Blei David; Paisley John; Wang Chong
Publication venue
Publication date: 01/01/2012

We present the discrete infinite logistic normal distribution (DILN), a Bayesian nonparametric prior for mixed membership models. DILN is a generalization of the hierarchical Dirichlet process (HDP) that models correlation structure between the weights of the atoms at the group level. We derive a representation of DILN as a normalized collection of gamma-distributed random variables, and study its statistical properties. We consider applications to topic modeling and derive a variational inference algorithm for approximate posterior inference. We study the empirical performance of the DILN topic model on four corpora, comparing performance with the HDP and the correlated topic model (CTM). To deal with large-scale data sets, we also develop an online inference algorithm for DILN and compare it with online HDP and online LDA on the Nature magazine corpus, which contains approximately 350,000 articles. Comment: This paper will appear in Bayesian Analysis. A shorter version of this paper appeared at AISTATS 2011, Fort Lauderdale, FL, US

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Crossref

Model-based approach for household clustering with mixed scale variables

Author: Canale Antonio; Carmona Christian; Nieto-Barajas Luis
Publication venue: Springer Science and Business Media LLC
Publication date: 23/11/2017

The Ministry of Social Development in Mexico is in charge of creating and assigning social programmes that target specific needs in the population in order to improve quality of life. To better target these programmes, the Ministry aims to find clusters of households with the same needs, based on demographic characteristics as well as the poverty conditions of the household. The available data consist of continuous, ordinal, and nominal variables, all of which come from a non-i.i.d. complex design survey sample. We propose a Bayesian nonparametric mixture model that jointly models a set of latent variables, as in an underlying variable response approach, associated with the observed mixed scale data, and that accommodates the different sampling probabilities. The performance of the model is assessed via simulated data. A full analysis of socio-economic conditions in households in the Mexican State of Mexico is presented.

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Archivio istituzionale della ricerca - Università di Padova

Identifying Mixtures of Mixtures Using Bayesian Estimation

Author: Frühwirth-Schnatter Sylvia; Grün Bettina; Malsiner-Walli Gertraud
Publication venue: Informa UK Limited
Publication date: 20/06/2016

The use of a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is in general achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior whose hyperparameters are carefully selected to reflect the intended cluster structure. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach that resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets. Comment: 49 page

arXiv.org e-Print Archive

Elektronische Publikationen der Wirtschaftsuniversität Wien

FigShare

On nonparametric estimation of a mixing density via the predictive recursion algorithm

Author: Martin Ryan
Publication venue
Publication date: 05/12/2018

Nonparametric estimation of a mixing density based on observations from the corresponding mixture is a challenging statistical problem. This paper surveys the literature on a fast, recursive estimator based on the predictive recursion algorithm. After introducing the algorithm and giving a few examples, I summarize the available asymptotic convergence theory, describe an important semiparametric extension, and highlight two interesting applications. I conclude with a discussion of several recent developments in this area and some open problems. Comment: 22 pages, 5 figures. Comments welcome at https://www.researchers.one/article/2018-12-
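As a rough illustration of the kind of recursive estimator the abstract refers to, the following is a minimal sketch of predictive recursion on a fixed grid. It assumes the commonly stated update f_i(u) = (1 - w_i) f_{i-1}(u) + w_i p(x_i | u) f_{i-1}(u) / m_{i-1}(x_i) with weights w_i = (i + 1)^(-gamma); the function names, the unit-variance normal kernel, the uniform initial guess, the Riemann-sum integration and the value gamma = 0.67 are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def normal_kernel(x, u):
    """p(x | u): normal density with mean u and unit variance (assumed kernel)."""
    return np.exp(-0.5 * (x - u) ** 2) / np.sqrt(2.0 * np.pi)

def predictive_recursion(x, grid, kernel, gamma=0.67):
    """One pass of predictive recursion over the data x.

    grid   : equally spaced points covering the support of the mixing density
    kernel : function (x_i, grid) -> p(x_i | u) evaluated on the grid
    Returns the estimated mixing density evaluated on the grid.
    """
    grid = np.asarray(grid, dtype=float)
    du = grid[1] - grid[0]                 # uniform spacing assumed
    f = np.ones_like(grid)
    f /= np.sum(f) * du                    # uniform initial guess, normalised
    for i, xi in enumerate(x, start=1):
        w = (i + 1.0) ** (-gamma)          # decaying weight sequence
        lik = kernel(xi, grid)
        m = np.sum(lik * f) * du           # current mixture density at xi
        f = (1.0 - w) * f + w * lik * f / m
    return f

# Simulated example: latent u ~ Uniform(-1, 1), observed x = u + N(0, 1) noise.
rng = np.random.default_rng(0)
u_true = rng.uniform(-1.0, 1.0, size=200)
x_obs = u_true + rng.normal(size=200)
grid = np.linspace(-5.0, 5.0, 201)
f_hat = predictive_recursion(x_obs, grid, normal_kernel)
```

Each observation is processed once, so the cost is linear in the sample size. The result depends on the order of the data; averaging the estimate over several random permutations of the observations is a common remedy.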

arXiv.org e-Print Archive