Relabelling Algorithms for Large Dataset Mixture Models

Abstract

Mixture models are flexible tools for density estimation and classification problems. Bayesian estimation of such models typically relies on sampling from the posterior distribution using Markov chain Monte Carlo. Label switching arises because the posterior is invariant to permutations of the component parameters. Methods for dealing with label switching have been studied fairly extensively in the literature, with the most popular approaches being those based on loss functions. However, many of these algorithms turn out to be too slow in practice, and can become infeasible as the size and dimension of the data grow. In this article, we review earlier solutions that scale well to large data sets, and compare their performance on simulated and real data sets. In addition, we propose a new, computationally efficient algorithm based on a loss function interpretation, and show that it scales well to larger problems. We conclude with a discussion of, and recommendations for, all the methods studied.
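The permutation invariance behind label switching can be illustrated with a minimal sketch. It is not taken from the article: the data, the three-component Gaussian mixture, and the function names `normal_pdf` and `mixture_loglik` are all hypothetical, chosen only for illustration. The sketch checks the likelihood alone; the posterior inherits the same invariance when the prior is also symmetric across components.

```python
import math
from itertools import permutations


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of the data under a finite Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(
            w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total


# Hypothetical data and parameter values (assumptions for illustration only).
data = [-2.1, -1.8, 0.1, 2.9, 3.2, 3.0]
weights = [0.3, 0.2, 0.5]
means = [-2.0, 0.0, 3.0]
sds = [0.5, 1.0, 0.4]

# Evaluate the log-likelihood under every relabelling of the components.
logliks = []
for perm in permutations(range(len(weights))):
    logliks.append(
        mixture_loglik(
            data,
            [weights[k] for k in perm],
            [means[k] for k in perm],
            [sds[k] for k in perm],
        )
    )

# All relabellings agree up to floating-point rounding.
invariant = all(math.isclose(v, logliks[0], rel_tol=1e-12) for v in logliks)
print(len(logliks), invariant)
```

Because every relabelling gives the same likelihood value, a sampler has no reason to keep one labelling fixed across iterations, which is why component-wise posterior summaries need a relabelling step first.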
