Dealing with Label Switching in Mixture Models Under Genuine Multimodality
The fitting of finite mixture models is an ill-defined estimation problem, as completely different parameterizations can induce similar mixture distributions. This leads to multiple modes in the likelihood, which is a problem for frequentist maximum likelihood estimation and complicates statistical inference from Markov chain Monte Carlo draws in Bayesian estimation. For the analysis of the posterior density of these draws, a suitable separation into different modes is desirable. In addition, a unique labelling of the component-specific estimates is necessary to solve the label switching problem. This paper presents and compares two approaches to achieve these goals: relabelling under multimodality and constrained clustering. The algorithmic details are discussed and their application is demonstrated on artificial and real-world data.
Mixed LICORS: A Nonparametric Algorithm for Predictive State Reconstruction
We introduce 'mixed LICORS', an algorithm for learning nonlinear,
high-dimensional dynamics from spatio-temporal data, suitable for both
prediction and simulation. Mixed LICORS extends the recent LICORS algorithm
(Goerg and Shalizi, 2012) from hard clustering of predictive distributions to a
non-parametric, EM-like soft clustering. This retains the asymptotic predictive
optimality of LICORS, but, as we show in simulations, greatly improves
out-of-sample forecasts with limited data. The new method is implemented in the
publicly-available R package "LICORS"
(http://cran.r-project.org/web/packages/LICORS/). Comment: 11 pages; AISTATS 201
Multi-view constrained clustering with an incomplete mapping between views
Multi-view learning algorithms typically assume a complete bipartite mapping
between the different views in order to exchange information during the
learning process. However, many applications provide only a partial mapping
between the views, creating a challenge for current methods. To address this
problem, we propose a multi-view algorithm based on constrained clustering that
can operate with an incomplete mapping. Given a set of pairwise constraints in
each view, our approach propagates these constraints using a local similarity
measure to those instances that can be mapped to the other views, allowing the
propagated constraints to be transferred across views via the partial mapping.
It uses co-EM to iteratively estimate the propagation within each view based on
the current clustering model, transfer the constraints across views, and then
update the clustering model. By alternating the learning process between views,
this approach produces a unified clustering model that is consistent with all
views. We show that this approach significantly improves clustering performance
over several other methods for transferring constraints and allows multi-view
clustering to be reliably applied when given a limited mapping between the
views. Our evaluation reveals that the propagated constraints have high
precision with respect to the true clusters in the data, explaining their
benefit to clustering performance in both single- and multi-view learning
scenarios.
Relabelling Algorithms for Large Dataset Mixture Models
Mixture models are flexible tools in density estimation and classification
problems. Bayesian estimation of such models typically relies on sampling from
the posterior distribution using Markov chain Monte Carlo. Label switching
arises because the posterior is invariant to permutations of the component
parameters. Methods for dealing with label switching have been studied fairly
extensively in the literature, with the most popular approaches being those
based on loss functions. However, many of these algorithms turn out to be too
slow in practice, and can be infeasible as the size and dimension of the data
grow. In this article, we review earlier solutions that scale well to
large data sets, and compare their performance on simulated and real datasets.
In addition, we propose a new, computationally efficient algorithm based on
a loss function interpretation, and show that it scales well to larger
problems. We conclude with a discussion of, and recommendations for, all the
methods studied.
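The permutation invariance behind label switching can be illustrated with a minimal sketch. It is not the algorithm proposed in any of the abstracts listed here. The data and parameter values are made up for illustration, and the helper names `mixture_loglik`, `permute`, and `relabel_by_mean` are hypothetical. The relabelling step uses a simple ordering constraint on the component means, a basic identifiability device, rather than a loss-function-based method.

```python
import math
from itertools import permutations


def mixture_loglik(data, weights, means, sds):
    """Log-likelihood of a univariate Gaussian mixture."""
    total = 0.0
    for x in data:
        density = sum(
            w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for w, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total


def permute(params, order):
    """Reorder a list of component parameters by the given index order."""
    return [params[i] for i in order]


# Illustrative values only (assumed, not taken from any of the papers).
data = [-2.1, -1.9, 0.1, 0.3, 3.8, 4.2]
weights = [0.3, 0.3, 0.4]
means = [-2.0, 0.0, 4.0]
sds = [0.5, 0.7, 1.0]

# Every relabelling of the three components gives the same likelihood,
# so a symmetric posterior cannot tell the labellings apart.
logliks = [
    mixture_loglik(data, permute(weights, p), permute(means, p), permute(sds, p))
    for p in permutations(range(3))
]


def relabel_by_mean(weights, means, sds):
    """Impose the ordering constraint mean_1 <= mean_2 <= ... on one draw."""
    order = sorted(range(len(means)), key=lambda k: means[k])
    return permute(weights, order), permute(means, order), permute(sds, order)


# A draw whose labels were switched is mapped back to the canonical order.
switched = (2, 0, 1)
relabelled = relabel_by_mean(
    permute(weights, switched), permute(means, switched), permute(sds, switched)
)
```

An ordering constraint of this kind is cheap to apply per draw, but it can perform poorly when component means overlap, which is one reason loss-function-based relabelling is often preferred.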