Bayesian mixture labeling and clustering
Label switching is one of the fundamental issues in Bayesian mixture modeling. It occurs because the components are nonidentifiable under symmetric priors. Unless label switching is resolved, the ergodic averages of component-specific quantities, such as the posterior means, predictive component densities, and marginal classification probabilities, will be identical and thus useless for inference about individual components. In this article, we establish the equivalence between labeling and clustering and propose two simple clustering criteria to solve the label switching problem. The first method can be considered an extension of K-means clustering. The second method finds the labels by minimizing the volume of the labeled samples, and it is invariant to scale transformations of the parameters. Using a simulation example and applications to two real data sets, we demonstrate the success of our new methods in dealing with the label switching problem.
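The idea of treating relabeling as clustering can be illustrated with a minimal sketch. This is not the article's algorithm, only a generic K-means-style relabeling under stated assumptions: each component has a single scalar parameter, the number of components K is small enough to enumerate all K! permutations, and the first draw fixes the initial labeling. The function name `relabel_draws` and the simulated data are hypothetical.

```python
import itertools
import numpy as np

def relabel_draws(draws, n_iter=20):
    """Relabel MCMC draws of K component parameters (hypothetical helper).

    draws: array of shape (T, K), one row per posterior draw, with the
    component order possibly permuted from draw to draw.
    Returns (relabeled draws, final reference centers).
    """
    draws = np.asarray(draws, dtype=float)
    T, K = draws.shape
    perms = np.array(list(itertools.permutations(range(K))))  # (K!, K)
    candidates = draws[:, perms]  # (T, K!, K): every relabeling of every draw
    centers = draws[0].copy()     # assumption: the first draw fixes the labeling
    for _ in range(n_iter):
        # Assignment step: pick, per draw, the permutation closest to the centers.
        cost = ((candidates - centers) ** 2).sum(axis=2)  # (T, K!)
        best = cost.argmin(axis=1)
        relabeled = candidates[np.arange(T), best]        # (T, K)
        # Update step: recompute the centers from the relabeled draws.
        new_centers = relabeled.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return relabeled, centers

# Simulated draws of three component means with artificial label switching.
rng = np.random.default_rng(0)
true_means = np.array([-2.0, 0.0, 3.0])
draws = true_means + 0.1 * rng.standard_normal((500, 3))
switched = np.array([row[rng.permutation(3)] for row in draws])
relabeled, centers = relabel_draws(switched)
```

After relabeling, each column of `relabeled` should track a single component, so column averages become meaningful component-specific estimates. Enumerating permutations costs K! per draw, so this sketch only suits small K.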
Statistical inference with anchored Bayesian mixture of regressions models: A case study analysis of allometric data
We present a case study in which we use a mixture of regressions model to
improve on an ill-fitting simple linear regression model relating log brain
mass to log body mass for 100 placental mammalian species. The slope of this
regression model is of particular scientific interest because it corresponds to
a constant that governs a hypothesized allometric power law relating brain mass
to body mass. A specific line of investigation is to determine whether the
regression parameters vary across subgroups of related species.
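The link between the power law and the regression slope can be written out explicitly. The symbols below ($M_{\text{brain}}$, $M_{\text{body}}$, $a$, $b$) are notation assumed here for illustration and are not taken from the abstract.

```latex
M_{\text{brain}} = a \, M_{\text{body}}^{\,b}
\quad\Longrightarrow\quad
\log M_{\text{brain}} = \log a + b \, \log M_{\text{body}}
```

Under this notation, the slope of the log-log regression is the exponent $b$ of the power law, and the intercept is $\log a$.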
We model these data using an anchored Bayesian mixture of regressions model,
which modifies the standard Bayesian Gaussian mixture by pre-assigning small
subsets of observations to given mixture components with probability one. These
observations (called anchor points) break the relabeling invariance typical of
exchangeable model specifications (the so-called label-switching problem). A
careful choice of which observations to pre-classify to which mixture
components is key to the specification of a well-fitting anchor model.
In the article we compare three strategies for the selection of anchor
points. The first assumes that the underlying mixture of regressions model
holds and assigns anchor points to different components to maximize the
information about their labeling. The second makes no assumption about the
relationship between x and y and instead identifies anchor points using a
bivariate Gaussian mixture model. The third strategy begins with the assumption
that there is only one mixture regression component and identifies anchor
points that are representative of a clustering structure based on case-deletion
importance sampling weights. We compare the performance of the three strategies
on the allometric data set and use auxiliary taxonomic information about the
species to evaluate the model-based classifications estimated from these
models.
Better Optimism By Bayes: Adaptive Planning with Rich Models
The computational costs of inference and planning have confined Bayesian
model-based reinforcement learning to one of two dismal fates: powerful
Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian
non-parametric models but using simple, myopic planning strategies such as
Thompson sampling. We ask whether it is feasible and truly beneficial to
combine rich probabilistic models with a closer approximation to fully Bayesian
planning. First, we use a collection of counterexamples to show formal problems
with the over-optimism inherent in Thompson sampling. Then we leverage
state-of-the-art techniques in efficient Bayes-adaptive planning and
non-parametric Bayesian methods to perform qualitatively better than both
existing conventional algorithms and Thompson sampling on two contextual
bandit-like problems.
Comment: 11 pages, 11 figures
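For readers unfamiliar with Thompson sampling, the strategy the abstract above describes as myopic, here is a minimal sketch on a Bernoulli bandit with independent Beta(1, 1) priors. The bandit setting, the function name `thompson_bernoulli`, and the arm probabilities are assumptions chosen for illustration. They are not the models or problems studied in the paper.

```python
import numpy as np

def thompson_bernoulli(true_probs, n_steps, seed=0):
    """Thompson sampling on a Bernoulli bandit (illustrative sketch).

    true_probs: success probability of each arm, unknown to the agent.
    Returns (pull counts, Beta alpha parameters, Beta beta parameters).
    """
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    alpha = np.ones(k)  # Beta(1, 1) prior on each arm: 1 + observed successes
    beta = np.ones(k)   # 1 + observed failures
    pulls = np.zeros(k, dtype=int)
    for _ in range(n_steps):
        # Draw one plausible success rate per arm from its posterior,
        # then act greedily with respect to that single sample.
        sampled = rng.beta(alpha, beta)
        arm = int(sampled.argmax())
        reward = int(rng.random() < true_probs[arm])
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, alpha, beta

pulls, alpha, beta = thompson_bernoulli([0.2, 0.5, 0.8], n_steps=2000)
```

Each step acts on one posterior sample and does not plan over how future observations would change the posterior, which is the sense in which the strategy is myopic compared with Bayes-adaptive planning.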