Identifying Mixtures of Mixtures Using Bayesian Estimation
The use of a finite mixture of normal distributions in model-based clustering
makes it possible to capture non-Gaussian data clusters. However, identifying the
clusters from the normal components is challenging and is generally achieved
either by imposing constraints on the model or by using post-processing procedures.
Within the Bayesian framework we propose a different approach based on sparse
finite mixtures to achieve identifiability. We specify a hierarchical prior
where the hyperparameters are carefully selected to reflect the cluster
structure aimed at. In addition, this prior allows the model to be estimated
using standard MCMC sampling methods. In combination with a
post-processing approach which resolves the label switching issue and results
in an identified model, our approach makes it possible to simultaneously (1) determine the
number of clusters, (2) flexibly approximate the cluster distributions in a
semi-parametric way using finite mixtures of normals and (3) identify
cluster-specific parameters and classify observations. The proposed approach is
illustrated in two simulation studies and on benchmark data sets.
Comment: 49 page
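The sparsity idea in this abstract can be illustrated with a small simulation. The sketch below is not the authors' sampler: it only draws component weights from a symmetric Dirichlet prior and counts how many components of a deliberately overfitted mixture receive at least one observation. The function names, the sample size, the number of components and the two concentration values `e0` are all assumptions chosen for illustration.

```python
import numpy as np

def count_filled_components(n_obs, n_components, e0, rng):
    """Draw weights from a symmetric Dirichlet(e0) prior, allocate n_obs
    observations to components, and count the non-empty components."""
    weights = rng.dirichlet(np.full(n_components, e0))
    allocations = rng.choice(n_components, size=n_obs, p=weights)
    return np.unique(allocations).size

def mean_filled(n_obs, n_components, e0, n_draws=500, seed=0):
    """Average number of non-empty components over repeated prior draws."""
    rng = np.random.default_rng(seed)
    counts = [count_filled_components(n_obs, n_components, e0, rng)
              for _ in range(n_draws)]
    return float(np.mean(counts))

# Hypothetical settings: 200 observations, 10 components (more than needed).
sparse = mean_filled(n_obs=200, n_components=10, e0=0.01)  # small e0: sparse prior
dense = mean_filled(n_obs=200, n_components=10, e0=4.0)    # large e0: all components used
print(f"average filled components, e0=0.01: {sparse:.2f}")
print(f"average filled components, e0=4.0:  {dense:.2f}")
```

Under these assumed settings the small-`e0` prior leaves most components empty, which is the mechanism that lets the number of non-empty components act as an estimate of the number of clusters.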
Parsimonious Shifted Asymmetric Laplace Mixtures
A family of parsimonious shifted asymmetric Laplace mixture models is
introduced. We extend the mixture of factor analyzers model to the shifted
asymmetric Laplace distribution. Imposing constraints on the constituent parts
of the resulting decomposed component scale matrices leads to a family of
parsimonious models. An explicit two-stage parameter estimation procedure is
described, and the Bayesian information criterion and the integrated completed
likelihood are compared for model selection. This novel family of models is
applied to real data, where it is compared to its Gaussian analogue within
clustering and classification paradigms.
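For orientation, the two model selection criteria named in this abstract are commonly written as follows in the model-based clustering literature. The notation is an assumption and is not taken from the paper: \ell(\hat{\vartheta}) is the maximized log-likelihood, \rho the number of free parameters, n the sample size, G the number of components, and \hat{z}_{ig} the estimated posterior probability that observation i belongs to component g. Under this sign convention larger values are preferred; some authors use the opposite sign.

```latex
\mathrm{BIC} = 2\,\ell(\hat{\vartheta}) - \rho \log n,
\qquad
\mathrm{ICL} \approx \mathrm{BIC}
  + 2 \sum_{i=1}^{n} \sum_{g=1}^{G}
    \mathrm{MAP}\{\hat{z}_{ig}\} \log \hat{z}_{ig},
```

where \mathrm{MAP}\{\hat{z}_{ig}\} equals 1 if g maximizes \hat{z}_{ig} over the components for observation i and 0 otherwise. The added term is non-positive, so the ICL penalizes solutions whose classifications are uncertain.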
Unsupervised Learning via Mixtures of Skewed Distributions with Hypercube Contours
Mixture models whose components have skewed hypercube contours are developed
via a generalization of the multivariate shifted asymmetric Laplace density.
Specifically, we develop mixtures of multiple scaled shifted asymmetric Laplace
distributions. The component densities have two unique features: they include a
multivariate weight function, and the marginal distributions are also
asymmetric Laplace. We use these mixtures of multiple scaled shifted asymmetric
Laplace distributions for clustering applications, but they could equally well
be used in the supervised or semi-supervised paradigms. The
expectation-maximization algorithm is used for parameter estimation and the
Bayesian information criterion is used for model selection. Simulated and real
data sets are used to illustrate the approach and, in some cases, to visualize
the skewed hypercube structure of the components.
Mixtures of Skew-t Factor Analyzers
In this paper, we introduce a mixture of skew-t factor analyzers as well as a
family of mixture models based thereon. The mixture of skew-t distributions
model that we use arises as a limiting case of the mixture of generalized
hyperbolic distributions. Like their Gaussian and t-distribution analogues, our
mixtures of skew-t factor analyzers are well suited to the model-based
clustering of high-dimensional data. Imposing constraints on components of the
decomposed covariance parameter results in the development of eight flexible
models. The alternating expectation-conditional maximization algorithm is used
for model parameter estimation and the Bayesian information criterion is used
for model selection. The models are applied to both real and simulated data,
giving superior clustering results compared to a well-established family of
Gaussian mixture models.
Mixtures of Shifted Asymmetric Laplace Distributions
A mixture of shifted asymmetric Laplace distributions is introduced and used
for clustering and classification. A variant of the EM algorithm is developed
for parameter estimation by exploiting the relationship with the generalized
inverse Gaussian distribution. This approach is mathematically elegant and
relatively computationally straightforward. Our novel mixture modelling
approach is demonstrated on both simulated and real data to illustrate
clustering and classification applications. In these analyses, our mixture of
shifted asymmetric Laplace distributions performs favourably when compared to
the popular Gaussian approach. This work, which marks an important step in the
non-Gaussian model-based clustering and classification direction, concludes
with a discussion as well as suggestions for future work.
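A shifted asymmetric Laplace random vector is usually described as a normal variance-mean mixture with an exponential mixing variable, and it is this latent variable that links the model to the generalized inverse Gaussian distribution mentioned above. The sketch below only simulates from a single component under that assumed representation, X = mu + W * alpha + sqrt(W) * V with W ~ Exp(1) and V ~ N(0, Sigma). It is not the EM variant from the paper, and the function name and parameter values are hypothetical.

```python
import numpy as np

def sample_sal(n, mu, alpha, sigma, rng):
    """Simulate n draws from a shifted asymmetric Laplace distribution using
    the assumed representation X = mu + W*alpha + sqrt(W)*V,
    with W ~ Exp(1) and V ~ N(0, sigma)."""
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    w = rng.exponential(scale=1.0, size=n)           # latent mixing variable
    chol = np.linalg.cholesky(sigma)                 # sigma = chol @ chol.T
    v = rng.standard_normal((n, mu.size)) @ chol.T   # rows ~ N(0, sigma)
    return mu + w[:, None] * alpha + np.sqrt(w)[:, None] * v

# Hypothetical parameter values for a two-dimensional example.
mu = [1.0, -1.0]                   # shift (location)
alpha = [2.0, 0.5]                 # skewness direction
sigma = [[1.0, 0.3], [0.3, 2.0]]   # scale matrix
samples = sample_sal(50_000, mu, alpha, sigma, np.random.default_rng(0))
print("sample mean:", samples.mean(axis=0))
```

Because E[W] = 1 and Var(W) = 1, the representation implies a mean of mu + alpha and a covariance of Sigma + alpha alpha^T, which gives a quick sanity check on simulated data.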
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments are discussed.
Comment: 61 page
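A small example shows why methods built for Euclidean data do not carry over directly to directions. The sketch below computes the standard circular mean direction and mean resultant length for two angles on either side of 0 degrees; the function name and the sample angles are chosen purely for illustration and do not come from the review.

```python
import numpy as np

def circular_mean(angles_rad):
    """Return the mean direction in (-pi, pi] and the mean resultant length
    of a sample of angles given in radians."""
    angles_rad = np.asarray(angles_rad, dtype=float)
    c = np.cos(angles_rad).mean()   # average of the unit vectors' x-components
    s = np.sin(angles_rad).mean()   # average of the unit vectors' y-components
    return np.arctan2(s, c), np.hypot(c, s)

# Two directions close to north, measured in degrees.
angles_deg = np.array([350.0, 10.0])
direction, resultant_length = circular_mean(np.deg2rad(angles_deg))

print("arithmetic mean (degrees):", angles_deg.mean())
print("circular mean (degrees):  ", np.rad2deg(direction))
print("mean resultant length:    ", resultant_length)
```

The arithmetic mean of the two angles points in the opposite direction to the data, whereas the circular mean averages the corresponding unit vectors and therefore respects the wrap-around at 360 degrees.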