
    Identifying Mixtures of Mixtures Using Bayesian Estimation

    Using a finite mixture of normal distributions in model-based clustering makes it possible to capture non-Gaussian data clusters. However, identifying the clusters from the normal components is challenging and is generally achieved either by imposing constraints on the model or by using post-processing procedures. Within the Bayesian framework, we propose a different approach based on sparse finite mixtures to achieve identifiability. We specify a hierarchical prior whose hyperparameters are carefully selected to reflect the cluster structure aimed at. In addition, this prior allows the model to be estimated using standard MCMC sampling methods. In combination with a post-processing approach that resolves the label switching issue and results in an identified model, our approach makes it possible to simultaneously (1) determine the number of clusters, (2) flexibly approximate the cluster distributions in a semi-parametric way using finite mixtures of normals, and (3) identify cluster-specific parameters and classify observations. The proposed approach is illustrated in two simulation studies and on benchmark data sets. Comment: 49 pages
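As a rough illustration of the sparsity idea behind sparse finite mixtures, the sketch below compares how many components actually receive observations when the mixture weights are drawn from a symmetric Dirichlet prior with a small versus a large concentration parameter. It is not the hierarchical prior or the MCMC sampler of the abstract above; the function name `mean_filled_components`, the values of `e0`, and the settings `K`, `n_obs`, and `n_draws` are all assumptions chosen for illustration.

```python
import numpy as np

def mean_filled_components(e0, K=10, n_obs=100, n_draws=200, seed=0):
    """Average number of non-empty components under a Dirichlet(e0, ..., e0) prior.

    Hypothetical helper: e0, K, n_obs and n_draws are illustrative values only.
    """
    rng = np.random.default_rng(seed)
    filled = []
    for _ in range(n_draws):
        # Draw mixture weights from the symmetric Dirichlet prior.
        weights = rng.dirichlet(np.full(K, e0))
        # Allocate n_obs observations to the K components.
        counts = rng.multinomial(n_obs, weights)
        filled.append(np.count_nonzero(counts))
    return float(np.mean(filled))

sparse_avg = mean_filled_components(e0=0.01)  # small e0: most components stay empty
dense_avg = mean_filled_components(e0=10.0)   # large e0: nearly all components are used
print(f"average filled components: e0=0.01 -> {sparse_avg:.2f}, e0=10 -> {dense_avg:.2f}")
```

With a small concentration parameter, an overfitted mixture tends to leave superfluous components empty, which is what lets the number of non-empty components act as an estimate of the number of clusters.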

    Parsimonious Shifted Asymmetric Laplace Mixtures

    A family of parsimonious shifted asymmetric Laplace mixture models is introduced. We extend the mixture of factor analyzers model to the shifted asymmetric Laplace distribution. Imposing constraints on the constituent parts of the resulting decomposed component scale matrices leads to a family of parsimonious models. An explicit two-stage parameter estimation procedure is described, and the Bayesian information criterion and the integrated completed likelihood are compared for model selection. This novel family of models is applied to real data, where it is compared to its Gaussian analogue within clustering and classification paradigms.

    Unsupervised Learning via Mixtures of Skewed Distributions with Hypercube Contours

    Mixture models whose components have skewed hypercube contours are developed via a generalization of the multivariate shifted asymmetric Laplace density. Specifically, we develop mixtures of multiple scaled shifted asymmetric Laplace distributions. The component densities have two unique features: they include a multivariate weight function, and the marginal distributions are also asymmetric Laplace. We use these mixtures of multiple scaled shifted asymmetric Laplace distributions for clustering applications, but they could equally well be used in the supervised or semi-supervised paradigms. The expectation-maximization algorithm is used for parameter estimation and the Bayesian information criterion is used for model selection. Simulated and real data sets are used to illustrate the approach and, in some cases, to visualize the skewed hypercube structure of the components.

    Mixtures of Skew-t Factor Analyzers

    In this paper, we introduce a mixture of skew-t factor analyzers as well as a family of mixture models based thereon. The mixture of skew-t distributions model that we use arises as a limiting case of the mixture of generalized hyperbolic distributions. Like its Gaussian and t-distribution analogues, our mixture of skew-t factor analyzers is very well suited to the model-based clustering of high-dimensional data. Imposing constraints on components of the decomposed covariance parameter results in the development of eight flexible models. The alternating expectation-conditional maximization algorithm is used for model parameter estimation and the Bayesian information criterion is used for model selection. The models are applied to both real and simulated data, giving superior clustering results compared to a well-established family of Gaussian mixture models.

    Mixtures of Shifted Asymmetric Laplace Distributions

    A mixture of shifted asymmetric Laplace distributions is introduced and used for clustering and classification. A variant of the EM algorithm is developed for parameter estimation by exploiting the relationship with the generalized inverse Gaussian distribution. This approach is mathematically elegant and relatively computationally straightforward. Our novel mixture modelling approach is demonstrated on both simulated and real data to illustrate clustering and classification applications. In these analyses, our mixture of shifted asymmetric Laplace distributions performs favourably when compared to the popular Gaussian approach. This work, which marks an important step in the non-Gaussian model-based clustering and classification direction, concludes with discussion as well as suggestions for future work.
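To make the shifted asymmetric Laplace distribution more concrete, the sketch below draws samples from a single component using its usual normal variance-mean mixture representation, X = mu + W*alpha + sqrt(W)*V with W ~ Exp(1) and V ~ N(0, Sigma). That representation is assumed here as the standard one for this distribution; the abstract itself does not spell it out. The function name `sample_sal` and the parameter values are hypothetical, and this is a sampling sketch only, not the EM variant described above.

```python
import numpy as np

def sample_sal(n, mu, alpha, sigma, seed=0):
    """Draw n samples via X = mu + W*alpha + sqrt(W)*V, W ~ Exp(1), V ~ N(0, sigma).

    Hypothetical helper based on the assumed mixture representation.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Cholesky factor L with sigma = L @ L.T, used to colour standard normal draws.
    chol = np.linalg.cholesky(np.asarray(sigma, dtype=float))
    w = rng.exponential(scale=1.0, size=n)    # latent mixing variable W
    z = rng.standard_normal((n, mu.size))     # standard normal draws, one row per sample
    return mu + w[:, None] * alpha + np.sqrt(w)[:, None] * (z @ chol.T)

# Illustrative (assumed) parameter values.
mu = [0.0, 0.0]
alpha = [2.0, -1.0]
sigma = [[1.0, 0.3], [0.3, 1.0]]
x = sample_sal(200_000, mu, alpha, sigma)
print("sample mean:", x.mean(axis=0))         # should be near mu + alpha, since E[W] = 1
print("sample covariance:\n", np.cov(x.T))    # should be near sigma + outer(alpha, alpha)
```

Under this representation the skewness enters through `alpha`: setting it to zero gives a symmetric, heavier-tailed-than-Gaussian component centred at `mu`.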

    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed. Comment: 61 pages
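A small example of why Euclidean methods can fail on the unit circle: the arithmetic mean of the angles 350 and 10 degrees is 180 degrees, which points opposite to both observations, whereas the circular mean (the direction of the averaged unit vectors) is close to 0 degrees. The sketch below is illustrative only and is not taken from the review; the function name `circular_mean` is an assumption.

```python
import numpy as np

def circular_mean(angles_rad):
    """Mean direction of angles in radians, returned in (-pi, pi]."""
    angles = np.asarray(angles_rad, dtype=float)
    # Average the unit vectors (cos, sin) and take the direction of the result.
    return float(np.arctan2(np.sin(angles).mean(), np.cos(angles).mean()))

angles = np.deg2rad([350.0, 10.0])
naive_deg = float(np.rad2deg(angles.mean()))          # arithmetic mean: about 180 degrees
circular_deg = float(np.rad2deg(circular_mean(angles)))  # circular mean: about 0 degrees
print(f"arithmetic mean: {naive_deg:.1f} deg, circular mean: {circular_deg:.1f} deg")
```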