
    From here to infinity - sparse finite versus Dirichlet process mixtures in model-based clustering

    In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner Walli et al. (2016) is that of sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with K components is chosen in such a way that, a priori, the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data. The present paper makes the following contributions in the context of sparse finite mixture modelling. First, it is illustrated that the concept of sparse finite mixtures is very generic and easily extended to cluster various types of non-Gaussian data, in particular discrete data and continuous multivariate data arising from non-Gaussian clusters. Second, sparse finite mixtures are compared to Dirichlet process mixtures with respect to their ability to identify the number of clusters. For both model classes, a random hyper prior is considered for the parameters determining the weight distribution. By suitable matching of these priors, it is shown that the choice of this hyper prior is far more influential on the cluster solution than whether a sparse finite mixture or a Dirichlet process mixture is taken into consideration. Comment: Accepted versio
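    The sparsity idea in this abstract can be illustrated with a small prior simulation. The sketch below is not taken from the paper: it assumes a symmetric Dirichlet prior with concentration `e0` on the K mixture weights, and it allocates N observations sequentially through the Dirichlet-multinomial (Pólya urn) predictive rule, so the weights never need to be sampled explicitly. The function names, the value N = 100, and the two `e0` values are illustrative assumptions.

```python
import random


def prior_num_clusters(n_obs, n_components, e0, rng):
    """Draw the number of non-empty components under the prior.

    Assumption: weights ~ symmetric Dirichlet(e0, ..., e0). Observation i
    joins component k with probability proportional to e0 + n_k, which is
    the Dirichlet-multinomial predictive rule.
    """
    counts = [0] * n_components
    for _ in range(n_obs):
        weights = [e0 + c for c in counts]
        k = rng.choices(range(n_components), weights=weights)[0]
        counts[k] += 1
    return sum(1 for c in counts if c > 0)


def mean_clusters(n_obs, n_components, e0, draws=2000, seed=0):
    """Monte Carlo estimate of the prior mean number of filled components."""
    rng = random.Random(seed)
    total = sum(
        prior_num_clusters(n_obs, n_components, e0, rng) for _ in range(draws)
    )
    return total / draws


if __name__ == "__main__":
    # Hypothetical settings: K = 10 components, N = 100 observations.
    for e0 in (0.01, 4.0):
        print(f"e0={e0}: mean filled components = {mean_clusters(100, 10, e0):.2f}")
```

    Under these assumptions, a small `e0` leaves most of the K components empty a priori, while a large `e0` fills nearly all of them, which is the behaviour the abstract describes as the number of clusters being smaller than K with high probability.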

    Model Selection for Gaussian Mixture Models

    This paper is concerned with an important issue in finite mixture modelling: the selection of the number of mixing components. We propose a new penalized likelihood method for model selection of finite multivariate Gaussian mixture models. The proposed method is shown to be statistically consistent in determining the number of components. A modified EM algorithm is developed to simultaneously select the number of components and to estimate the mixing weights, i.e. the mixing probabilities, and the unknown parameters of the Gaussian distributions. Simulations and a real data analysis are presented to illustrate the performance of the proposed method.

    A Tight Convex Upper Bound on the Likelihood of a Finite Mixture

    The likelihood function of a finite mixture model is a non-convex function with multiple local maxima, and commonly used iterative algorithms such as EM will converge to different solutions depending on initial conditions. In this paper we ask: is it possible to assess how far we are from the global maximum of the likelihood? Since the likelihood of a finite mixture model can grow unboundedly by centering a Gaussian on a single datapoint and shrinking the covariance, we constrain the problem by assuming that the parameters of the individual models are members of a large discrete set (e.g. estimating a mixture of two Gaussians where the means and variances of both Gaussians are members of a set of a million possible means and variances). For this setting we show that a simple upper bound on the likelihood can be computed using convex optimization, and we analyze conditions under which the bound is guaranteed to be tight. This bound can then be used to assess the quality of solutions found by EM (where the final result is projected on the discrete set) or any other mixture estimation algorithm. For any dataset, our method allows us to find a finite mixture model together with a dataset-specific bound on how far the likelihood of this mixture is from the global optimum of the likelihood. Comment: icpr 201
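    The unboundedness mentioned in this abstract is easy to reproduce numerically. The following is a minimal sketch, not the paper's method: it assumes a one-dimensional mixture of two Gaussians with equal weights, where one "spike" component is centered on the first datapoint and the other has unit standard deviation and sits at the sample mean. The data values and function names are made up for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, sigma_spike):
    """Log-likelihood of an equal-weight two-component mixture.

    Assumption: the spike component is centered exactly on data[0]; the
    broad component is N(sample mean, 1). Only sigma_spike is varied.
    """
    mu_spike = data[0]
    mu_broad = sum(data) / len(data)
    total = 0.0
    for x in data:
        p = 0.5 * normal_pdf(x, mu_spike, sigma_spike) \
            + 0.5 * normal_pdf(x, mu_broad, 1.0)
        total += math.log(p)
    return total


if __name__ == "__main__":
    data = [-1.2, -0.4, 0.1, 0.7, 1.5]  # illustrative values only
    for sigma in (1.0, 1e-2, 1e-4, 1e-8):
        print(f"sigma_spike={sigma:g}: loglik={mixture_loglik(data, sigma):.3f}")
```

    As `sigma_spike` shrinks, the density at `data[0]` grows like 1/sigma while the broad component keeps every other term finite, so the log-likelihood increases without limit. This is the degeneracy that the discrete parameter set in the abstract rules out.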

    Introduction to finite mixtures

    Mixture models have been around for over 150 years, as an intuitively simple and practical tool for enriching the collection of probability distributions available for modelling data. In this chapter we describe the basic ideas of the subject, present several alternative representations and perspectives on these models, and discuss some of the elements of inference about the unknowns in the models. Our focus is on the simplest set-up, of finite mixture models, but we discuss also how various simplifying assumptions can be relaxed to generate the rich landscape of modelling and inference ideas traversed in the rest of this book. Comment: 14 pages, 7 figures, A chapter prepared for the forthcoming Handbook of Mixture Analysis. V2 corrects a small but important typographical error, and makes other minor edits; V3 makes further minor corrections and updates following review; V4 corrects algorithmic details in sec 4.1 and 4.2, and removes typo