9 research outputs found

    Flexible Mixture Modeling with the Polynomial Gaussian Cluster-Weighted Model

    Within the mixture modeling framework, this paper presents the polynomial Gaussian cluster-weighted model (CWM). It extends the linear Gaussian CWM, for bivariate data, in two ways. First, it allows for possible nonlinear dependencies in the mixture components by considering a polynomial regression. Second, it is not restricted to model-based clustering, since it is set in the more general model-based classification framework. Maximum likelihood parameter estimates are derived using the EM algorithm, and model selection is carried out using the Bayesian information criterion (BIC) and the integrated completed likelihood (ICL). The paper also investigates the conditions under which the posterior probabilities of component membership from a polynomial Gaussian CWM coincide with those of other well-established, related mixture models. Compared with these models, the polynomial Gaussian CWM is shown to give excellent clustering and classification results when applied to the artificial and real data considered in the paper.

    Model-based clustering via linear cluster-weighted models

    A novel family of twelve mixture models with random covariates, nested in the linear t cluster-weighted model (CWM), is introduced for model-based clustering. The linear t CWM was recently presented as a robust alternative to the better-known linear Gaussian CWM. The proposed family of models provides a unified framework that also includes the linear Gaussian CWM as a special case. Maximum likelihood parameter estimation is carried out within the EM framework, and both the BIC and the ICL are used for model selection. A simple and effective hierarchical random initialization is also proposed for the EM algorithm. The novel model-based clustering technique is illustrated in some applications to real data. Finally, a simulation study evaluating the performance of the BIC and the ICL is presented.

    Real Elliptically Skewed Distributions and Their Application to Robust Cluster Analysis

    This article proposes a new class of Real Elliptically Skewed (RESK) distributions and associated clustering algorithms that integrate robustness and skewness into a single unified cluster analysis framework. Non-symmetrically distributed and heavy-tailed data clusters have been reported in a variety of real-world applications. Robustness is essential because a few outlying observations can severely obscure the cluster structure. The RESK distributions are a generalization of the Real Elliptically Symmetric (RES) distributions. To estimate the cluster parameters and memberships, we derive an expectation-maximization (EM) algorithm for arbitrary RESK distributions. Special attention is given to a new robust skew-Huber M-estimator, which is also the maximum likelihood estimator (MLE) for the skew-Huber distribution that belongs to the RESK class. Numerical experiments on simulated and real-world data confirm the usefulness of the proposed methods for skewed and heavy-tailed data sets.

    Statistics in the 150 years from Italian Unification. SIS 2011 Statistical Conference, Bologna, 8 – 10 June 2011. Book of short paper.
