Flexible Mixture Modeling with the Polynomial Gaussian Cluster-Weighted Model
Within the mixture modeling framework, this paper presents the polynomial
Gaussian cluster-weighted model (CWM). It extends the linear Gaussian CWM, for
bivariate data, in two ways. First, it allows for possible nonlinear
dependencies in the mixture components by considering a polynomial regression.
Second, it is not restricted to model-based clustering, since it is set in the
more general model-based classification framework.
Maximum likelihood parameter estimates are derived using the EM algorithm and
model selection is carried out using the Bayesian information criterion (BIC)
and the integrated completed likelihood (ICL). The paper also investigates the
conditions under which the posterior probabilities of component-membership from
a polynomial Gaussian CWM coincide with those of other well-established
mixture models that are related to it. Compared with these models, the
polynomial Gaussian CWM is shown to give excellent clustering and
classification results when applied to the artificial and real data considered
in the paper.
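As a rough illustration of the model described in this abstract, the sketch below evaluates posterior probabilities of component membership for a bivariate polynomial Gaussian CWM whose parameters are already known. It assumes the usual CWM factorization, in which each component contributes a mixing weight times a Gaussian density of y around a polynomial in x times a Gaussian marginal density of x. The function names and the parameter layout are hypothetical, and the EM estimation step from the paper is not reproduced.

```python
import numpy as np


def normal_pdf(v, mean, sd):
    """Univariate Gaussian density evaluated elementwise."""
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def cwm_posteriors(x, y, weights, betas, sigma_y, mu_x, sigma_x):
    """Posterior membership probabilities under assumed, fixed parameters.

    betas[g] holds the polynomial coefficients of component g,
    ordered from the constant term upward (hypothetical layout).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    columns = []
    for pi_g, beta_g, sy, mx, sx in zip(weights, betas, sigma_y, mu_x, sigma_x):
        # Conditional mean of y given x is a polynomial in x.
        mean_y = np.polynomial.polynomial.polyval(x, beta_g)
        columns.append(pi_g * normal_pdf(y, mean_y, sy) * normal_pdf(x, mx, sx))
    joint = np.column_stack(columns)
    # Normalize each row so the memberships of one observation sum to one.
    return joint / joint.sum(axis=1, keepdims=True)


# Two illustrative components: y = x^2 around x = 0, and y = 10 around x = 5.
post = cwm_posteriors(
    x=[0.0, 5.0],
    y=[0.0, 10.0],
    weights=[0.5, 0.5],
    betas=[[0.0, 0.0, 1.0], [10.0, 0.0, 0.0]],
    sigma_y=[1.0, 1.0],
    mu_x=[0.0, 5.0],
    sigma_x=[1.0, 1.0],
)
```

Each row of `post` sums to one; the first observation is assigned almost entirely to the first component and the second observation to the second.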
Model-based clustering via linear cluster-weighted models
A novel family of twelve mixture models with random covariates, nested in the
linear t cluster-weighted model (CWM), is introduced for model-based
clustering. The linear t CWM was recently presented as a robust alternative
to the better known linear Gaussian CWM. The proposed family of models provides
a unified framework that also includes the linear Gaussian CWM as a special
case. Maximum likelihood parameter estimation is carried out within the EM
framework, and both the BIC and the ICL are used for model selection. A simple
and effective hierarchical random initialization is also proposed for the EM
algorithm. The novel model-based clustering technique is illustrated in some
applications to real data. Finally, a simulation study for evaluating the
performance of the BIC and the ICL is presented.
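The two model selection criteria named in this abstract can be sketched as follows, assuming the larger-is-better convention in which the BIC is twice the maximized log-likelihood minus a penalty, and the ICL subtracts an entropy-style term computed from the hard (MAP) assignments. The function names are hypothetical, and the scaling of the entropy term differs between references, so this is one common form and not necessarily the one used in the paper.

```python
import numpy as np


def bic(loglik, n_params, n_obs):
    """BIC in the larger-is-better form: 2 * loglik - n_params * ln(n_obs)."""
    return 2.0 * loglik - n_params * np.log(n_obs)


def icl(loglik, n_params, posteriors):
    """ICL as BIC plus twice the log posterior of each MAP assignment.

    posteriors is an (n_obs, n_components) array whose rows sum to one.
    The factor 2 on the entropy term is an assumption of this sketch.
    """
    z = np.asarray(posteriors, dtype=float)
    n_obs = z.shape[0]
    # Posterior probability of the most likely component for each observation.
    z_map = z.max(axis=1)
    # Clip to avoid log(0); the term is zero when assignments are certain.
    entropy_term = np.sum(np.log(np.clip(z_map, 1e-300, None)))
    return bic(loglik, n_params, n_obs) + 2.0 * entropy_term


hard = np.array([[1.0, 0.0], [0.0, 1.0]])
soft = np.array([[0.6, 0.4], [0.3, 0.7]])
icl_hard = icl(-10.0, 3, hard)
icl_soft = icl(-10.0, 3, soft)
```

With perfectly separated clusters the ICL equals the BIC, and it falls below the BIC as the posterior memberships become less certain.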
Real Elliptically Skewed Distributions and Their Application to Robust Cluster Analysis
This article proposes a new class of Real Elliptically Skewed (RESK)
distributions and associated clustering algorithms that allow for integrating
robustness and skewness into a single unified cluster analysis framework.
Non-symmetrically distributed and heavy-tailed data clusters have been reported
in a variety of real-world applications. Robustness is essential because a few
outlying observations can severely obscure the cluster structure. The RESK
distributions are a generalization of the Real Elliptically Symmetric (RES)
distributions. To estimate the cluster parameters and memberships, we derive an
expectation maximization (EM) algorithm for arbitrary RESK distributions.
Special attention is given to a new robust skew-Huber M-estimator, which is
also the maximum likelihood estimator (MLE) for the skew-Huber distribution
that belongs to the RESK class. Numerical experiments on simulated and
real-world data confirm the usefulness of the proposed methods for skewed and
heavy-tailed data sets.