A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data
This paper tackles the problem of missing data imputation for noisy and
non-Gaussian data. A classical imputation method, the Expectation Maximization
(EM) algorithm for Gaussian mixture models, has shown interesting properties
when compared to other popular approaches such as those based on k-nearest
neighbors or on multiple imputations by chained equations. However, Gaussian
mixture models are known to be non-robust to heterogeneous data, which can lead
to poor estimation performance when the data is contaminated by outliers or
follows non-Gaussian distributions. To overcome this issue, a new EM algorithm
is investigated for mixtures of elliptical distributions that can handle
missing data. This paper shows that this problem reduces to
the estimation of a mixture of Angular Gaussian distributions under generic
assumptions (i.e., each sample is drawn from a mixture of elliptical
distributions, which may differ from one sample to another). In that
case, the complete-data likelihood associated with mixtures of elliptical
distributions is well adapted to the EM framework with missing data thanks to
its conditional distribution, which is shown to be a multivariate
t-distribution. Experimental results on synthetic data demonstrate that the
proposed algorithm is robust to outliers and can be used with non-Gaussian
data. Furthermore, experiments conducted on real-world datasets show that this
algorithm is very competitive when compared to other classical imputation
methods.
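To make the imputation step concrete, here is a minimal Python sketch of conditional-mean imputation under a single Gaussian component, which is the building block of the classical Gaussian-mixture EM baseline the abstract mentions. It is not the paper's elliptical algorithm: the function name `conditional_mean_impute`, the toy parameters, and the use of NaN to mark missing entries are all assumptions made for illustration.

```python
import numpy as np

def conditional_mean_impute(x, mu, sigma):
    """Fill NaN entries of x with their conditional mean under N(mu, sigma).

    Hypothetical helper: uses the standard Gaussian conditioning formula
    E[x_m | x_o] = mu_m + S_mo S_oo^{-1} (x_o - mu_o).
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    if not miss.any():
        return x.copy()
    if not obs.any():
        # Nothing observed: the conditional mean is the marginal mean.
        return mu.copy()
    s_oo = sigma[np.ix_(obs, obs)]    # covariance of observed coordinates
    s_mo = sigma[np.ix_(miss, obs)]   # cross-covariance missing/observed
    filled = x.copy()
    filled[miss] = mu[miss] + s_mo @ np.linalg.solve(s_oo, x[obs] - mu[obs])
    return filled

# Toy example (assumed values): second coordinate is missing.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
x_filled = conditional_mean_impute(np.array([2.0, np.nan]), mu, sigma)
```

In a full mixture model, one would presumably average such conditional means over components, weighted by each component's posterior probability given the observed coordinates.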
EMMIXcskew: an R Package for the Fitting of a Mixture of Canonical Fundamental Skew t-Distributions
This paper presents an R package EMMIXcskew for the fitting of the canonical
fundamental skew t-distribution (CFUST) and finite mixtures of this
distribution (FM-CFUST) via maximum likelihood (ML). The CFUST distribution
provides a flexible family of models to handle non-normal data, with parameters
for capturing skewness and heavy-tails in the data. It formally encompasses the
normal, t, and skew-normal distributions as special and/or limiting cases. A
few other versions of the skew t-distributions are also nested within the CFUST
distribution. In this paper, an Expectation-Maximization (EM) algorithm is
described for computing the ML estimates of the parameters of the FM-CFUST
model, and different strategies for initializing the algorithm are discussed
and illustrated. The methodology is implemented in the EMMIXcskew package, and
examples are presented using two real datasets. The EMMIXcskew package contains
functions to fit the FM-CFUST model, including procedures for generating
different initial values. Additional features include random sample generation
and contour visualization in 2D and 3D.
Mixtures of Shifted Asymmetric Laplace Distributions
A mixture of shifted asymmetric Laplace distributions is introduced and used
for clustering and classification. A variant of the EM algorithm is developed
for parameter estimation by exploiting the relationship with the generalized
inverse Gaussian distribution. This approach is mathematically elegant and
relatively computationally straightforward. Our novel mixture modelling
approach is demonstrated on both simulated and real data to illustrate
clustering and classification applications. In these analyses, our mixture of
shifted asymmetric Laplace distributions performs favourably when compared to
the popular Gaussian approach. This work, which marks an important step in the
non-Gaussian model-based clustering and classification direction, concludes
with discussion as well as suggestions for future work.
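The link to the inverse Gaussian family comes from a standard stochastic representation: a shifted asymmetric Laplace vector can be written as X = mu + W*alpha + sqrt(W)*V, with W drawn from a unit-rate exponential and V from a zero-mean Gaussian with covariance Sigma. The Python sketch below samples from that representation. The function name `sample_sal` and the parameter values are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def sample_sal(n, mu, alpha, sigma, rng):
    """Draw n samples from a shifted asymmetric Laplace distribution.

    Hypothetical helper based on the normal variance-mean mixture
    X = mu + W * alpha + sqrt(W) * V, with W ~ Exp(1), V ~ N(0, sigma).
    """
    mu = np.asarray(mu, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    w = rng.exponential(scale=1.0, size=n)                    # latent scale
    v = rng.multivariate_normal(np.zeros(mu.size), sigma, size=n)
    return mu + w[:, None] * alpha + np.sqrt(w)[:, None] * v

# Toy parameters (assumed): shift mu, skewness alpha, identity scale matrix.
rng = np.random.default_rng(0)
samples = sample_sal(20000, mu=[1.0, 0.0], alpha=[2.0, -1.0],
                     sigma=np.eye(2), rng=rng)
```

Because E[W] = 1, the sample mean should sit near mu + alpha. In an EM scheme, W would be treated as latent, and its conditional distribution given X belongs to the generalized inverse Gaussian family.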