Sequential Monte Carlo filtering with Gaussian mixture models for highly nonlinear systems
This dissertation presents two Bayesian approaches for highly nonlinear systems, together with a theoretical study on combining the benefits of the Gaussian sum filter and the particle filter: the posterior particles of a particle filter are drawn from a Gaussian mixture model approximation of the posterior distribution. The first approach converts every particle of a particle filter into a Gaussian mixture component, using either the properties of the Dirac delta function or kernel density estimation. The former treats each particle of the prior distribution as a Gaussian component with a collapsed, zero covariance matrix; the latter estimates the covariance matrix of each Gaussian component by kernel density estimation. The Gaussian sum filter is then used to calculate the posterior distribution. The second approach uses clustering algorithms to recover a Gaussian mixture model representation of the prior probability density function from the propagated particles; the expectation-maximization clustering algorithm and modified fuzzy C-means clustering algorithms are applied here. Numerical simulations show that, under the scenarios considered in this study, the proposed algorithms perform better than existing algorithms such as Gaussian sum filters and particle filters. Aerospace Engineerin
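The kernel-density variant of the first approach can be pictured with a minimal one-dimensional sketch. It is not the dissertation's implementation: the function names `particles_to_mixture` and `mixture_pdf` are hypothetical, and the shared bandwidth comes from the normal-reference rule of thumb (h = 1.06 σ n^(-1/5)), which is an assumption, since the abstract does not say how the covariance is chosen.

```python
import math
import statistics


def particles_to_mixture(particles, weights=None):
    """Turn scalar particles into Gaussian components (weight, mean, variance).

    Assumption: every component shares one variance h**2, with h taken from
    the normal-reference rule of thumb. The source abstract does not specify
    its bandwidth rule.
    """
    n = len(particles)
    if weights is None:
        weights = [1.0 / n] * n  # equally weighted particles
    sigma = statistics.stdev(particles)
    h = 1.06 * sigma * n ** (-0.2)
    return [(w, x, h * h) for w, x in zip(weights, particles)]


def mixture_pdf(mixture, x):
    """Evaluate the Gaussian mixture density at the scalar x."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
        for w, m, v in mixture
    )


# Usage: five particles become a five-component mixture.
mixture = particles_to_mixture([0.0, 0.5, 1.0, 1.5, 2.0])
density_at_one = mixture_pdf(mixture, 1.0)
```

Each particle keeps its own mean, so the mixture has as many components as there are particles. The Dirac-delta variant described in the abstract corresponds to the limit of zero variance and is not covered by this sketch.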
Robust EM algorithm for model-based curve clustering
Model-based clustering approaches concern the paradigm of exploratory data
analysis relying on the finite mixture model to automatically find a latent
structure governing observed data. They are one of the most popular and
successful approaches in cluster analysis. The mixture density estimation is
generally performed by maximizing the observed-data log-likelihood by using the
expectation-maximization (EM) algorithm. However, it is well-known that the EM
algorithm initialization is crucial. In addition, the standard EM algorithm
requires the number of clusters to be known a priori. Some solutions have been
provided in [31, 12] for model-based clustering with Gaussian mixture models
for multivariate data. In this paper we focus on model-based curve clustering
approaches, when the data are curves rather than vectorial data, based on
regression mixtures. We propose a new robust EM algorithm for clustering
curves. We extend the model-based clustering approach presented in [31] for
Gaussian mixture models, to the case of curve clustering by regression
mixtures, including polynomial regression mixtures as well as spline or
B-spline regression mixtures. Our approach handles both the problem of
initialization and that of choosing the optimal number of clusters as the EM
learning proceeds, rather than in a two-fold scheme. This is achieved by
optimizing a penalized log-likelihood criterion. A simulation study confirms
the potential benefit of the proposed algorithm in terms of robustness
regarding initialization and finding the actual number of clusters. Comment: In Proceedings of the 2013 International Joint Conference on Neural
Networks (IJCNN), 2013, Dallas, TX, US
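For orientation, the E-step of standard EM for a polynomial regression mixture can be sketched as below. This is the textbook step only, not the robust penalized variant the paper proposes. All names (`poly_eval`, `curve_log_density`, `e_step`) are hypothetical, and the sketch assumes that every curve is observed at the same time points and that each component has a single noise variance.

```python
import math


def poly_eval(coeffs, t):
    """Evaluate a polynomial whose coefficients are ordered from degree 0 upward."""
    return sum(c * t ** d for d, c in enumerate(coeffs))


def curve_log_density(curve, times, coeffs, variance):
    """Log-density of one curve under a polynomial mean with i.i.d. Gaussian noise."""
    log_norm = -0.5 * math.log(2 * math.pi * variance)
    return sum(
        log_norm - 0.5 * (y - poly_eval(coeffs, t)) ** 2 / variance
        for y, t in zip(curve, times)
    )


def e_step(curves, times, weights, coeff_list, variances):
    """Return, for each curve, its posterior membership probabilities."""
    responsibilities = []
    for curve in curves:
        logs = [
            math.log(w) + curve_log_density(curve, times, c, v)
            for w, c, v in zip(weights, coeff_list, variances)
        ]
        top = max(logs)  # subtract the maximum for numerical stability
        unnorm = [math.exp(value - top) for value in logs]
        total = sum(unnorm)
        responsibilities.append([u / total for u in unnorm])
    return responsibilities


# Usage: two curves, one roughly linear and one roughly constant.
resp = e_step(
    curves=[[0.0, 1.0, 2.1], [5.0, 5.1, 4.9]],
    times=[0.0, 1.0, 2.0],
    weights=[0.5, 0.5],
    coeff_list=[[0.0, 1.0], [5.0, 0.0]],
    variances=[0.25, 0.25],
)
```

The responsibilities are computed per curve rather than per point, which is what distinguishes curve clustering from clustering the individual observations.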
On Convergence of Epanechnikov Mean Shift
Epanechnikov Mean Shift is a simple yet empirically very effective algorithm
for clustering. It localizes the centroids of data clusters via estimating
modes of the probability distribution that generates the data points, using the
'optimal' Epanechnikov kernel density estimator. However, since the procedure
involves non-smooth kernel density functions, the convergence behavior of
Epanechnikov Mean Shift lacks theoretical support as of this writing: most of
the existing analyses are based on smooth functions and thus cannot be applied
to Epanechnikov Mean Shift. In this work, we first show that the original
Epanechnikov Mean Shift may indeed terminate at a non-critical point, due to
this non-smoothness. Based on our analysis, we propose a simple remedy to
fix it. The modified Epanechnikov Mean Shift is guaranteed to terminate at a
local maximum of the estimated density, which corresponds to a cluster
centroid, within a finite number of iterations. We also propose a way to avoid
running the Mean Shift iterates from every data point, while maintaining good
clustering accuracies under non-overlapping spherical Gaussian mixture models.
This further pushes Epanechnikov Mean Shift to handle very large and
high-dimensional data sets. Experiments show surprisingly good performance
compared to Lloyd's K-means algorithm and the EM algorithm. Comment: AAAI 201
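As a rough illustration of the original iteration discussed above: under the Epanechnikov kernel, each mean shift step moves the current estimate to the plain average of the data points within one bandwidth of it. The sketch below implements only that unmodified update, not the remedy proposed in the paper. The function name and the parameters `bandwidth` and `max_iter` are assumptions made for illustration.

```python
import math


def epanechnikov_mean_shift(points, start, bandwidth, max_iter=100):
    """Run the unmodified mean shift iteration from `start`.

    With the Epanechnikov kernel, the update is the unweighted mean of all
    points whose distance to the current estimate is at most `bandwidth`.
    """
    x = list(start)
    for _ in range(max_iter):
        window = [p for p in points if math.dist(p, x) <= bandwidth]
        if not window:
            break  # no neighbours within the bandwidth, nothing to average
        new_x = [sum(coords) / len(window) for coords in zip(*window)]
        if new_x == x:
            break  # the window no longer changes, so the iterate is fixed
        x = new_x
    return x


# Usage: two well-separated groups; start inside the first one.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0)]
mode = epanechnikov_mean_shift(data, start=(0.2, 0.0), bandwidth=1.0)
```

The exact equality test is deliberate: once the set of points inside the window stops changing, the same average is recomputed and the iterate is fixed after finitely many steps. As the abstract notes, such a stopping point is not guaranteed to be a local maximum of the estimated density.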
On clustering procedures and nonparametric mixture estimation
This paper deals with nonparametric estimation of conditional densities in
mixture models in the case when additional covariates are available. The
proposed approach consists of performing a preliminary clustering algorithm on
the additional covariates to guess the mixture component of each observation.
Conditional densities of the mixture model are then estimated using kernel
density estimates applied separately to each cluster. We investigate the
expected L1-error of the resulting estimates and derive optimal rates of
convergence over classical nonparametric density classes provided the
clustering method is accurate. Performances of clustering algorithms are
measured by the maximal misclassification error. We obtain upper bounds of this
quantity for a single linkage hierarchical clustering algorithm. Lastly,
applications of the proposed method to mixture models involving electricity
distribution data and simulated data are presented.
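The two-stage procedure described above (cluster the covariates, then estimate a density per cluster) can be sketched in one dimension as follows. This is a minimal illustration rather than the paper's estimator. The function names are hypothetical, and the Gaussian kernel, the fixed `bandwidth`, and the cut at a distance `threshold` are all assumptions.

```python
import math


def single_linkage_labels(covariates, threshold):
    """Cut single linkage clustering of scalar covariates at `threshold`.

    In one dimension this is equivalent to splitting the sorted values
    wherever two neighbours are more than `threshold` apart.
    """
    order = sorted(range(len(covariates)), key=lambda i: covariates[i])
    labels = [0] * len(covariates)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if covariates[nxt] - covariates[prev] > threshold:
            current += 1  # a large gap starts a new cluster
        labels[nxt] = current
    return labels


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate built from `sample`."""
    def density(y):
        total = sum(math.exp(-0.5 * ((y - s) / bandwidth) ** 2) for s in sample)
        return total / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    return density


def cluster_conditional_densities(covariates, responses, threshold, bandwidth):
    """Estimate one response density per covariate cluster."""
    labels = single_linkage_labels(covariates, threshold)
    groups = {}
    for label, y in zip(labels, responses):
        groups.setdefault(label, []).append(y)
    return {label: gaussian_kde(ys, bandwidth) for label, ys in groups.items()}


# Usage: covariates form two groups, and so do the responses.
densities = cluster_conditional_densities(
    covariates=[0.1, 0.2, 0.15, 3.0, 3.1],
    responses=[1.0, 1.2, 0.9, 4.0, 4.1],
    threshold=1.0,
    bandwidth=0.3,
)
```

The quality of each per-cluster estimate depends on how accurately the clustering step recovers the true components, which is why the abstract measures clustering performance by the maximal misclassification error.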