17,260 research outputs found
Supervised Classification Using Copula and Mixture Copula
Statistical classification is a field of study that has developed significantly after 1960\u27s. This research has a vast area of applications. For example, pattern recognition has been proposed for automatic character recognition, medical diagnostic and most recently in data mining. Classical discrimination rule assumes normality. However in many situations, this assumption is often questionable. In fact for some data, the pattern vector is a mixture of discrete and continuous random variables. In this dissertation, we use copula densities to model class conditional distributions. Such types of densities are useful when the marginal densities of a pattern vector are not normally distributed. This type of models are also useful for a mixed discrete and continuous feature types. Finite mixture density models are very flexible in building classifier and clustering, and for uncovering hidden structures in the data. We use finite mixture Gaussian copula and copula of the Archimedean family based mixture densities to build classifier. The complexities of the estimation are presented. Under such mixture models, maximum likelihood estimation methods are not suitable and regular expectation maximization algorithm may not converge, and if it does, not efficiently. We propose a new estimation method to evaluate such densities and build the classifier based on finite mixture of copula densities. We develop simulations scenarios to compare the performance of the copula based classifier with classical normal distribution based models, the logistic regression based model and the Independent model. We also apply the techniques to real data, and present the misclassification errors
Mixture of Bilateral-Projection Two-dimensional Probabilistic Principal Component Analysis
The probabilistic principal component analysis (PPCA) is built upon a global
linear mapping, with which it is insufficient to model complex data variation.
This paper proposes a mixture of bilateral-projection probabilistic principal
component analysis model (mixB2DPPCA) on 2D data. With multi-components in the
mixture, this model can be seen as a soft cluster algorithm and has capability
of modeling data with complex structures. A Bayesian inference scheme has been
proposed based on the variational EM (Expectation-Maximization) approach for
learning model parameters. Experiments on some publicly available databases
show that the performance of mixB2DPPCA has been largely improved, resulting in
more accurate reconstruction errors and recognition rates than the existing
PCA-based algorithms
EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis
Data clustering has received a lot of attention and numerous methods,
algorithms and software packages are available. Among these techniques,
parametric finite-mixture models play a central role due to their interesting
mathematical properties and to the existence of maximum-likelihood estimators
based on expectation-maximization (EM). In this paper we propose a new mixture
model that associates a weight with each observed point. We introduce the
weighted-data Gaussian mixture and we derive two EM algorithms. The first one
considers a fixed weight for each observation. The second one treats each
weight as a random variable following a gamma distribution. We propose a model
selection method based on a minimum message length criterion, provide a weight
initialization strategy, and validate the proposed algorithms by comparing them
with several state of the art parametric and non-parametric clustering
techniques. We also demonstrate the effectiveness and robustness of the
proposed clustering technique in the presence of heterogeneous data, namely
audio-visual scene analysis.Comment: 14 pages, 4 figures, 4 table
- …