    Supervised Classification Using Copula and Mixture Copula

    Statistical classification is a field of study that has developed significantly since the 1960s and has a vast range of applications. For example, pattern recognition has been proposed for automatic character recognition, medical diagnosis and, most recently, data mining. The classical discrimination rule assumes normality, but in many situations this assumption is questionable. In fact, for some data the pattern vector is a mixture of discrete and continuous random variables. In this dissertation, we use copula densities to model class-conditional distributions. Such densities are useful when the marginal densities of a pattern vector are not normally distributed, and they are also useful for mixed discrete and continuous feature types. Finite mixture density models are very flexible for building classifiers, for clustering, and for uncovering hidden structures in the data. We use finite mixtures of Gaussian copula densities and of Archimedean-family copula densities to build classifiers. The complexities of the estimation are presented. Under such mixture models, maximum likelihood estimation methods are not suitable, and the regular expectation-maximization algorithm may not converge or, if it does, may converge inefficiently. We propose a new estimation method to evaluate such densities and build the classifier based on a finite mixture of copula densities. We develop simulation scenarios to compare the performance of the copula-based classifier with classical normal-distribution-based models, the logistic-regression-based model and the independence model. We also apply the techniques to real data and present the misclassification errors.
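To make the idea in the abstract above concrete, here is a minimal sketch of a classifier whose class-conditional densities are built from a copula and non-normal marginals. It is not the dissertation's method: it uses a single bivariate Gaussian copula per class (no mixture), exponential marginals, and hand-picked parameters instead of estimated ones. The names `CLASSES`, `class_density` and `classify`, and all parameter values, are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_EPS = 1e-12  # keeps CDF values strictly inside (0, 1) for inv_cdf


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho) for u, v in (0, 1)."""
    x = _STD_NORMAL.inv_cdf(min(max(u, _EPS), 1.0 - _EPS))
    y = _STD_NORMAL.inv_cdf(min(max(v, _EPS), 1.0 - _EPS))
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return math.exp(exponent) / math.sqrt(one_minus_r2)


def exp_pdf(x, rate):
    """Exponential marginal density (assumed marginal, for illustration)."""
    return rate * math.exp(-rate * x)


def exp_cdf(x, rate):
    """Exponential marginal distribution function."""
    return 1.0 - math.exp(-rate * x)


def class_density(x1, x2, rates, rho):
    """Joint density = copula density at the marginal CDFs times the marginal densities."""
    r1, r2 = rates
    u, v = exp_cdf(x1, r1), exp_cdf(x2, r2)
    return gaussian_copula_density(u, v, rho) * exp_pdf(x1, r1) * exp_pdf(x2, r2)


# Hypothetical class parameters; in practice these would be estimated from data.
CLASSES = {
    "A": {"prior": 0.5, "rates": (1.0, 1.0), "rho": 0.6},
    "B": {"prior": 0.5, "rates": (0.25, 0.25), "rho": -0.3},
}


def classify(x1, x2, classes=CLASSES):
    """Bayes rule: pick the class with the largest prior times class-conditional density."""
    scores = {
        label: p["prior"] * class_density(x1, x2, p["rates"], p["rho"])
        for label, p in classes.items()
    }
    return max(scores, key=scores.get)
```

With these assumed parameters, small positive feature values favour class "A" and large ones favour class "B". Swapping in another marginal family or an Archimedean copula would only change `class_density`.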

    Mixture of Bilateral-Projection Two-dimensional Probabilistic Principal Component Analysis

    Probabilistic principal component analysis (PPCA) is built upon a global linear mapping, which is insufficient for modeling complex data variation. This paper proposes a mixture of bilateral-projection probabilistic principal component analysis model (mixB2DPPCA) on 2D data. With multiple components in the mixture, this model can be seen as a soft clustering algorithm and is capable of modeling data with complex structures. A Bayesian inference scheme based on the variational EM (Expectation-Maximization) approach is proposed for learning the model parameters. Experiments on some publicly available databases show that mixB2DPPCA performs substantially better than existing PCA-based algorithms, with lower reconstruction errors and higher recognition rates.

    EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis

    Data clustering has received a lot of attention, and numerous methods, algorithms and software packages are available. Among these techniques, parametric finite-mixture models play a central role due to their interesting mathematical properties and to the existence of maximum-likelihood estimators based on expectation-maximization (EM). In this paper we propose a new mixture model that associates a weight with each observed point. We introduce the weighted-data Gaussian mixture and derive two EM algorithms. The first one considers a fixed weight for each observation. The second one treats each weight as a random variable following a gamma distribution. We propose a model selection method based on a minimum message length criterion, provide a weight initialization strategy, and validate the proposed algorithms by comparing them with several state-of-the-art parametric and non-parametric clustering techniques. We also demonstrate the effectiveness and robustness of the proposed clustering technique in the presence of heterogeneous data, namely audio-visual scene analysis. Comment: 14 pages, 4 figures, 4 tables.
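The fixed-weight variant mentioned in the abstract above can be sketched for one-dimensional data as follows. The abstract does not say how the weight enters the model; this sketch assumes that a weight w scales the precision of the observation, so that a point is modeled with variance sigma^2 / w. The update formulas follow from that assumption and are not taken from the paper. The gamma-distributed weights, the minimum message length criterion and the weight initialization strategy are not covered, and the function name `weighted_em` is hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Univariate normal density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def weighted_em(xs, ws, means, variances, priors, n_iter=50):
    """EM for a 1D Gaussian mixture with a fixed weight per observation.

    Assumption (labeled, not from the source): observation i has variance
    variances[k] / ws[i] under component k.
    """
    means, variances, priors = list(means), list(variances), list(priors)
    n, k = len(xs), len(means)
    for _ in range(n_iter):
        # E-step: responsibilities under the weight-scaled variances.
        resp = []
        for x, w in zip(xs, ws):
            row = [priors[j] * gaussian_pdf(x, means[j], variances[j] / w) for j in range(k)]
            total = sum(row)
            resp.append([r / total for r in row])
        # M-step: closed-form updates implied by the assumed model.
        for j in range(k):
            r_sum = sum(resp[i][j] for i in range(n))
            wr_sum = sum(ws[i] * resp[i][j] for i in range(n))
            means[j] = sum(ws[i] * resp[i][j] * xs[i] for i in range(n)) / wr_sum
            sq = sum(ws[i] * resp[i][j] * (xs[i] - means[j]) ** 2 for i in range(n))
            variances[j] = max(sq / r_sum, 1e-6)  # floor avoids a degenerate component
            priors[j] = r_sum / n
    return means, variances, priors


# Illustrative data: two tight clusters plus one low-weight point in between.
xs = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1, 2.5]
ws = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.01]
means, variances, priors = weighted_em(xs, ws, [0.5, 4.5], [1.0, 1.0], [0.5, 0.5])
```

In this toy run the low-weight point at 2.5 barely moves the estimated means, which is the intended effect of down-weighting an unreliable observation.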