Bayesian estimation and classification with incomplete data using mixture models

Authors: Everson Richard M.; Zhang Jufen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

©2004 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

Reasoning from data in practical problems is frequently hampered by missing observations. Mixture models provide a powerful general semi-parametric method for modelling densities and have close links to radial basis function neural networks (RBFs). We extend the Data Augmentation (DA) technique for multiple imputation to Gaussian mixture models to permit fully Bayesian inference of model parameters and estimation of the missing values. The method is compared to imputation using a single normal density on synthetic and real-world data. In addition to a lower mean squared error than can be achieved by simple imputation methods, mixture models provide valuable information on the potentially multi-modal nature of imputed values. The DA formalism is extended to a classifier closely related to RBF networks, permitting Bayesian classification with incomplete data; the technique is illustrated on synthetic and real datasets.

Crossref

Open Research Exeter
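
The Everson and Zhang abstract above contrasts imputation from a single normal density with imputation from a Gaussian mixture. As a rough illustration only (not the authors' DA sampler), the sketch below computes the conditional distribution of a missing second coordinate given an observed first coordinate, for fixed and entirely hypothetical mixture parameters. All function names and numbers are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def conditional_component(x1, mean, cov):
    """Mean and variance of x2 given x1 for one bivariate Gaussian."""
    (s11, s12), (_, s22) = cov
    cond_mean = mean[1] + s12 / s11 * (x1 - mean[0])
    cond_var = s22 - s12 * s12 / s11
    return cond_mean, cond_var


def conditional_mixture(x1, weights, means, covs):
    """Return [(weight, mean, var)] describing x2 given x1 under a mixture.

    Each component is reweighted by how well it explains the observed x1.
    """
    unnorm = [w * normal_pdf(x1, m[0], c[0][0])
              for w, m, c in zip(weights, means, covs)]
    total = sum(unnorm)
    result = []
    for u, m, c in zip(unnorm, means, covs):
        cond_mean, cond_var = conditional_component(x1, m, c)
        result.append((u / total, cond_mean, cond_var))
    return result


def mixture_mean(components):
    """Overall conditional mean, i.e. a single-value imputation."""
    return sum(w * m for w, m, _ in components)


# Hypothetical two-component mixture: same x1 location, opposite x2 locations.
identity = ((1.0, 0.0), (0.0, 1.0))
components = conditional_mixture(
    0.0, [0.5, 0.5], [(0.0, -3.0), (0.0, 3.0)], [identity, identity]
)
```

With these made-up parameters the conditional distribution has two equally weighted modes near -3 and 3, while the single-value conditional mean is 0, a value the model considers unlikely. This is one way to read the abstract's remark about the multi-modal nature of imputed values.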

Imputation Estimators Partially Correct for Model Misspecification

Authors: Minin Vladimir N.; O'Brien John D.; Seregin Arseni
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 23/04/2010

Inference problems with incomplete observations often aim at estimating population properties of unobserved quantities. One simple way to accomplish this estimation is to impute the unobserved quantities of interest at the individual level and then take an empirical average of the imputed values. We show that this simple imputation estimator can provide partial protection against model misspecification. We illustrate imputation estimators' robustness to model specification on three examples: mixture model-based clustering, estimation of genotype frequencies in population genetics, and estimation of Markovian evolutionary distances. In the final example, using a representative model misspecification, we demonstrate that in non-degenerate cases, the imputation estimator dominates the plug-in estimate asymptotically. We conclude by outlining a Bayesian implementation of the imputation-based estimation.

Comment: major rewrite, beta-binomial example removed, model-based clustering is added to the mixture model example, Bayesian approach is now illustrated with the genetics example

arXiv.org e-Print Archive

CiteSeerX

Crossref
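
The imputation estimator described in the Minin, O'Brien and Seregin abstract above can be sketched for a toy two-component Gaussian mixture. Here the unobserved quantity is each observation's class membership, the imputed value is its posterior membership probability under the fitted model, and the estimator is the average of those probabilities. The fitted parameters and data are hypothetical, chosen only to show how the imputation estimate can differ from the plug-in value; this is not one of the paper's examples.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_first(x, pi1, mean1, mean2, sd):
    """Posterior probability that x belongs to component 1 under the fitted model."""
    a = pi1 * normal_pdf(x, mean1, sd)
    b = (1.0 - pi1) * normal_pdf(x, mean2, sd)
    return a / (a + b)


def imputation_estimate(data, pi1, mean1, mean2, sd):
    """Average of the individually imputed membership probabilities."""
    return sum(posterior_first(x, pi1, mean1, mean2, sd) for x in data) / len(data)


# Hypothetical fitted model with a deliberately wrong mixing proportion of 0.5.
fitted_pi1 = 0.5
data = [-2.1, -1.9, -2.0, 2.0]  # three points near -2, one near +2

plug_in = fitted_pi1
imputed = imputation_estimate(data, fitted_pi1, mean1=-2.0, mean2=2.0, sd=0.5)
```

In this toy case the plug-in estimate simply repeats the misspecified 0.5, while the imputation estimate is pulled towards the proportion actually seen in the data (close to 0.75), because well-separated observations dominate their own posterior probabilities.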

A self-organising mixture network for density modelling

Authors: Allinson N. M.; Yin H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

A completely unsupervised mixture distribution network, namely the self-organising mixture network, is proposed for learning arbitrary density functions. The algorithm minimises the Kullback-Leibler information by means of stochastic approximation methods. The density functions are modelled as mixtures of parametric distributions such as Gaussian and Cauchy. The first layer of the network is similar to Kohonen's self-organising map (SOM), but with the parameters of the class conditional densities as the learning weights. The winning mechanism is based on maximum posterior probability, and the updating of weights can be limited to a small neighbourhood around the winner. The second layer accumulates the responses of these local nodes, weighted by the learned mixing parameters. The network possesses simple structure and computation, yet yields fast and robust convergence. Experimental results are also presented.

University of Lincoln Institutional Repository

Crossref
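
The Yin and Allinson abstract above describes updating component parameters by stochastic approximation, with posterior probabilities deciding which nodes respond. The following is a heavily simplified sketch in that spirit, not the published algorithm: it assumes one-dimensional Gaussian components with a fixed, shared standard deviation and fixed mixing weights, updates only the means, uses a constant learning rate, and updates every node rather than a neighbourhood around the winner. All names and settings are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a univariate normal with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posteriors(x, means, weights, sd):
    """Posterior probability of each component given the sample x."""
    joint = [w * normal_pdf(x, m, sd) for w, m in zip(weights, means)]
    total = sum(joint)
    return [j / total for j in joint]


def train_means(samples, means, weights, sd=1.0, rate=0.05):
    """One pass of posterior-weighted stochastic updates of the component means."""
    means = list(means)
    for x in samples:
        post = posteriors(x, means, weights, sd)
        for i, p in enumerate(post):
            # Each mean moves towards x in proportion to its posterior probability.
            means[i] += rate * p * (x - means[i])
    return means


# Synthetic data from two well-separated clusters centred at -5 and +5.
rng = random.Random(0)
samples = [rng.gauss(rng.choice([-5.0, 5.0]), 1.0) for _ in range(2000)]
fitted = train_means(samples, means=[-1.0, 1.0], weights=[0.5, 0.5])
```

On this synthetic data the two means drift from their starting points towards the two cluster centres. A fuller version would also adapt variances and mixing weights and decay the learning rate, which the sketch leaves out for brevity.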

Uncovering latent structure in valued graphs: A variational approach

Authors: Mariadassou Mahendra; Robin Stéphane; Vacher Corinne
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

As more and more network-structured data sets are available, the statistical analysis of valued graphs has become commonplace. Looking for a latent structure is one of the many strategies used to better understand the behavior of a network. Several methods already exist for the binary case. We present a model-based strategy to uncover groups of nodes in valued graphs. This framework can be used for a wide span of parametric random graph models and allows covariates to be included. Variational tools allow us to achieve approximate maximum likelihood estimation of the parameters of these models. We provide a simulation study showing that our estimation method performs well over a broad range of situations. We apply this method to analyze host-parasite interaction networks in forest ecosystems.

Comment: Published at http://dx.doi.org/10.1214/10-AOAS361 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Probabilistic Methodology and Techniques for Artefact Conception and Development

Author: Bessiere Dr P
Publication date: 01/01/2003

The purpose of this paper is to survey the state of the art in probabilistic methodology and techniques for artefact conception and development. It is the 8th deliverable of the BIBA (Bayesian Inspired Brain and Artefacts) project. We first present the incompleteness problem as the central difficulty that both living creatures and artefacts have to face: how can they perceive, infer, decide and act efficiently with incomplete and uncertain knowledge? We then introduce a generic probabilistic formalism called Bayesian Programming. This formalism is then used to review the main probabilistic methodology and techniques. This review is organized in 3 parts: first, the probabilistic models, from Bayesian networks to Kalman filters and from sensor fusion to CAD systems; second, the inference techniques; and finally, the learning and model acquisition and comparison methodologies. We conclude with the perspectives of the BIBA project as they arise from this state of the art.

Mixtures of Skew-t Factor Analyzers

Authors: Browne Ryan P.; McNicholas Paul D.; Murray Paula M.
Publication venue: 'Elsevier BV'
Publication date: 18/06/2013

In this paper, we introduce a mixture of skew-t factor analyzers as well as a family of mixture models based thereon. The mixture of skew-t distributions model that we use arises as a limiting case of the mixture of generalized hyperbolic distributions. Like their Gaussian and t-distribution analogues, our mixtures of skew-t factor analyzers are very well-suited to the model-based clustering of high-dimensional data. Imposing constraints on components of the decomposed covariance parameter results in the development of eight flexible models. The alternating expectation-conditional maximization algorithm is used for model parameter estimation and the Bayesian information criterion is used for model selection. The models are applied to both real and simulated data, giving superior clustering results compared to a well-established family of Gaussian mixture models.

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector
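
The Browne, McNicholas and Murray abstract above mentions model selection by the Bayesian information criterion. A minimal sketch of that step follows, assuming the common convention BIC = -2 log L + k log n, where smaller is better. Some of the clustering literature uses the opposite sign and maximises, so the convention here is an assumption, and the candidate models and their log-likelihoods are invented for illustration.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC under the 'smaller is better' convention: -2 log L + k log n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_model(candidates, n_obs):
    """Return the candidate with the lowest BIC."""
    return min(candidates, key=lambda c: bic(c["loglik"], c["n_params"], n_obs))


# Hypothetical fitted models: B fits slightly better but uses many more parameters.
candidates = [
    {"name": "A", "loglik": -250.0, "n_params": 5},
    {"name": "B", "loglik": -245.0, "n_params": 12},
]
best = select_model(candidates, n_obs=100)
```

With 100 observations, the extra seven parameters of the hypothetical model B cost more in penalty than its small gain in log-likelihood, so model A is selected.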