$k$-MLE: A fast algorithm for learning statistical mixture models

Author: Nielsen Frank
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/03/2012

We describe $k$-MLE, a fast and efficient local search algorithm for learning finite statistical mixtures of exponential families such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique that monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering $k$-MLE algorithm iteratively assigns data to the most likely weighted component and updates the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of $k$-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of $k$-MLE can be implemented using any $k$-means heuristic like the celebrated Lloyd's batched or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion, which amounts to setting each weight to the relative proportion of points in its cluster, and reiterate the mixture parameter update and mixture weight update processes until convergence. Hard EM is interpreted as a special case of $k$-MLE when both the component update and the weight update are performed successively in the inner loop. To initialize $k$-MLE, we propose $k$-MLE++, a careful initialization of $k$-MLE guaranteeing probabilistically a global bound on the best possible complete likelihood.

Comment: 31 pages, Extend preliminary paper presented at IEEE ICASSP 201
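The assign-then-re-estimate loop described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes one-dimensional Gaussian components, uses a Lloyd-style batched update, and follows the hard EM special case in which component and weight updates alternate in the same loop. The function names `log_weighted_density` and `k_mle_1d`, the initial unit variances, and the variance floor `min_var` are all hypothetical choices made for illustration.

```python
import math


def log_weighted_density(x, w, mu, var):
    """Log of w * N(x; mu, var) for a 1D Gaussian component."""
    return math.log(w) - 0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def k_mle_1d(data, init_means, n_iter=50, min_var=1e-6):
    """Hard-clustering sketch: assign each point to its most likely weighted
    component, then refit each component by maximum likelihood."""
    k = len(init_means)
    n = len(data)
    means = list(init_means)
    variances = [1.0] * k      # assumed starting variances
    weights = [1.0 / k] * k    # assumed uniform starting weights
    labels = None
    for _ in range(n_iter):
        # Assignment step: most likely weighted component for each point.
        new_labels = [
            max(range(k), key=lambda j: log_weighted_density(
                x, weights[j], means[j], variances[j]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the parameters are too
        labels = new_labels
        # Update step: per-cluster MLE, then weights as cluster proportions.
        for j in range(k):
            cluster = [x for x, lab in zip(data, labels) if lab == j]
            if cluster:
                means[j] = sum(cluster) / len(cluster)
                var = sum((x - means[j]) ** 2 for x in cluster) / len(cluster)
                variances[j] = max(var, min_var)  # floor avoids zero variance
            # Floor avoids log(0) for an empty cluster (a simplification).
            weights[j] = max(len(cluster) / n, 1e-12)
    return weights, means, variances, labels
```

For example, calling `k_mle_1d([-0.2, 0.1, 0.0, 0.3, -0.1, 9.8, 10.1, 10.0, 10.3, 9.9], [1.0, 8.0])` separates the two groups of points and returns equal weights for the two components. The sketch takes its initial means as given and does not attempt the $k$-MLE++ initialization.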

arXiv.org e-Print Archive

Crossref

Cramer-Rao Lower Bound and Information Geometry

Author: Nielsen Frank
Publication date: 23/01/2013

This article focuses on an important piece of work of the world-renowned Indian statistician Calyampudi Radhakrishna Rao. In 1945, C. R. Rao (25 years old then) published a pathbreaking paper, which had a profound impact on subsequent statistical research.

Comment: To appear in Connected at Infinity II: On the work of Indian mathematicians (R. Bhatia and C.S. Rajan, Eds.), special volume of Texts and Readings In Mathematics (TRIM), Hindustan Book Agency, 201

arXiv.org e-Print Archive

Entropic optimal transport is maximum-likelihood deconvolution

Authors: Rigollet Philippe, Weed Jonathan
Publication date: 01/01/2018

We give a statistical interpretation of entropic optimal transport by showing that performing maximum-likelihood estimation for Gaussian deconvolution corresponds to calculating a projection with respect to the entropic optimal transport distance. This structural result gives theoretical support for the wide adoption of these tools in the machine learning community.

arXiv.org e-Print Archive

DSpace@MIT

Comptes Rendus Mathématique

Numérisation de Documents Anciens Mathématiques

Minimum Rates of Approximate Sufficient Statistics

Authors: Hayashi Masahito, Tan Vincent Y. F.
Publication date: 16/11/2017

Given a sufficient statistic for a parametric family of distributions, one can estimate the parameter without access to the data. However, the memory or code size for storing the sufficient statistic may nonetheless still be prohibitive. Indeed, for $n$ independent samples drawn from a $k$-nomial distribution with $d=k-1$ degrees of freedom, the length of the code scales as $d\log n+O(1)$. In many applications, we may not have a useful notion of sufficient statistics (e.g., when the parametric family is not an exponential family) and we also may not need to reconstruct the generating distribution exactly. By adopting a Shannon-theoretic approach in which we allow a small error in estimating the generating distribution, we construct various approximate sufficient statistics and show that the code length can be reduced to $\frac{d}{2}\log n+O(1)$. We consider errors measured according to the relative entropy and variational distance criteria. For the code constructions, we leverage Rissanen's minimum description length principle, which yields a non-vanishing error measured according to the relative entropy. For the converse parts, we use Clarke and Barron's formula for the relative entropy of a parametrized distribution and the corresponding mixture distribution. However, this method only yields a weak converse for the variational distance. We develop new techniques to achieve vanishing errors and we also prove strong converses. The latter means that even if the code is allowed to have a non-vanishing error, its length must still be at least $\frac{d}{2}\log n$.

Comment: To appear in the IEEE Transactions on Information Theor
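To make the two scalings in this abstract concrete, the following small sketch compares the leading terms $d\log n$ and $\frac{d}{2}\log n$ for given $n$ and $k$. It is only an arithmetic illustration: the $O(1)$ terms are dropped, base-2 logarithms (lengths in bits) are an assumption not stated in the abstract, and the function name `leading_code_lengths` is hypothetical.

```python
import math


def leading_code_lengths(n, k):
    """Leading-order code lengths for n samples from a k-nomial distribution.

    Returns (exact, approximate), where exact = d * log2(n) and
    approximate = (d / 2) * log2(n), with d = k - 1. The O(1) terms are
    ignored and the base-2 logarithm is an assumption.
    """
    d = k - 1
    exact = d * math.log2(n)
    approximate = 0.5 * d * math.log2(n)
    return exact, approximate
```

With `n = 1024` and `k = 3` (so `d = 2`), `leading_code_lengths(1024, 3)` returns `(20.0, 10.0)`: allowing a small estimation error halves the leading term.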

arXiv.org e-Print Archive

Crossref

Uncovering latent structure in valued graphs: A variational approach

Authors: Mariadassou Mahendra, Robin Stéphane, Vacher Corinne
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

As more and more network-structured data sets are available, the statistical analysis of valued graphs has become commonplace. Looking for a latent structure is one of the many strategies used to better understand the behavior of a network. Several methods already exist for the binary case. We present a model-based strategy to uncover groups of nodes in valued graphs. This framework can be used for a wide span of parametric random graph models and allows covariates to be included. Variational tools allow us to achieve approximate maximum likelihood estimation of the parameters of these models. We provide a simulation study showing that our estimation method performs well over a broad range of situations. We apply this method to analyze host-parasite interaction networks in forest ecosystems.

Comment: Published in at http://dx.doi.org/10.1214/10-AOAS361 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive