
    A Method of Moments for Mixture Models and Hidden Markov Models

    Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm), which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity, which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it offers a viable alternative to EM for practical deployment.
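    As a rough illustration of the moment-matching principle behind such estimators (not the paper's spectral multi-view algorithm), the sketch below fits a two-component mixture of unit-variance 1-D Gaussians by equating the first three empirical moments to their model counterparts; the ground-truth weights, means, and solver starting point are illustrative assumptions.

```python
# Minimal method-of-moments sketch (a toy 1-D case, not the paper's algorithm):
# fit w*N(mu1, 1) + (1-w)*N(mu2, 1) by matching the first three moments.
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(0)

# Illustrative ground truth: weight 0.3 on N(-2, 1), weight 0.7 on N(+1, 1).
n = 100_000
z = rng.random(n) < 0.3
x = np.where(z, rng.normal(-2.0, 1.0, n), rng.normal(1.0, 1.0, n))

def moment_equations(theta, m1, m2, m3):
    # Model moments of a two-component unit-variance Gaussian mixture:
    #   E[X]   = w*mu1 + (1-w)*mu2
    #   E[X^2] = w*(mu1^2 + 1) + (1-w)*(mu2^2 + 1)
    #   E[X^3] = w*(mu1^3 + 3*mu1) + (1-w)*(mu2^3 + 3*mu2)
    w, mu1, mu2 = theta
    return [
        w * mu1 + (1 - w) * mu2 - m1,
        w * (mu1**2 + 1) + (1 - w) * (mu2**2 + 1) - m2,
        w * (mu1**3 + 3 * mu1) + (1 - w) * (mu2**3 + 3 * mu2) - m3,
    ]

# Plug empirical moments into the model equations and solve for the parameters.
m1, m2, m3 = [(x**k).mean() for k in (1, 2, 3)]
sol = root(moment_equations, x0=[0.4, -1.0, 0.5], args=(m1, m2, m3))
w_hat, mu1_hat, mu2_hat = sol.x
print(f"weight ~ {w_hat:.2f}, means ~ ({mu1_hat:.2f}, {mu2_hat:.2f})")
```

    In this toy setting a handful of low-order moments pins down the parameters; the point of the paper is that a carefully chosen set of observable (multi-view) moments keeps this tractable even for high-dimensional models with many components.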

    Learning Arbitrary Statistical Mixtures of Discrete Distributions

    We study the problem of learning from unlabeled samples very general statistical mixture models on large finite sets. Specifically, the model to be learned, ϑ, is a probability distribution over probability distributions p, where each such p is a probability distribution over [n] = {1, 2, …, n}. When we sample from ϑ, we do not observe p directly, but only indirectly and in a very noisy fashion, by sampling from [n] repeatedly and independently K times from the distribution p. The problem is to infer ϑ to high accuracy in transportation (earthmover) distance. We give the first efficient algorithms for learning this mixture model without making any restricting assumptions on the structure of the distribution ϑ. We bound the quality of the solution as a function of the size of the samples K and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering. Comment: 23 pages. Preliminary version in the Proceedings of the 47th ACM Symposium on the Theory of Computing (STOC 2015).
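    To make the observation model concrete, the sketch below simulates it for a small illustrative ϑ (a finite mixture over three fixed distributions on [n]); only the sampling process is shown, not the paper's learning algorithm, and all numeric choices are assumptions.

```python
# Simulate the observation model: draw a hidden p ~ vartheta, then observe
# only K i.i.d. draws from p, summarized as a count vector over [n].
# The specific vartheta below (three components on n = 6 outcomes) is illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, K, num_draws = 6, 10, 5

components = np.array([
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],   # one distribution p over [n]
    [0.05, 0.05, 0.10, 0.10, 0.20, 0.50],   # another p
    [1 / 6] * 6,                             # the uniform p
])
mixing_weights = np.array([0.4, 0.4, 0.2])   # vartheta's weights on the components

observations = []
for _ in range(num_draws):
    p = components[rng.choice(len(components), p=mixing_weights)]  # hidden
    observations.append(rng.multinomial(K, p))                     # observed
print(np.array(observations))  # each row: a noisy K-sample view of some hidden p
```

    The learner sees only these count vectors; the task in the paper is to recover ϑ itself, to high accuracy in transportation distance, from many such K-sample observations.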