
116 research outputs found

A Spectral Algorithm for Learning Hidden Markov Models

Authors: Hsu, Daniel; Kakade, Sham M.; Zhang, Tong
Publication date: 01/01/2012

Hidden Markov Models (HMMs) are one of the most fundamental and widely used statistical tools for modeling discrete time series. In general, learning HMMs from data is computationally hard (under cryptographic assumptions), and practitioners typically resort to search heuristics which suffer from the usual local optima issues. We prove that under a natural separation condition (bounds on the smallest singular value of the HMM parameters), there is an efficient and provably correct algorithm for learning HMMs. The sample complexity of the algorithm does not explicitly depend on the number of distinct (discrete) observations; it implicitly depends on this quantity through spectral properties of the underlying HMM. This makes the algorithm particularly applicable to settings with a large number of observations, such as those in natural language processing where the space of observations is sometimes the words in a language. The algorithm is also simple, employing only a singular value decomposition and matrix multiplications.
Comment: Published in JCSS Special Issue "Learning Theory 2009".
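The abstract describes the algorithm as using only a singular value decomposition and matrix multiplications. A minimal sketch of that idea, in the observable-operator form commonly associated with this paper, might look like the following. The function names (`spectral_hmm`, `sequence_probability`, `exact_moments`, `forward_probability`) are hypothetical, and the demo uses exact moments computed from a small made-up HMM rather than moments estimated from data, so it illustrates the algebra only and says nothing about sample complexity.

```python
import numpy as np


def spectral_hmm(P1, P21, P3x1, m):
    """Build observable-operator parameters from low-order moments.

    P1[i]         ~ Pr[x1 = i]
    P21[i, j]     ~ Pr[x2 = i, x1 = j]
    P3x1[x][i, j] ~ Pr[x3 = i, x2 = x, x1 = j]
    m             = assumed number of hidden states
    """
    # Top-m left singular vectors of the pairwise moment matrix.
    U = np.linalg.svd(P21)[0][:, :m]
    b1 = U.T @ P1
    binf = np.linalg.pinv(P21.T @ U) @ P1
    right = np.linalg.pinv(U.T @ P21)
    # One m-by-m operator per observation symbol.
    B = [U.T @ P3x1[x] @ right for x in range(len(P3x1))]
    return b1, binf, B


def sequence_probability(b1, binf, B, seq):
    """Joint probability of seq as binf^T B[x_t] ... B[x_1] b1."""
    state = b1
    for x in seq:
        state = B[x] @ state
    return float(binf @ state)


def exact_moments(T, O, pi):
    """Exact moments of a known HMM (illustration helper, not part of learning).

    T[i, j] = Pr[h' = i | h = j], O[x, j] = Pr[obs = x | h = j], pi = initial law.
    """
    P1 = O @ pi
    P21 = O @ T @ np.diag(pi) @ O.T
    P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
            for x in range(O.shape[0])]
    return P1, P21, P3x1


def forward_probability(T, O, pi, seq):
    """Reference value from the standard forward recursion."""
    alpha = pi
    for x in seq:
        alpha = T @ (O[x] * alpha)
    return float(alpha.sum())


if __name__ == "__main__":
    # Made-up 2-state, 3-symbol HMM (columns of T and O sum to 1).
    T = np.array([[0.7, 0.2], [0.3, 0.8]])
    O = np.array([[0.6, 0.1], [0.3, 0.3], [0.1, 0.6]])
    pi = np.array([0.4, 0.6])
    b1, binf, B = spectral_hmm(*exact_moments(T, O, pi), m=2)
    seq = [0, 2, 1, 1]
    print(sequence_probability(b1, binf, B, seq))
    print(forward_probability(T, O, pi, seq))
```

With exact moments and full-rank `T` and `O`, the two printed values should agree up to floating-point error. In practice the moments would be replaced by empirical frequencies of observation singles, pairs, and triples, which is where the paper's separation condition and sample-complexity analysis come in.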

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

ScholarlyCommons@Penn

Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes

Authors: Kontorovich, Aryeh; Weiss, Roi
Publication date: 01/01/2013

We observe that the technique of Markov contraction can be used to establish measure concentration for a broad class of non-contracting chains. In particular, geometric ergodicity provides a simple and versatile framework. This leads to a short, elementary proof of a general concentration inequality for Markov and hidden Markov chains (HMM), which supersedes some of the known results and easily extends to other processes such as Markov trees. As applications, we give a Dvoretzky-Kiefer-Wolfowitz-type inequality and a uniform Chernoff bound. All of our bounds are dimension-free and hold for countably infinite state spaces.

arXiv.org e-Print Archive

CiteSeerX

A Method of Moments for Mixture Models and Hidden Markov Models

Authors: Anandkumar, Animashree; Hsu, Daniel; Kakade, Sham M.
Publication date: 01/01/2012

Mixture models are a fundamental tool in applied statistics and machine learning for treating data taken from multiple subpopulations. The current practice for estimating the parameters of such models relies on local search heuristics (e.g., the EM algorithm) which are prone to failure, and existing consistent methods are unfavorable due to their high computational and sample complexity, which typically scale exponentially with the number of mixture components. This work develops an efficient method of moments approach to parameter estimation for a broad class of high-dimensional mixture models with many components, including multi-view mixtures of Gaussians (such as mixtures of axis-aligned Gaussians) and hidden Markov models. The new method leads to rigorous unsupervised learning results for mixture models that were not achieved by previous works; and, because of its simplicity, it offers a viable alternative to EM for practical deployment.

arXiv.org e-Print Archive

CiteSeerX