788 research outputs found
A Spectral Algorithm for Learning Hidden Markov Models
Hidden Markov Models (HMMs) are one of the most fundamental and widely used
statistical tools for modeling discrete time series. In general, learning HMMs
from data is computationally hard (under cryptographic assumptions), and
practitioners typically resort to search heuristics which suffer from the usual
local optima issues. We prove that under a natural separation condition (bounds
on the smallest singular value of the HMM parameters), there is an efficient
and provably correct algorithm for learning HMMs. The sample complexity of the
algorithm does not explicitly depend on the number of distinct (discrete)
observations---it implicitly depends on this quantity through spectral
properties of the underlying HMM. This makes the algorithm particularly
applicable to settings with a large number of observations, such as those in
natural language processing where the space of observation is sometimes the
words in a language. The algorithm is also simple, employing only a singular
value decomposition and matrix multiplications.Comment: Published in JCSS Special Issue "Learning Theory 2009
Identifiability and Unmixing of Latent Parse Trees
This paper explores unsupervised learning of parsing models along two
directions. First, which models are identifiable from infinite data? We use a
general technique for numerically checking identifiability based on the rank of
a Jacobian matrix, and apply it to several standard constituency and dependency
parsing models. Second, for identifiable models, how do we estimate the
parameters efficiently? EM suffers from local optima, while recent work using
spectral methods cannot be directly applied since the topology of the parse
tree varies across sentences. We develop a strategy, unmixing, which deals with
this additional complexity for restricted classes of parsing models
A Method of Moments for Mixture Models and Hidden Markov Models
Mixture models are a fundamental tool in applied statistics and machine
learning for treating data taken from multiple subpopulations. The current
practice for estimating the parameters of such models relies on local search
heuristics (e.g., the EM algorithm) which are prone to failure, and existing
consistent methods are unfavorable due to their high computational and sample
complexity which typically scale exponentially with the number of mixture
components. This work develops an efficient method of moments approach to
parameter estimation for a broad class of high-dimensional mixture models with
many components, including multi-view mixtures of Gaussians (such as mixtures
of axis-aligned Gaussians) and hidden Markov models. The new method leads to
rigorous unsupervised learning results for mixture models that were not
achieved by previous works; and, because of its simplicity, it offers a viable
alternative to EM for practical deployment
- …