A Method of Moments for Mixture Models and Hidden Markov Models
Mixture models are a fundamental tool in applied statistics and machine
learning for treating data taken from multiple subpopulations. The current
practice for estimating the parameters of such models relies on local search
heuristics (e.g., the EM algorithm) which are prone to failure, and existing
consistent methods are unfavorable due to their high computational and sample
complexity which typically scale exponentially with the number of mixture
components. This work develops an efficient method of moments approach to
parameter estimation for a broad class of high-dimensional mixture models with
many components, including multi-view mixtures of Gaussians (such as mixtures
of axis-aligned Gaussians) and hidden Markov models. The new method leads to
rigorous unsupervised learning results for mixture models that were not
achieved by previous works; and, because of its simplicity, it offers a viable
alternative to EM for practical deployment.
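As a hedged illustration of why low-order moments can carry the parameters of a multi-view mixture, assume (the notation here is ours, not the abstract's) a hidden component $h \in \{1, \dots, k\}$ with mixing weights $w_j = \Pr[h = j]$, and three views $x_1, x_2, x_3$ that are conditionally independent given $h$, with conditional means $\mu_{v,j} = \mathbb{E}[x_v \mid h = j]$. Under these assumptions the observable cross moments factor as follows:

```latex
\begin{align*}
\mathbb{E}[x_1 \otimes x_2]
  &= \sum_{j=1}^{k} w_j \, \mu_{1,j} \otimes \mu_{2,j}, \\
\mathbb{E}[x_1 \otimes x_2 \otimes x_3]
  &= \sum_{j=1}^{k} w_j \, \mu_{1,j} \otimes \mu_{2,j} \otimes \mu_{3,j}.
\end{align*}
```

Both moments have rank at most $k$ and can be estimated directly from samples, which is the kind of structure a method of moments can exploit without local search. This is only a sketch of the general idea, not the paper's full procedure.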
Tensor decompositions for learning latent variable models
This work considers a computationally and statistically efficient parameter
estimation method for a wide class of latent variable models---including
Gaussian mixture models, hidden Markov models, and latent Dirichlet
allocation---which exploits a certain tensor structure in their low-order
observable moments (typically, of second- and third-order). Specifically,
parameter estimation is reduced to the problem of extracting a certain
(orthogonal) decomposition of a symmetric tensor derived from the moments; this
decomposition can be viewed as a natural generalization of the singular value
decomposition for matrices. Although tensor decompositions are generally
intractable to compute, the decomposition of these specially structured tensors
can be efficiently obtained by a variety of approaches, including power
iterations and maximization approaches (similar to the case of matrices). A
detailed analysis of a robust tensor power method is provided, establishing an
analogue of Wedin's perturbation theorem for the singular vectors of matrices.
This implies a robust and computationally tractable estimation approach for
several popular latent variable models.
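The power-iteration idea mentioned in this abstract can be sketched in a few lines. The following is a minimal, non-robust illustration in plain Python, assuming an exactly orthogonally decomposable symmetric third-order tensor with positive weights; it omits the multiple restarts and perturbation handling that the paper's robust variant analyzes. All names (`build_tensor`, `power_iteration`, `decompose`, `VECTORS`) and the demo data are hypothetical.

```python
import math
import random

# Hypothetical demo data: three orthonormal vectors in R^3.
S = 1.0 / math.sqrt(2.0)
VECTORS = [[S, S, 0.0], [S, -S, 0.0], [0.0, 0.0, 1.0]]


def build_tensor(weights, vectors):
    """Return T = sum_i w_i * v_i (x) v_i (x) v_i as a nested list."""
    n = len(vectors[0])
    return [[[sum(w * v[i] * v[j] * v[k] for w, v in zip(weights, vectors))
              for k in range(n)] for j in range(n)] for i in range(n)]


def tensor_apply(T, u):
    """Return the vector T(I, u, u): sum over j, k of T[i][j][k] * u[j] * u[k]."""
    n = len(u)
    return [sum(T[i][j][k] * u[j] * u[k] for j in range(n) for k in range(n))
            for i in range(n)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(T, n_iter=100, seed=0):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)|| from a random unit vector."""
    rng = random.Random(seed)
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(len(T))])
    for _ in range(n_iter):
        u = normalize(tensor_apply(T, u))
    # The eigenvalue estimate is T(u, u, u).
    lam = sum(a * b for a, b in zip(tensor_apply(T, u), u))
    return lam, u


def deflate(T, lam, u):
    """Subtract the recovered rank-one term lam * u (x) u (x) u from T."""
    n = len(u)
    return [[[T[i][j][k] - lam * u[i] * u[j] * u[k]
              for k in range(n)] for j in range(n)] for i in range(n)]


def decompose(T, rank):
    """Recover `rank` (eigenvalue, eigenvector) pairs by power iteration and deflation."""
    pairs = []
    for r in range(rank):
        lam, u = power_iteration(T, seed=r)
        pairs.append((lam, u))
        T = deflate(T, lam, u)
    return pairs


pairs = decompose(build_tensor([3.0, 2.0, 1.0], VECTORS), rank=3)
for lam, u in pairs:
    print(round(lam, 6), [round(x, 6) for x in u])
```

The order in which components are recovered depends on the random starting vector, so only the set of recovered pairs is meaningful. With noisy empirical moments, a single start per component as used here would generally not be adequate.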
Geometric Learning of Hidden Markov Models via a Method of Moments Algorithm
We present a novel algorithm for learning the parameters of hidden Markov
models (HMMs) in a geometric setting where the observations take values in
Riemannian manifolds. In particular, we elevate a recent second-order method of
moments algorithm that incorporates non-consecutive correlations to a more
general setting where observations take place in a Riemannian symmetric space
of non-positive curvature and the observation likelihoods are Riemannian
Gaussians. The resulting algorithm decouples into a Riemannian Gaussian mixture
model estimation algorithm followed by a sequence of convex optimization
procedures. We demonstrate through examples that the learner can result in
significantly improved speed and numerical accuracy compared to existing
learners.
Introduction to finite mixtures
Mixture models have been around for over 150 years, as an intuitively simple
and practical tool for enriching the collection of probability distributions
available for modelling data. In this chapter we describe the basic ideas of
the subject, present several alternative representations and perspectives on
these models, and discuss some of the elements of inference about the unknowns
in the models. Our focus is on the simplest set-up, of finite mixture models,
but we discuss also how various simplifying assumptions can be relaxed to
generate the rich landscape of modelling and inference ideas traversed in the
rest of this book.
Comment: 14 pages, 7 figures, A chapter prepared for the forthcoming Handbook
of Mixture Analysis. V2 corrects a small but important typographical error,
and makes other minor edits; V3 makes further minor corrections and updates
following review; V4 corrects algorithmic details in sec 4.1 and 4.2, and
removes typo
A Spectral Algorithm for Latent Dirichlet Allocation
The problem of topic modeling can be seen as a generalization of the
clustering problem, in that it posits that observations are generated due to
multiple latent factors (e.g., the words in each document are generated as a
mixture of several active topics, as opposed to just one). This increased
representational power comes at the cost of a more challenging unsupervised
learning problem of estimating the topic probability vectors (the distributions
over words for each topic), when only the words are observed and the
corresponding topics are hidden.
We provide a simple and efficient learning procedure that is guaranteed to
recover the parameters for a wide class of mixture models, including the
popular latent Dirichlet allocation (LDA) model. For LDA, the procedure
correctly recovers both the topic probability vectors and the prior over the
topics, using only trigram statistics (i.e., third order moments, which may be
estimated with documents containing just three words). The method, termed
Excess Correlation Analysis (ECA), is based on a spectral decomposition of low
order moments (third and fourth order) via two singular value decompositions
(SVDs). Moreover, the algorithm is scalable since the SVD operations are
carried out on $k \times k$ matrices, where $k$ is the number of latent factors
(e.g. the number of topics), rather than in the $d$-dimensional observed space
(typically $d \gg k$).
Comment: Changed title to match conference version, which appears in Advances
in Neural Information Processing Systems 25, 201
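To make the remark about trigram statistics concrete, here is a minimal sketch, assuming documents are given as triples of integer word ids, of how an empirical third-order moment could be tabulated from three-word documents. It only estimates the raw moment; it is not the ECA procedure itself, and the function name and toy corpus are hypothetical.

```python
from collections import Counter


def empirical_third_moment(docs):
    """Estimate P(w1, w2, w3) from documents of exactly three word ids each.

    With one-hot word encodings x1, x2, x3, this is a sparse representation
    of the empirical moment E[x1 (x) x2 (x) x3].
    """
    counts = Counter()
    for doc in docs:
        if len(doc) != 3:
            raise ValueError("each document must contain exactly three words")
        counts[tuple(doc)] += 1
    total = sum(counts.values())
    return {triple: c / total for triple, c in counts.items()}


# Hypothetical toy corpus over a vocabulary of three word ids.
docs = [(0, 1, 2), (0, 1, 2), (2, 2, 1), (1, 0, 0)]
moment = empirical_third_moment(docs)
print(moment)
```

Storing only the observed triples keeps memory proportional to the number of distinct trigrams rather than to the cube of the vocabulary size.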
Bayesian Portfolio Selection in a Markov Switching Gaussian Mixture Model
Departure from normality poses implementation barriers to Markowitz mean-variance portfolio selection. When assets are affected by common and idiosyncratic shocks, the distribution of asset returns may exhibit Markov switching regimes and have a Gaussian mixture distribution conditional on each regime. The model is estimated in a Bayesian framework using the Gibbs sampler. An application to global portfolio diversification is also discussed.
Keywords: Portfolio; Bayesian; Hidden Markov Model; Gaussian Mixture
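The generative structure described in this abstract can be sketched as a short simulation. The code below assumes a single asset, two regimes, and two Gaussian mixture components per regime; every parameter value is hypothetical and chosen only for illustration. It simulates data from such a model and does not implement the Gibbs sampler used for estimation.

```python
import random


def simulate_returns(n_steps, transition, mixtures, seed=0):
    """Simulate scalar returns from a Markov switching Gaussian mixture.

    transition[r] holds the probabilities of moving from regime r to each regime.
    mixtures[r] is a list of (weight, mean, std) components for regime r.
    """
    rng = random.Random(seed)
    regime = 0  # assumed initial regime
    regimes, returns = [], []
    for _ in range(n_steps):
        components = mixtures[regime]
        weights = [w for w, _, _ in components]
        # Pick a mixture component, then draw the return from that Gaussian.
        _, mean, std = rng.choices(components, weights=weights, k=1)[0]
        regimes.append(regime)
        returns.append(rng.gauss(mean, std))
        # Move to the next regime according to the Markov chain.
        regime = rng.choices(list(range(len(transition))),
                             weights=transition[regime], k=1)[0]
    return regimes, returns


# Hypothetical parameters: a calm regime (0) and a turbulent regime (1).
P = [[0.95, 0.05],
     [0.10, 0.90]]
MIX = [
    [(0.8, 0.001, 0.01), (0.2, 0.000, 0.02)],
    [(0.5, -0.002, 0.03), (0.5, 0.001, 0.05)],
]

regimes, returns = simulate_returns(500, P, MIX, seed=1)
print(len(returns), sum(regimes))
```

Simulated series of this kind are a common way to check an estimation routine before applying it to real return data.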
Spectral Sequence Motif Discovery
Sequence discovery tools play a central role in several fields of
computational biology. In the framework of Transcription Factor binding
studies, motif finding algorithms of increasingly high performance are required
to process the big datasets produced by new high-throughput sequencing
technologies. Most existing algorithms are computationally demanding and often
cannot support the large size of new experimental data. We present a new motif
discovery algorithm that is built on a recent machine learning technique,
referred to as Method of Moments. Based on spectral decompositions, this method
is robust under model misspecification and is not prone to locally optimal
solutions. We obtain an algorithm that is extremely fast and designed for the
analysis of big sequencing data. In a few minutes, we can process datasets of
hundreds of thousands of sequences and extract motif profiles that match those
computed by various state-of-the-art algorithms.
Comment: 20 pages, 3 figures, 1 table