8,286 research outputs found
Spectral Methods for Learning Multivariate Latent Tree Structure
This work considers the problem of learning the structure of multivariate
linear tree models, which include a variety of directed tree graphical models
with continuous, discrete, and mixed latent variables such as linear-Gaussian
models, hidden Markov models, Gaussian mixture models, and Markov evolutionary
trees. The setting is one where we only have samples from certain observed
variables in the tree, and our goal is to estimate the tree structure (i.e.,
the graph of how the underlying hidden variables are connected to each other
and to the observed variables). We propose the Spectral Recursive Grouping
algorithm, an efficient and simple bottom-up procedure for recovering the tree
structure from independent samples of the observed variables. Our finite sample
size bounds for exact recovery of the tree structure reveal certain natural
dependencies on underlying statistical and structural properties of the
underlying joint distribution. Furthermore, our sample complexity guarantees
have no explicit dependence on the dimensionality of the observed variables,
making the algorithm applicable to many high-dimensional settings. At the heart
of our algorithm is a spectral quartet test for determining the relative
topology of a quartet of variables from second-order statistics
Unfolding Latent Tree Structures using 4th Order Tensors
Discovering the latent structure from many observed variables is an important
yet challenging learning task. Existing approaches for discovering latent
structures often require the unknown number of hidden states as an input. In
this paper, we propose a quartet based approach which is \emph{agnostic} to
this number. The key contribution is a novel rank characterization of the
tensor associated with the marginal distribution of a quartet. This
characterization allows us to design a \emph{nuclear norm} based test for
resolving quartet relations. We then use the quartet test as a subroutine in a
divide-and-conquer algorithm for recovering the latent tree structure. Under
mild conditions, the algorithm is consistent and its error probability decays
exponentially with increasing sample size. We demonstrate that the proposed
approach compares favorably to alternatives. In a real world stock dataset, it
also discovers meaningful groupings of variables, and produces a model that
fits the data better
Learning loopy graphical models with latent variables: Efficient methods and guarantees
The problem of structure estimation in graphical models with latent variables
is considered. We characterize conditions for tractable graph estimation and
develop efficient methods with provable guarantees. We consider models where
the underlying Markov graph is locally tree-like, and the model is in the
regime of correlation decay. For the special case of the Ising model, the
number of samples required for structural consistency of our method scales
as , where p is the
number of variables, is the minimum edge potential, is
the depth (i.e., distance from a hidden node to the nearest observed nodes),
and is a parameter which depends on the bounds on node and edge
potentials in the Ising model. Necessary conditions for structural consistency
under any algorithm are derived and our method nearly matches the lower bound
on sample requirements. Further, the proposed method is practical to implement
and provides flexibility to control the number of latent variables and the
cycle lengths in the output graph.Comment: Published in at http://dx.doi.org/10.1214/12-AOS1070 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
- …