    Spectral Methods for Learning Multivariate Latent Tree Structure

    This work considers the problem of learning the structure of multivariate linear tree models, which include a variety of directed tree graphical models with continuous, discrete, and mixed latent variables, such as linear-Gaussian models, hidden Markov models, Gaussian mixture models, and Markov evolutionary trees. The setting is one where we only have samples from certain observed variables in the tree, and our goal is to estimate the tree structure (i.e., the graph of how the underlying hidden variables are connected to each other and to the observed variables). We propose the Spectral Recursive Grouping algorithm, an efficient and simple bottom-up procedure for recovering the tree structure from independent samples of the observed variables. Our finite sample size bounds for exact recovery of the tree structure reveal certain natural dependencies on statistical and structural properties of the underlying joint distribution. Furthermore, our sample complexity guarantees have no explicit dependence on the dimensionality of the observed variables, making the algorithm applicable to many high-dimensional settings. At the heart of our algorithm is a spectral quartet test for determining the relative topology of a quartet of variables from second-order statistics.
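
    To make the quartet idea concrete, the sketch below scores each of the three possible pairings of a quartet by the product of the top-k singular values of the within-pair empirical cross-covariance matrices and returns the highest-scoring pairing. This is only a minimal illustration of a spectral quartet test built from second-order statistics, not the paper's actual procedure: the choice of k, the confidence thresholds for abstaining, and all function names here are illustrative assumptions.

        import numpy as np

        def _top_k_sv_product(U, V, k):
            # Product of the top-k singular values of the empirical
            # cross-covariance matrix between two observed variables.
            cov = (U - U.mean(0)).T @ (V - V.mean(0)) / (U.shape[0] - 1)
            s = np.linalg.svd(cov, compute_uv=False)
            return float(np.prod(s[:k]))

        def spectral_quartet_test(X1, X2, X3, X4, k=1):
            # Each Xi is an (n_samples, d_i) array of samples of observed variable i.
            # Score the three possible pairings and return the highest-scoring one.
            scores = {
                ((1, 2), (3, 4)): _top_k_sv_product(X1, X2, k) * _top_k_sv_product(X3, X4, k),
                ((1, 3), (2, 4)): _top_k_sv_product(X1, X3, k) * _top_k_sv_product(X2, X4, k),
                ((1, 4), (2, 3)): _top_k_sv_product(X1, X4, k) * _top_k_sv_product(X2, X3, k),
            }
            return max(scores, key=scores.get)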

    Unfolding Latent Tree Structures using 4th Order Tensors

    Discovering the latent structure from many observed variables is an important yet challenging learning task. Existing approaches for discovering latent structures often require the unknown number of hidden states as an input. In this paper, we propose a quartet-based approach which is \emph{agnostic} to this number. The key contribution is a novel rank characterization of the tensor associated with the marginal distribution of a quartet. This characterization allows us to design a \emph{nuclear norm} based test for resolving quartet relations. We then use the quartet test as a subroutine in a divide-and-conquer algorithm for recovering the latent tree structure. Under mild conditions, the algorithm is consistent and its error probability decays exponentially with increasing sample size. We demonstrate that the proposed approach compares favorably to alternatives. On a real-world stock dataset, it also discovers meaningful groupings of variables and produces a model that fits the data better.
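
    As a rough illustration of a nuclear-norm quartet test in this spirit (a sketch under assumptions, not the paper's implementation): for four discrete variables one can estimate the 4th-order joint probability tensor, unfold it into a matrix for each candidate pairing, and prefer the pairing whose unfolding has the smallest nuclear norm, used here as a convex surrogate for rank. The thresholding needed for a consistent test is omitted, and the function names are illustrative.

        import numpy as np

        def empirical_quartet_tensor(data, n_states):
            # data: (n_samples, 4) integer array with entries in {0, ..., n_states-1}.
            T = np.zeros((n_states,) * 4)
            for row in data:
                T[tuple(row)] += 1.0
            return T / len(data)

        def unfolding_nuclear_norm(T, pair):
            # Unfold T so the two modes in `pair` index the rows and the
            # remaining two modes index the columns, then take the nuclear norm.
            rest = tuple(i for i in range(4) if i not in pair)
            M = np.transpose(T, pair + rest).reshape(T.shape[pair[0]] * T.shape[pair[1]], -1)
            return np.linalg.norm(M, ord='nuc')

        def quartet_relation(T):
            # Return the partner of variable 0 under the pairing whose
            # unfolding has the smallest nuclear norm.
            pairings = [(0, 1), (0, 2), (0, 3)]
            return min(pairings, key=lambda p: unfolding_nuclear_norm(T, p))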

    Learning loopy graphical models with latent variables: Efficient methods and guarantees

    The problem of structure estimation in graphical models with latent variables is considered. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider models where the underlying Markov graph is locally tree-like and the model is in the regime of correlation decay. For the special case of the Ising model, the number of samples $n$ required for structural consistency of our method scales as $n=\Omega(\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p)$, where $p$ is the number of variables, $\theta_{\min}$ is the minimum edge potential, $\delta$ is the depth (i.e., the distance from a hidden node to its nearest observed nodes), and $\eta$ is a parameter that depends on the bounds on the node and edge potentials in the Ising model. Necessary conditions for structural consistency under any algorithm are derived, and our method nearly matches the lower bound on sample requirements. Further, the proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph. Comment: Published at http://dx.doi.org/10.1214/12-AOS1070 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
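
    For a sense of how the stated sample complexity behaves, the following sketch evaluates the scaling $\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p$ for illustrative, hypothetical parameter values; the hidden absolute constants are dropped, so the outputs indicate relative scaling only, not actual sample sizes.

        import numpy as np

        def ising_sample_scaling(theta_min, delta, eta, p):
            # Scaling of n = Omega(theta_min^{-delta*eta*(eta+1) - 2} * log p)
            # from the abstract above; constants are omitted, so this is a
            # relative scaling, not a sample-size estimate.
            return theta_min ** (-delta * eta * (eta + 1) - 2) * np.log(p)

        # Hypothetical parameters: weaker minimum edge potentials or deeper
        # hidden nodes sharply increase the required number of samples.
        print(ising_sample_scaling(theta_min=0.2, delta=1, eta=2, p=1000))
        print(ising_sample_scaling(theta_min=0.1, delta=2, eta=2, p=1000))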