Learning Gaussian Graphical Models with Observed or Latent FVSs
Gaussian Graphical Models (GGMs) or Gauss Markov random fields are widely
used in many applications, and the trade-off between the modeling capacity and
the efficiency of learning and inference has been an important research
problem. In this paper, we study the family of GGMs with small feedback vertex
sets (FVSs), where an FVS is a set of nodes whose removal breaks all the
cycles. Exact inference such as computing the marginal distributions and the
partition function has complexity $O(k^2 n)$ using message-passing algorithms,
where k is the size of the FVS, and n is the total number of nodes. We propose
efficient structure learning algorithms for two cases: 1) All nodes are
observed, which is useful in modeling social or flight networks where the FVS
nodes often correspond to a small number of high-degree nodes, or hubs, while
the rest of the network is modeled by a tree. Regardless of the maximum
degree, without knowing the full graph structure, we can exactly compute the
maximum likelihood estimate in $O(kn^2 + n^2 \log n)$ time if the FVS is known or in
polynomial time if the FVS is unknown but has bounded size. 2) The FVS nodes
are latent variables, where structure learning is equivalent to decomposing an
inverse covariance matrix (exactly or approximately) into the sum of a
tree-structured matrix and a low-rank matrix. By incorporating efficient
inference into the learning steps, we can obtain a learning algorithm using
alternating low-rank corrections with complexity $O(kn^2 + n^2 \log n)$ per
iteration. We also perform experiments using both synthetic data and real
data of flight delays to demonstrate the modeling capacity with FVSs of
various sizes.
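
As a concrete illustration of why a small FVS makes Gaussian inference cheap, here is a minimal sketch (not the paper's implementation) of block elimination over a known FVS: the means are recovered from k + 1 solves against the tree part plus one small k-by-k solve. All matrices and sizes are illustrative, and the dense solves stand in for the O(n) tree message passing that makes the overall cost low.

```python
import numpy as np

# Hedged sketch of feedback-style Gaussian inference with a known FVS F.
# x ~ N(J^{-1} h, J^{-1}); removing the k FVS nodes leaves a tree, so each
# solve against J_T below could be done in O(n) by tree message passing.
# Here dense solves stand in for that step; all names are illustrative.

rng = np.random.default_rng(0)
n, k = 8, 2                                   # tree nodes, FVS nodes

# Tree-structured information matrix J_T (a chain) and couplings to the FVS.
J_T = (2.0 * np.eye(n) + np.diag(-0.5 * np.ones(n - 1), 1)
       + np.diag(-0.5 * np.ones(n - 1), -1))
J_FT = 0.3 * rng.standard_normal((k, n))
J_F = 3.0 * np.eye(k)
h_T, h_F = rng.standard_normal(n), rng.standard_normal(k)

# k + 1 "tree" solves: one for the potential vector, k for the FVS couplings.
x0 = np.linalg.solve(J_T, h_T)                # tree solve, original potentials
U = np.linalg.solve(J_T, J_FT.T)              # one tree solve per FVS node

# Exact k x k problem on the FVS (Schur complement), then feedback correction.
S = J_F - J_FT @ U                            # k x k, cheap since k is small
mu_F = np.linalg.solve(S, h_F - J_FT @ x0)
mu_T = x0 - U @ mu_F                          # corrected tree means

# Check against brute-force inference on the full model.
J = np.block([[J_F, J_FT], [J_FT.T, J_T]])
mu = np.linalg.solve(J, np.concatenate([h_F, h_T]))
assert np.allclose(mu, np.concatenate([mu_F, mu_T]))
```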
Bayesian Nonparametric Hidden Semi-Markov Models
There is much interest in the Hierarchical Dirichlet Process Hidden Markov
Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous
Hidden Markov Model for learning from sequential and time-series data. However,
in many settings the HDP-HMM's strict Markovian constraints are undesirable,
particularly if we wish to learn or encode non-geometric state durations. We
can extend the HDP-HMM to capture such structure by drawing upon
explicit-duration semi-Markovianity, which has been developed mainly in the
parametric frequentist setting, to allow construction of highly interpretable
models that admit natural prior information on state durations.
In this paper we introduce the explicit-duration Hierarchical Dirichlet
Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for
efficient posterior inference. These techniques also yield new approaches to
sampling inference in the finite Bayesian HSMM. Our modular Gibbs
sampling methods can be embedded in samplers for larger hierarchical Bayesian
models, adding semi-Markov chain modeling as another tool in the Bayesian
inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference
methods on both synthetic and real experiments.
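
To make the semi-Markov structure concrete, the following sketch samples a state sequence with explicit (here Poisson) durations, the kind of non-geometric duration behavior the HDP-HSMM is designed to capture. The transition matrix and duration rates are illustrative stand-ins, not draws from the model's HDP prior.

```python
import numpy as np

# Hedged sketch: sampling a state sequence from an explicit-duration HSMM,
# the generative structure the HDP-HSMM places a nonparametric prior over.

rng = np.random.default_rng(1)
num_states, T = 3, 50
pi = np.full(num_states, 1.0 / num_states)    # initial state distribution
A = np.array([[0.0, 0.7, 0.3],                # self-transitions excluded:
              [0.4, 0.0, 0.6],                # durations are modeled
              [0.5, 0.5, 0.0]])               # explicitly instead
dur_rates = np.array([3.0, 7.0, 12.0])        # per-state Poisson rates

states, z = [], rng.choice(num_states, p=pi)
while len(states) < T:
    d = 1 + rng.poisson(dur_rates[z])         # explicit state duration
    states.extend([z] * d)                    # stay in state z for d steps
    z = rng.choice(num_states, p=A[z])        # then jump to a new state
print(np.array(states[:T]))
```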
Dirichlet Posterior Sampling with Truncated Multinomial Likelihoods
We consider the problem of drawing samples from posterior distributions
formed under a Dirichlet prior and a truncated multinomial likelihood, by which
we mean a Multinomial likelihood function where we condition on one or more
counts being zero a priori. Sampling this posterior distribution is of interest
in inference algorithms for hierarchical Bayesian models based on the Dirichlet
distribution or the Dirichlet process, particularly Gibbs sampling algorithms
for the Hierarchical Dirichlet Process Hidden Semi-Markov Model. We provide a
data augmentation sampling algorithm that is easy to implement, fast both to
mix and to execute, and easily scalable to many dimensions. We demonstrate the
algorithm's advantages over a generic Metropolis-Hastings sampling algorithm in
several numerical experiments.
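
For intuition about the target distribution, here is a sketch of the kind of generic Metropolis-Hastings baseline the abstract compares against; it is not the paper's data-augmentation sampler. It uses the fact that the posterior is proportional to Dir(alpha + counts) times a renormalization penalty from the truncation; all inputs below are illustrative.

```python
import numpy as np

# Hedged sketch: independence Metropolis-Hastings for the posterior under a
# Dirichlet(alpha) prior and a multinomial likelihood conditioned on the
# counts in `zero_set` being zero a priori.

rng = np.random.default_rng(2)
alpha = np.array([1.0, 1.0, 1.0, 1.0])
counts = np.array([5, 3, 0, 0])           # observed counts (zero on zero_set)
zero_set = np.array([2, 3])               # entries conditioned to be zero
N = counts.sum()
keep = np.setdiff1d(np.arange(len(alpha)), zero_set)

# Posterior ∝ Dir(pi; alpha + counts) * (sum_{j not in zero_set} pi_j)^{-N},
# so propose from Dir(alpha + counts) and correct for the last factor.
pi = rng.dirichlet(alpha + counts)
samples = []
for _ in range(5000):
    prop = rng.dirichlet(alpha + counts)
    log_acc = N * (np.log(pi[keep].sum()) - np.log(prop[keep].sum()))
    if np.log(rng.uniform()) < log_acc:
        pi = prop
    samples.append(pi)
print(np.mean(samples, axis=0))           # posterior mean estimate
```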
Linear Dimensionality Reduction for Margin-Based Classification: High-Dimensional Data and Sensor Networks
Low-dimensional statistics of measurements play an important role in detection problems, including those encountered in sensor networks. In this work, we focus on learning low-dimensional linear statistics of high-dimensional measurement data, along with decision rules defined in the low-dimensional space, for the case when the probability density of the measurements and class labels is not given but a training set of samples from this distribution is. We pose a joint optimization problem for linear dimensionality reduction and margin-based classification, and develop a coordinate descent algorithm on the Stiefel manifold for its solution. Although coordinate descent is not guaranteed to find the globally optimal solution, crucially, its alternating structure enables us to extend it to sensor networks with a message-passing approach requiring little communication. Linear dimensionality reduction prevents overfitting when learning from finite training data. In the sensor network setting, dimensionality reduction not only prevents overfitting but also reduces power consumption due to communication. The learned reduced-dimensional space and decision rule are shown to be consistent, and their Rademacher complexity is characterized. Experimental results are presented for a variety of datasets, including those from existing sensor networks, demonstrating the potential of our methodology in comparison with other dimensionality reduction approaches.
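
A minimal sketch of the alternating scheme described above, assuming plain subgradient steps for a hinge-loss classifier and a QR retraction for the Stiefel-manifold step; the data, step sizes, and iteration count are illustrative rather than the paper's algorithm.

```python
import numpy as np

# Hedged sketch: alternate between (a) fitting a margin-based (hinge-loss)
# classifier in the projected space and (b) a gradient step on the
# projection W followed by a QR retraction onto the Stiefel manifold
# {W : W^T W = I}. All quantities below are illustrative.

rng = np.random.default_rng(3)
D, d, m = 20, 2, 200                      # ambient dim, reduced dim, samples
X = rng.standard_normal((m, D))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(m))

W = np.linalg.qr(rng.standard_normal((D, d)))[0]   # point on Stiefel manifold
w, b = np.zeros(d), 0.0                            # classifier, reduced space

for it in range(200):
    Z = X @ W
    viol = y * (Z @ w + b) < 1                     # hinge-loss violators
    # (a) subgradient step on the regularized hinge loss in w, b
    w -= 0.01 * (w - (y[viol, None] * Z[viol]).sum(0) / m)
    b += 0.01 * y[viol].sum() / m
    # (b) gradient step on W, then retract to the Stiefel manifold via QR
    G = -np.outer((y[viol, None] * X[viol]).sum(0), w) / m
    W = np.linalg.qr(W - 0.1 * G)[0]

print("training accuracy:", np.mean(np.sign(X @ W @ w + b) == y))
```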
Semidefinite descriptions of the convex hull of rotation matrices
We study the convex hull of $SO(n)$, thought of as the set of $n \times n$
orthogonal matrices with unit determinant, from the point of view of
semidefinite programming. We show that the convex hull of $SO(n)$ is doubly
spectrahedral, i.e. both it and its polar have a description as the
intersection of a cone of positive semidefinite matrices with an affine
subspace. Our spectrahedral representations are explicit, and are of minimum
size, in the sense that there are no smaller spectrahedral representations of
these convex bodies.
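
For intuition, the simplest case $n = 2$ already exhibits a spectrahedral description; this worked example is ours, not the paper's general construction.

```latex
% The convex hull of SO(2) is the set of matrices [[a, -b], [b, a]] with
% a^2 + b^2 <= 1 (the convex hull of the circle of rotations is the disk),
% and the disk constraint is exactly a 2x2 PSD condition:
\[
\operatorname{conv} SO(2)
  = \left\{ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}
    \;:\; \begin{pmatrix} 1 + a & b \\ b & 1 - a \end{pmatrix} \succeq 0 \right\},
\]
% since positive semidefiniteness of the second matrix holds if and only if
% its determinant 1 - a^2 - b^2 is nonnegative (which also forces |a| <= 1).
```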
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
The problem of learning forest-structured discrete graphical models from
i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu
tree through adaptive thresholding is proposed. It is shown that this algorithm
is both structurally consistent and risk consistent and the error probability
of structure learning decays faster than any polynomial in the number of
samples under fixed model size. For the high-dimensional scenario where the
size of the model d and the number of edges k scale with the number of samples
n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy
structural and risk consistencies. In addition, the extremal structures for
learning are identified; we prove that the independent (resp. tree) model is
the hardest (resp. easiest) to learn using the proposed algorithm in terms of
error rates for structure learning.
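
A minimal sketch of the pruning idea: build the Chow-Liu tree under empirical mutual information, then delete edges whose weight falls below a threshold to obtain a forest. The plug-in MI estimator and the fixed threshold here are illustrative; the paper's thresholding is adaptive.

```python
import numpy as np
from itertools import combinations
from scipy.sparse.csgraph import minimum_spanning_tree

def empirical_mi(x, y, bins=2):
    """Plug-in mutual information (in nats) of two discrete samples."""
    joint = np.histogram2d(x, y, bins=bins)[0] / len(x)
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(4)
n, d = 2000, 5
X = rng.integers(0, 2, size=(n, d))
X[:, 1] = X[:, 0] ^ (rng.random(n) < 0.1)      # true edge 0-1
X[:, 2] = X[:, 1] ^ (rng.random(n) < 0.1)      # true edge 1-2; 3, 4 isolated

W = np.zeros((d, d))
for i, j in combinations(range(d), 2):
    W[i, j] = empirical_mi(X[:, i], X[:, j])

# Max-weight spanning tree via negated weights, then threshold pruning.
tree = minimum_spanning_tree(-W).toarray()
threshold = 0.05                               # illustrative, not adaptive
forest = [(i, j) for i, j in zip(*np.nonzero(tree)) if -tree[i, j] > threshold]
print("learned forest edges:", forest)         # expect (0, 1) and (1, 2)
```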
Recursive FMP for distributed inference in Gaussian graphical models
For inference in Gaussian graphical models with cycles, loopy belief propagation (LBP) performs well for some graphs, but often diverges or converges slowly. Even when LBP does converge, the variance estimates are incorrect in general. The feedback message passing (FMP) algorithm has been proposed to enhance the convergence and accuracy of inference. In FMP, standard LBP is run twice on the subgraph excluding the pseudo-FVS (a set of nodes whose removal breaks the most crucial cycles), while nodes in the pseudo-FVS use a different protocol. In this paper, we propose recursive FMP, a purely distributed extension of FMP in which all nodes use the same message-passing protocol. An inference problem on the entire graph is recursively reduced to problems on smaller subgraphs in a distributed manner. One advantage of this recursive approach over FMP is that only one feedback node is active at a time, so centralized communication among feedback nodes can be replaced by message broadcasting from the single active feedback node. We characterize this algorithm using walk-sum analysis and provide theoretical results for convergence and accuracy. We also demonstrate its performance using both simulated models on grids and large-scale sea surface height anomaly data.
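
One ingredient of FMP is the choice of a small pseudo-FVS. The sketch below greedily removes nodes until the remaining normalized model is walk-summable, i.e. the spectral radius of $|R|$ drops below 1 for the normalized off-diagonal part $R$ of the information matrix; the greedy score used here is an illustrative heuristic, not necessarily the paper's selection rule.

```python
import numpy as np

# Hedged sketch: greedy pseudo-FVS selection driven by walk-summability.
# R = I - D^{-1/2} J D^{-1/2} collects the normalized partial correlations;
# the model restricted to `active` is walk-summable when rho(|R|) < 1.

def pseudo_fvs(J, max_size):
    D = np.sqrt(np.diag(J))
    R = np.eye(len(J)) - J / np.outer(D, D)
    active, chosen = list(range(len(J))), []
    while len(chosen) < max_size:
        sub = np.abs(R[np.ix_(active, active)])
        if np.max(np.abs(np.linalg.eigvals(sub))) < 1:
            break                        # remaining model is walk-summable
        node = active[int(np.argmax(sub.sum(axis=1)))]  # heaviest row of |R|
        chosen.append(node)
        active.remove(node)
    return chosen

rng = np.random.default_rng(5)
A = rng.standard_normal((10, 10))
J = A @ A.T + 0.1 * np.eye(10)           # a valid (PSD) information matrix
print("pseudo-FVS:", pseudo_fvs(J, max_size=3))
```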
High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion
We consider the problem of high-dimensional Gaussian graphical model
selection. We identify a set of graphs for which an efficient estimation
algorithm exists, and this algorithm is based on thresholding of empirical
conditional covariances. Under a set of transparent conditions, we establish
structural consistency (or sparsistency) for the proposed algorithm, when the
number of samples $n = \Omega(J_{\min}^{-2} \log p)$, where $p$ is the number of
variables and $J_{\min}$ is the minimum (absolute) edge potential of the graphical
model. The sufficient conditions for sparsistency are based on the notion of
walk-summability of the model and the presence of sparse local vertex
separators in the underlying graph. We also derive novel non-asymptotic
necessary conditions on the number of samples required for sparsistency
- …
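
A minimal sketch of the thresholding estimator described above, assuming a small bound on the candidate separator size; the chain model, threshold, and sample size are illustrative.

```python
import numpy as np
from itertools import combinations

# Hedged sketch: declare an edge (i, j) absent when some small candidate
# separator S makes |Cov(X_i, X_j | X_S)| fall below a threshold, exploiting
# sparse local vertex separators in the underlying graph.

def cond_cov(Sigma, i, j, S):
    """Gaussian conditional covariance of (X_i, X_j) given the set S."""
    S = list(S)
    if not S:
        return Sigma[i, j]
    B = Sigma[np.ix_([i, j], S)]
    return (Sigma[np.ix_([i, j], [i, j])]
            - B @ np.linalg.solve(Sigma[np.ix_(S, S)], B.T))[0, 1]

rng = np.random.default_rng(6)
p, n, eta, thresh = 6, 4000, 1, 0.05     # eta: max separator size searched
J = 2 * np.eye(p) - 0.4 * (np.abs(np.subtract.outer(range(p), range(p))) == 1)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(J), size=n)
Sigma_hat = np.cov(X.T)

edges = []
for i, j in combinations(range(p), 2):
    rest = [v for v in range(p) if v not in (i, j)]
    score = min(abs(cond_cov(Sigma_hat, i, j, S))
                for k in range(eta + 1) for S in combinations(rest, k))
    if score > thresh:
        edges.append((i, j))
print("estimated edges:", edges)         # true model: chain (0,1), ..., (4,5)
```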