Learning Gaussian Graphical Models with Observed or Latent FVSs
Gaussian Graphical Models (GGMs) or Gauss Markov random fields are widely
used in many applications, and the trade-off between the modeling capacity and
the efficiency of learning and inference has been an important research
problem. In this paper, we study the family of GGMs with small feedback vertex
sets (FVSs), where an FVS is a set of nodes whose removal breaks all the
cycles. Exact inference such as computing the marginal distributions and the
partition function has complexity $O(k^2 n)$ using message-passing algorithms,
where k is the size of the FVS, and n is the total number of nodes. We propose
efficient structure learning algorithms for two cases: 1) All nodes are
observed, which is useful in modeling social or flight networks where the FVS
nodes often correspond to a small number of high-degree nodes, or hubs, while
the rest of the network is modeled by a tree. Regardless of the maximum
degree, without knowing the full graph structure, we can exactly compute the
maximum likelihood estimate in $O(kn^2 + n^2 \log n)$ time if the FVS is known or in
polynomial time if the FVS is unknown but has bounded size. 2) The FVS nodes
are latent variables, where structure learning is equivalent to decomposing an
inverse covariance matrix (exactly or approximately) into the sum of a
tree-structured matrix and a low-rank matrix. By incorporating efficient
inference into the learning steps, we can obtain a learning algorithm using
alternating low-rank corrections with complexity $O(kn^2 + n^2 \log n)$ per
iteration. We also perform experiments using both synthetic data and real
data of flight delays to demonstrate the modeling capacity with FVSs of
various sizes.
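
As a concrete illustration of why a small FVS makes Gaussian inference cheap, here is a minimal sketch (not the paper's implementation) of block elimination over a known FVS: the means are recovered from k + 1 solves against the tree part plus one small k-by-k solve. All matrices and sizes are illustrative, and the dense solves stand in for the O(n) tree message passing that makes the overall cost low.

```python
import numpy as np

# Hedged sketch of feedback-style Gaussian inference with a known FVS F.
# x ~ N(J^{-1} h, J^{-1}); removing the k FVS nodes leaves a tree, so each
# solve against J_T below could be done in O(n) by tree message passing.
# Here dense solves stand in for that step; all names are illustrative.

rng = np.random.default_rng(0)
n, k = 8, 2                                   # tree nodes, FVS nodes

# Tree-structured information matrix J_T (a chain) and couplings to the FVS.
J_T = (2.0 * np.eye(n) + np.diag(-0.5 * np.ones(n - 1), 1)
       + np.diag(-0.5 * np.ones(n - 1), -1))
J_FT = 0.3 * rng.standard_normal((k, n))
J_F = 3.0 * np.eye(k)
h_T, h_F = rng.standard_normal(n), rng.standard_normal(k)

# k + 1 "tree" solves: one for the potential vector, k for the FVS couplings.
x0 = np.linalg.solve(J_T, h_T)                # tree solve, original potentials
U = np.linalg.solve(J_T, J_FT.T)              # one tree solve per FVS node

# Exact k x k problem on the FVS (Schur complement), then feedback correction.
S = J_F - J_FT @ U                            # k x k, cheap since k is small
mu_F = np.linalg.solve(S, h_F - J_FT @ x0)
mu_T = x0 - U @ mu_F                          # corrected tree means

# Check against brute-force inference on the full model.
J = np.block([[J_F, J_FT], [J_FT.T, J_T]])
mu = np.linalg.solve(J, np.concatenate([h_F, h_T]))
assert np.allclose(mu, np.concatenate([mu_F, mu_T]))
```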
Bayesian Nonparametric Hidden Semi-Markov Models
There is much interest in the Hierarchical Dirichlet Process Hidden Markov
Model (HDP-HMM) as a natural Bayesian nonparametric extension of the ubiquitous
Hidden Markov Model for learning from sequential and time-series data. However,
in many settings the HDP-HMM's strict Markovian constraints are undesirable,
particularly if we wish to learn or encode non-geometric state durations. We
can extend the HDP-HMM to capture such structure by drawing upon
explicit-duration semi-Markovianity, which has been developed mainly in the
parametric frequentist setting, to allow construction of highly interpretable
models that admit natural prior information on state durations.
In this paper we introduce the explicit-duration Hierarchical Dirichlet
Process Hidden semi-Markov Model (HDP-HSMM) and develop sampling algorithms for
efficient posterior inference. These techniques also yield new approaches to
sampling inference in the finite Bayesian HSMM. Our modular Gibbs
sampling methods can be embedded in samplers for larger hierarchical Bayesian
models, adding semi-Markov chain modeling as another tool in the Bayesian
inference toolbox. We demonstrate the utility of the HDP-HSMM and our inference
methods on both synthetic and real experiments.
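
To make the semi-Markov structure concrete, the following sketch samples a state sequence with explicit (here Poisson) durations, the kind of non-geometric duration behavior the HDP-HSMM is designed to capture. The transition matrix and duration rates are illustrative stand-ins, not draws from the model's HDP prior.

```python
import numpy as np

# Hedged sketch: sampling a state sequence from an explicit-duration HSMM,
# the generative structure the HDP-HSMM places a nonparametric prior over.

rng = np.random.default_rng(1)
num_states, T = 3, 50
pi = np.full(num_states, 1.0 / num_states)    # initial state distribution
A = np.array([[0.0, 0.7, 0.3],                # self-transitions excluded:
              [0.4, 0.0, 0.6],                # durations are modeled
              [0.5, 0.5, 0.0]])               # explicitly instead
dur_rates = np.array([3.0, 7.0, 12.0])        # per-state Poisson rates

states, z = [], rng.choice(num_states, p=pi)
while len(states) < T:
    d = 1 + rng.poisson(dur_rates[z])         # explicit state duration
    states.extend([z] * d)                    # stay in state z for d steps
    z = rng.choice(num_states, p=A[z])        # then jump to a new state
print(np.array(states[:T]))
```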
Dirichlet Posterior Sampling with Truncated Multinomial Likelihoods
We consider the problem of drawing samples from posterior distributions
formed under a Dirichlet prior and a truncated multinomial likelihood, by which
we mean a Multinomial likelihood function where we condition on one or more
counts being zero a priori. Sampling this posterior distribution is of interest
in inference algorithms for hierarchical Bayesian models based on the Dirichlet
distribution or the Dirichlet process, particularly Gibbs sampling algorithms
for the Hierarchical Dirichlet Process Hidden Semi-Markov Model. We provide a
data augmentation sampling algorithm that is easy to implement, fast both to
mix and to execute, and easily scalable to many dimensions. We demonstrate the
algorithm's advantages over a generic Metropolis-Hastings sampling algorithm in
several numerical experiments.
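
For intuition about the target distribution, here is a sketch of the kind of generic Metropolis-Hastings baseline the abstract compares against; it is not the paper's data-augmentation sampler. It uses the fact that the posterior is proportional to Dir(alpha + counts) times a renormalization penalty from the truncation; all inputs below are illustrative.

```python
import numpy as np

# Hedged sketch: independence Metropolis-Hastings for the posterior under a
# Dirichlet(alpha) prior and a multinomial likelihood conditioned on the
# counts in `zero_set` being zero a priori.

rng = np.random.default_rng(2)
alpha = np.array([1.0, 1.0, 1.0, 1.0])
counts = np.array([5, 3, 0, 0])           # observed counts (zero on zero_set)
zero_set = np.array([2, 3])               # entries conditioned to be zero
N = counts.sum()
keep = np.setdiff1d(np.arange(len(alpha)), zero_set)

# Posterior ∝ Dir(pi; alpha + counts) * (sum_{j not in zero_set} pi_j)^{-N},
# so propose from Dir(alpha + counts) and correct for the last factor.
pi = rng.dirichlet(alpha + counts)
samples = []
for _ in range(5000):
    prop = rng.dirichlet(alpha + counts)
    log_acc = N * (np.log(pi[keep].sum()) - np.log(prop[keep].sum()))
    if np.log(rng.uniform()) < log_acc:
        pi = prop
    samples.append(pi)
print(np.mean(samples, axis=0))           # posterior mean estimate
```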
Linear Dimensionality Reduction for Margin-Based Classification: High-Dimensional Data and Sensor Networks
Low-dimensional statistics of measurements play an important role in detection problems, including those encountered in sensor networks. In this work, we focus on learning low-dimensional linear statistics of high-dimensional measurement data, along with decision rules defined in the low-dimensional space, for the case when the probability density of the measurements and class labels is not given but a training set of samples from this distribution is. We pose a joint optimization problem for linear dimensionality reduction and margin-based classification, and develop a coordinate descent algorithm on the Stiefel manifold for its solution. Although coordinate descent is not guaranteed to find the globally optimal solution, crucially, its alternating structure enables us to extend it to sensor networks with a message-passing approach requiring little communication. Linear dimensionality reduction prevents overfitting when learning from finite training data. In the sensor network setting, dimensionality reduction not only prevents overfitting but also reduces power consumption due to communication. The learned reduced-dimensional space and decision rule are shown to be consistent, and their Rademacher complexity is characterized. Experimental results are presented for a variety of datasets, including those from existing sensor networks, demonstrating the potential of our methodology in comparison with other dimensionality reduction approaches.
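
A minimal sketch of the alternating scheme described above, assuming plain subgradient steps for a hinge-loss classifier and a QR retraction for the Stiefel-manifold step; the data, step sizes, and iteration count are illustrative rather than the paper's algorithm.

```python
import numpy as np

# Hedged sketch: alternate between (a) fitting a margin-based (hinge-loss)
# classifier in the projected space and (b) a gradient step on the
# projection W followed by a QR retraction onto the Stiefel manifold
# {W : W^T W = I}. All quantities below are illustrative.

rng = np.random.default_rng(3)
D, d, m = 20, 2, 200                      # ambient dim, reduced dim, samples
X = rng.standard_normal((m, D))
y = np.sign(X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(m))

W = np.linalg.qr(rng.standard_normal((D, d)))[0]   # point on Stiefel manifold
w, b = np.zeros(d), 0.0                            # classifier, reduced space

for it in range(200):
    Z = X @ W
    viol = y * (Z @ w + b) < 1                     # hinge-loss violators
    # (a) subgradient step on the regularized hinge loss in w, b
    w -= 0.01 * (w - (y[viol, None] * Z[viol]).sum(0) / m)
    b += 0.01 * y[viol].sum() / m
    # (b) gradient step on W, then retract to the Stiefel manifold via QR
    G = -np.outer((y[viol, None] * X[viol]).sum(0), w) / m
    W = np.linalg.qr(W - 0.1 * G)[0]

print("training accuracy:", np.mean(np.sign(X @ W @ w + b) == y))
```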
Semidefinite descriptions of the convex hull of rotation matrices
We study the convex hull of $SO(n)$, thought of as the set of $n \times n$
orthogonal matrices with unit determinant, from the point of view of
semidefinite programming. We show that the convex hull of $SO(n)$ is doubly
spectrahedral, i.e. both it and its polar have a description as the
intersection of a cone of positive semidefinite matrices with an affine
subspace. Our spectrahedral representations are explicit, and are of minimum
size, in the sense that there are no smaller spectrahedral representations of
these convex bodies.
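
For intuition, the simplest case $n = 2$ already exhibits a spectrahedral description; this worked example is ours, not the paper's general construction.

```latex
% The convex hull of SO(2) is the set of matrices [[a, -b], [b, a]] with
% a^2 + b^2 <= 1 (the convex hull of the circle of rotations is the disk),
% and the disk constraint is exactly a 2x2 PSD condition:
\[
\operatorname{conv} SO(2)
  = \left\{ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}
    \;:\; \begin{pmatrix} 1 + a & b \\ b & 1 - a \end{pmatrix} \succeq 0 \right\},
\]
% since positive semidefiniteness of the second matrix holds if and only if
% its determinant 1 - a^2 - b^2 is nonnegative (which also forces |a| <= 1).
```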
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
The problem of learning forest-structured discrete graphical models from
i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu
tree through adaptive thresholding is proposed. It is shown that this algorithm
is both structurally consistent and risk consistent and the error probability
of structure learning decays faster than any polynomial in the number of
samples under fixed model size. For the high-dimensional scenario where the
size of the model d and the number of edges k scale with the number of samples
n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy
structural and risk consistencies. In addition, the extremal structures for
learning are identified; we prove that the independent (resp. tree) model is
the hardest (resp. easiest) to learn using the proposed algorithm in terms of
error rates for structure learning.
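
A minimal sketch of the pruning idea: build the Chow-Liu tree under empirical mutual information, then delete edges whose weight falls below a threshold to obtain a forest. The plug-in MI estimator and the fixed threshold here are illustrative; the paper's thresholding is adaptive.

```python
import numpy as np
from itertools import combinations
from scipy.sparse.csgraph import minimum_spanning_tree

def empirical_mi(x, y, bins=2):
    """Plug-in mutual information (in nats) of two discrete samples."""
    joint = np.histogram2d(x, y, bins=bins)[0] / len(x)
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(4)
n, d = 2000, 5
X = rng.integers(0, 2, size=(n, d))
X[:, 1] = X[:, 0] ^ (rng.random(n) < 0.1)      # true edge 0-1
X[:, 2] = X[:, 1] ^ (rng.random(n) < 0.1)      # true edge 1-2; 3, 4 isolated

W = np.zeros((d, d))
for i, j in combinations(range(d), 2):
    W[i, j] = empirical_mi(X[:, i], X[:, j])

# Max-weight spanning tree via negated weights, then threshold pruning.
tree = minimum_spanning_tree(-W).toarray()
threshold = 0.05                               # illustrative, not adaptive
forest = [(i, j) for i, j in zip(*np.nonzero(tree)) if -tree[i, j] > threshold]
print("learned forest edges:", forest)         # expect (0, 1) and (1, 2)
```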
Recursive FMP for distributed inference in Gaussian graphical models
For inference in Gaussian graphical models with cycles, loopy belief propagation (LBP) performs well for some graphs, but often diverges or converges slowly. Even when LBP does converge, the variance estimates are incorrect in general. The feedback message passing (FMP) algorithm has been proposed to enhance the convergence and accuracy of inference. In FMP, standard LBP is run twice on the subgraph excluding the pseudo-FVS (a set of nodes whose removal breaks the most crucial cycles), while nodes in the pseudo-FVS use a different protocol. In this paper, we propose recursive FMP, a purely distributed extension of FMP in which all nodes use the same message-passing protocol. An inference problem on the entire graph is recursively reduced to problems on smaller subgraphs in a distributed manner. One advantage of this recursive approach over FMP is that only one feedback node is active at a time, so centralized communication among feedback nodes can be replaced by message broadcasting from the single active feedback node. We characterize this algorithm using walk-sum analysis and provide theoretical results for convergence and accuracy. We also demonstrate its performance using both simulated models on grids and large-scale sea surface height anomaly data.
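
One ingredient of FMP is the choice of a small pseudo-FVS. The sketch below greedily removes nodes until the remaining normalized model is walk-summable, i.e. the spectral radius of $|R|$ drops below 1 for the normalized off-diagonal part $R$ of the information matrix; the greedy score used here is an illustrative heuristic, not necessarily the paper's selection rule.

```python
import numpy as np

# Hedged sketch: greedy pseudo-FVS selection driven by walk-summability.
# R = I - D^{-1/2} J D^{-1/2} collects the normalized partial correlations;
# the model restricted to `active` is walk-summable when rho(|R|) < 1.

def pseudo_fvs(J, max_size):
    D = np.sqrt(np.diag(J))
    R = np.eye(len(J)) - J / np.outer(D, D)
    active, chosen = list(range(len(J))), []
    while len(chosen) < max_size:
        sub = np.abs(R[np.ix_(active, active)])
        if np.max(np.abs(np.linalg.eigvals(sub))) < 1:
            break                        # remaining model is walk-summable
        node = active[int(np.argmax(sub.sum(axis=1)))]  # heaviest row of |R|
        chosen.append(node)
        active.remove(node)
    return chosen

rng = np.random.default_rng(5)
A = rng.standard_normal((10, 10))
J = A @ A.T + 0.1 * np.eye(10)           # a valid (PSD) information matrix
print("pseudo-FVS:", pseudo_fvs(J, max_size=3))
```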
High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion
We consider the problem of high-dimensional Gaussian graphical model
selection. We identify a set of graphs for which an efficient estimation
algorithm exists, and this algorithm is based on thresholding of empirical
conditional covariances. Under a set of transparent conditions, we establish
structural consistency (or sparsistency) for the proposed algorithm, when the
number of samples $n = \Omega(J_{\min}^{-2} \log p)$, where $p$ is the number of
variables and $J_{\min}$ is the minimum (absolute) edge potential of the graphical
model. The sufficient conditions for sparsistency are based on the notion of
walk-summability of the model and the presence of sparse local vertex
separators in the underlying graph. We also derive novel non-asymptotic
necessary conditions on the number of samples required for sparsistency
- …
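
A minimal sketch of the thresholding estimator described above, assuming a small bound on the candidate separator size; the chain model, threshold, and sample size are illustrative.

```python
import numpy as np
from itertools import combinations

# Hedged sketch: declare an edge (i, j) absent when some small candidate
# separator S makes |Cov(X_i, X_j | X_S)| fall below a threshold, exploiting
# sparse local vertex separators in the underlying graph.

def cond_cov(Sigma, i, j, S):
    """Gaussian conditional covariance of (X_i, X_j) given the set S."""
    S = list(S)
    if not S:
        return Sigma[i, j]
    B = Sigma[np.ix_([i, j], S)]
    return (Sigma[np.ix_([i, j], [i, j])]
            - B @ np.linalg.solve(Sigma[np.ix_(S, S)], B.T))[0, 1]

rng = np.random.default_rng(6)
p, n, eta, thresh = 6, 4000, 1, 0.05     # eta: max separator size searched
J = 2 * np.eye(p) - 0.4 * (np.abs(np.subtract.outer(range(p), range(p))) == 1)
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(J), size=n)
Sigma_hat = np.cov(X.T)

edges = []
for i, j in combinations(range(p), 2):
    rest = [v for v in range(p) if v not in (i, j)]
    score = min(abs(cond_cov(Sigma_hat, i, j, S))
                for k in range(eta + 1) for S in combinations(rest, k))
    if score > thresh:
        edges.append((i, j))
print("estimated edges:", edges)         # true model: chain (0,1), ..., (4,5)
```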