Sparse Inverse Covariance Estimation for Chordal Structures
In this paper, we consider the Graphical Lasso (GL), a popular optimization
problem for learning the sparse representations of high-dimensional datasets,
which is well-known to be computationally expensive for large-scale problems.
Recently, we have shown that the sparsity pattern of the optimal solution of GL
is equivalent to the one obtained from simply thresholding the sample
covariance matrix, for sparse graphs under different conditions. We have also
derived a closed-form solution that is optimal when the thresholded sample
covariance matrix has an acyclic structure. As a major generalization of the
previous result, in this paper we derive a closed-form solution for the GL for
graphs with chordal structures. We show that the GL and thresholding
equivalence conditions can be significantly simplified and are expected to hold
for high-dimensional problems if the thresholded sample covariance matrix has a
chordal structure. We then show that the GL and thresholding equivalence is
enough to reduce the GL to a maximum determinant matrix completion problem and
derive a recursive closed-form solution for the GL when the thresholded sample
covariance matrix has a chordal structure. For large-scale problems with up to
450 million variables, the proposed method can solve the GL problem in less
than 2 minutes, while the state-of-the-art methods converge in more than 2
hours.
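The thresholding step at the heart of this equivalence is simple to sketch. The snippet below thresholds a sample covariance matrix entrywise, builds the resulting support graph, and tests it for chordality; the threshold value, toy data, and use of NetworkX are illustrative assumptions, not the paper's exact equivalence conditions.

```python
import numpy as np
import networkx as nx

def thresholded_support(samples, tau):
    """Entrywise-threshold the sample covariance and return its support graph."""
    S = np.cov(samples, rowvar=False)      # sample covariance matrix
    mask = np.abs(S) > tau                 # keep entries with magnitude above tau
    np.fill_diagonal(mask, False)          # edges only; the diagonal is always retained
    return S, nx.from_numpy_array(mask.astype(int))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))         # 500 samples of 20 variables (toy data)
S, G = thresholded_support(X, tau=0.1)

# When the support is chordal, the abstract's recursive closed-form solution
# (via maximum determinant matrix completion) applies in place of iterative GL.
print("chordal support:", nx.is_chordal(G))
```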
Multivariate Generalized Gaussian Distribution: Convexity and Graphical Models
We consider covariance estimation in the multivariate generalized Gaussian
distribution (MGGD) and elliptically symmetric (ES) distribution. The maximum
likelihood optimization associated with this problem is non-convex, yet it has
been proved that its global solution can often be computed via simple fixed
point iterations. Our first contribution is a new analysis of this likelihood
based on geodesic convexity that requires weaker assumptions. Our second
contribution is a generalized framework for structured covariance estimation
under sparsity constraints. We show that these optimizations can be formulated as
convex minimizations as long as the MGGD shape parameter is larger than one half and
the sparsity pattern is chordal. These include, for example, maximum likelihood
estimation of banded inverse covariances in multivariate Laplace distributions,
which are associated with time-varying autoregressive processes.
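A minimal sketch of the simple fixed-point iterations mentioned above, using the standard MGGD maximum-likelihood scatter update Sigma <- (beta/n) sum_i (x_i' Sigma^{-1} x_i)^{beta-1} x_i x_i'; the stopping rule, initialization, and zero-mean assumption are illustrative, and this is the unconstrained update rather than the paper's sparsity-constrained estimator.

```python
import numpy as np

def mggd_fixed_point(X, beta, iters=100, tol=1e-8):
    """Fixed-point iteration for the MGGD ML scatter matrix (zero-mean samples).

    Update: Sigma <- (beta/n) * sum_i delta_i^(beta-1) x_i x_i^T,
    where delta_i = x_i^T Sigma^{-1} x_i. For beta = 1 (Gaussian) this is the
    usual second-moment estimate.
    """
    n, p = X.shape
    Sigma = np.eye(p)
    for _ in range(iters):
        # delta_i = x_i' Sigma^{-1} x_i for every sample at once
        delta = np.einsum('ij,jk,ik->i', X, np.linalg.inv(Sigma), X)
        w = beta * delta ** (beta - 1.0)                # per-sample weights
        Sigma_new = (X * w[:, None]).T @ X / n          # weighted scatter update
        if np.linalg.norm(Sigma_new - Sigma) < tol * np.linalg.norm(Sigma):
            return Sigma_new
        Sigma = Sigma_new
    return Sigma

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 5))
Sigma_hat = mggd_fixed_point(X, beta=0.75)   # beta > 1/2: the convex regime above
```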
Sparse approximations in spatio-temporal point-process models
Analysis of spatio-temporal point patterns plays an important role in several disciplines, yet inference in these systems remains computationally challenging due to the high resolution modelling generally required by large data sets and the analytically intractable likelihood function. Here, we exploit the sparsity structure of a fully-discretised log-Gaussian Cox process model by using expectation constrained approximate inference. The resulting family of expectation propagation algorithms scale well with the state dimension and the length of the temporal horizon with moderate loss in distributional accuracy. They hence provide a flexible and faster alternative to both the filtering-smoothing type algorithms and the approaches which implement the Laplace method or expectation propagation on (block) sparse latent Gaussian models. We demonstrate the use of the proposed method in the reconstruction of conflict intensity levels in Afghanistan from a WikiLeaks data set.
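For concreteness, here is a toy sketch of the fully-discretised log-Gaussian Cox process target that such approximate-inference schemes work with: cell counts are Poisson with rate area * exp(f) under a latent Gaussian field f. The grid, kernel, and hyperparameters are illustrative assumptions; the expectation propagation algorithm itself is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)
n_cells, area = 50, 1.0                          # 1-D grid of 50 unit-area cells
xs = np.arange(n_cells)[:, None].astype(float)
K = np.exp(-0.5 * (xs - xs.T) ** 2 / 5.0 ** 2)   # squared-exponential GP prior
L = np.linalg.cholesky(K + 1e-8 * np.eye(n_cells))
f = L @ rng.standard_normal(n_cells)             # latent log-intensity field
counts = rng.poisson(area * np.exp(f))           # simulated counts per cell

def log_joint(f, counts, K, area=1.0):
    """Unnormalised log of (Poisson likelihood x GP prior): the target that
    EP / Laplace schemes approximate. The likelihood factorises over cells,
    which is the sparsity structure being exploited."""
    loglik = np.sum(counts * f - area * np.exp(f))          # Poisson terms
    alpha = np.linalg.solve(K + 1e-8 * np.eye(len(f)), f)
    return loglik - 0.5 * f @ alpha                         # Gaussian prior term

print(log_joint(f, counts, K))
```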
GMRES-Accelerated ADMM for Quadratic Objectives
We consider the sequence acceleration problem for the alternating direction
method-of-multipliers (ADMM) applied to a class of equality-constrained
problems with strongly convex quadratic objectives, which frequently arise as
the Newton subproblem of interior-point methods. Within this context, the ADMM
update equations are linear, the iterates are confined within a Krylov
subspace, and the Generalized Minimal RESidual (GMRES) algorithm is optimal in its
ability to accelerate convergence. The basic ADMM method solves a
$\kappa$-conditioned problem in $O(\sqrt{\kappa})$ iterations. We give
theoretical justification and numerical evidence that the GMRES-accelerated
variant consistently solves the same problem in $O(\kappa^{1/4})$ iterations,
for an order-of-magnitude reduction in iterations, despite a worst-case bound
of $O(\sqrt{\kappa})$ iterations. The method is shown to be competitive against
standard preconditioned Krylov subspace methods for saddle-point problems. The
method is embedded within SeDuMi, a popular open-source solver for conic
optimization written in MATLAB, and used to solve many large-scale semidefinite
programs with error that decreases like $O(1/k^2)$, instead of $O(1/k)$,
where $k$ is the iteration index.
Comment: 31 pages, 7 figures. Accepted for publication in SIAM Journal on Optimization (SIOPT).
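The acceleration idea is easy to sketch: for a quadratic objective the ADMM update is an affine map z_{k+1} = M z_k + c, so its fixed point solves the linear system (I - M) z = c, which GMRES attacks directly at the same per-iteration cost. The toy contraction M below is an illustrative stand-in, not an actual ADMM operator or the paper's SeDuMi embedding.

```python
import numpy as np
from scipy.sparse.linalg import gmres, LinearOperator

rng = np.random.default_rng(3)
n = 200
A = rng.standard_normal((n, n)) / np.sqrt(n)
M = 0.9 * A / np.abs(np.linalg.eigvals(A)).max()   # contraction, spectral radius 0.9
c = rng.standard_normal(n)

# Plain fixed-point iteration, standing in for the linear ADMM update.
z = np.zeros(n)
for _ in range(500):
    z = M @ z + c

# GMRES on (I - M) z = c, matrix-free: each GMRES iteration costs one
# application of the map, the same as one fixed-point step.
op = LinearOperator((n, n), matvec=lambda v: v - M @ v)
z_gmres, info = gmres(op, c)

print("fixed point vs GMRES:", np.linalg.norm(z - z_gmres))
```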
Conservative Sparsification for Efficient Approximate Estimation
Linear Gaussian systems often exhibit sparse structures. For systems which grow as a function of time, marginalisation of past states will eventually introduce extra non-zero elements into the information matrix of the Gaussian distribution. These extra non-zeros can lead to dense problems as these systems progress through time. This thesis proposes a method that can delete elements of the information matrix while maintaining guarantees about the conservativeness of the resulting estimate, with a computational complexity that is a function of the connectivity of the graph rather than the problem dimension. This sparsification can be performed iteratively and minimises the Kullback-Leibler divergence (KLD) between the original and approximate distributions. This new technique is called Conservative Sparsification (CS). For large sparse graphs employing a Junction Tree (JT) for estimation, efficiency is related to the size of the largest clique. Conservative Sparsification can be applied to clique splitting in JTs, enabling approximate and efficient estimation in JTs with the same conservative guarantees as CS for information matrices. In distributed estimation scenarios which use JTs, CS can be performed in parallel and asynchronously on JT cliques. This approach usually results in a larger KLD compared with the optimal CS approach, but an upper bound on this increased divergence can be calculated with information locally available to each clique. This work has applications in large-scale distributed linear estimation problems where the size of the problem or communication overheads make optimal linear estimation difficult.
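A toy sketch of the conservativeness requirement described above: an edit to the information matrix Lambda is conservative when Lambda' <= Lambda in the positive-semidefinite order, so the approximate covariance dominates the exact one, and the cost of the edit is the Gaussian KLD. The diagonal-correction rule below is a simple sufficient construction for illustration, not the thesis's KLD-optimal Conservative Sparsification update.

```python
import numpy as np

def sparsify_conservative(Lam, i, j):
    """Zero Lam[i, j] and subtract |Lam[i, j]| from the two diagonals.
    The removed 2x2 block [[|a|, a], [a, |a|]] is PSD, so Lam' <= Lam and the
    edit is conservative (the approximate covariance dominates the exact one)."""
    Lam2 = Lam.copy()
    a = Lam2[i, j]
    Lam2[i, j] = Lam2[j, i] = 0.0
    Lam2[i, i] -= abs(a)
    Lam2[j, j] -= abs(a)
    return Lam2

def gaussian_kld(Lam_p, Lam_q):
    """KL(p || q) for zero-mean Gaussians given in information
    (inverse covariance) form."""
    n = Lam_p.shape[0]
    tr = np.trace(Lam_q @ np.linalg.inv(Lam_p))
    logdet = np.linalg.slogdet(Lam_p)[1] - np.linalg.slogdet(Lam_q)[1]
    return 0.5 * (tr - n + logdet)

rng = np.random.default_rng(4)
B = rng.standard_normal((6, 6))
Lam = B @ B.T + 6 * np.eye(6)        # a well-conditioned information matrix
Lam2 = sparsify_conservative(Lam, 0, 3)

print("min eig of (Lam - Lam2):", np.linalg.eigvalsh(Lam - Lam2).min())  # >= 0
print("KLD cost of the deletion:", gaussian_kld(Lam, Lam2))
```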