Alternating Direction Methods for Latent Variable Gaussian Graphical Model Selection
Chandrasekaran, Parrilo and Willsky (2010) proposed a convex optimization
problem to characterize graphical model selection in the presence of unobserved
variables. This convex optimization problem aims to estimate an inverse
covariance matrix that can be decomposed into a sparse matrix minus a low-rank
matrix from sample data. Solving this convex optimization problem is very
challenging, especially for large problems. In this paper, we propose two
alternating direction methods for solving this problem. The first method is to
apply the classical alternating direction method of multipliers to solve the
problem as a consensus problem. The second method is a proximal gradient based
alternating direction method of multipliers. Our methods exploit and take
advantage of the special structure of the problem and thus can solve large
problems very efficiently. A global convergence result is established for the
proposed methods. Numerical results on both synthetic data and gene expression
data show that our methods usually solve problems with one million variables in
one to two minutes, and are usually five to thirty-five times faster than a
state-of-the-art Newton-CG proximal point algorithm.
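The sparse-minus-low-rank estimator described in this abstract can be sketched with a single basic ADMM loop: split the constraint R = S - L, update R in closed form via an eigendecomposition, soft-threshold S, and project L onto the PSD cone with a trace penalty. This is an illustrative simplification, not the paper's specialized methods; the penalty weights `alpha`, `beta`, and `rho` are assumed hyperparameters.

```python
# Hedged sketch of ADMM for latent-variable Gaussian graphical model selection:
# estimate an inverse covariance R = S - L with S sparse and L low-rank PSD.
import numpy as np

def lvggm_admm(Sigma, alpha=0.1, beta=0.1, rho=1.0, n_iter=200):
    p = Sigma.shape[0]
    S = np.eye(p); L = np.zeros((p, p)); U = np.zeros((p, p))
    for _ in range(n_iter):
        # R-update: -logdet(R) + tr(Sigma R) + (rho/2)||R - (S - L - U)||^2
        # has a closed form via the eigendecomposition below.
        a, Q = np.linalg.eigh(S - L - U - Sigma / rho)
        r = (a + np.sqrt(a**2 + 4.0 / rho)) / 2.0    # always positive roots
        R = (Q * r) @ Q.T
        # S-update: elementwise soft-thresholding (prox of the l1 penalty)
        M = R + L + U
        S = np.sign(M) * np.maximum(np.abs(M) - alpha / rho, 0.0)
        # L-update: trace penalty, then projection onto the PSD cone
        w, V = np.linalg.eigh(S - R - U - (beta / rho) * np.eye(p))
        L = (V * np.maximum(w, 0.0)) @ V.T
        # dual update on the split constraint R - S + L = 0
        U = U + R - S + L
    return R, S, L

# toy usage: sample covariance from random Gaussian data
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
Sigma = np.cov(X, rowvar=False)
R, S, L = lvggm_admm(Sigma)
```

By construction the R-update keeps the estimated inverse covariance positive definite, and the L-update keeps the low-rank term positive semidefinite.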
Network Inference via the Time-Varying Graphical Lasso
Many important problems can be modeled as a system of interconnected
entities, where each entity is recording time-dependent observations or
measurements. In order to spot trends, detect anomalies, and interpret the
temporal dynamics of such data, it is essential to understand the relationships
between the different entities and how these relationships evolve over time. In
this paper, we introduce the time-varying graphical lasso (TVGL), a method of
inferring time-varying networks from raw time series data. We cast the problem
in terms of estimating a sparse time-varying inverse covariance matrix, which
reveals a dynamic network of interdependencies between the entities. Since
dynamic network inference is a computationally expensive task, we derive a
scalable message-passing algorithm based on the Alternating Direction Method of
Multipliers (ADMM) to solve this problem in an efficient way. We also discuss
several extensions, including a streaming algorithm to update the model and
incorporate new observations in real time. Finally, we evaluate our TVGL
algorithm on both real and synthetic datasets, obtaining interpretable results
and outperforming state-of-the-art baselines in terms of both accuracy and
scalability.
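A much-simplified baseline for the problem this abstract addresses is to run a graphical lasso independently on each time window of the series. The sketch below does exactly that with a basic ADMM loop; the actual TVGL additionally couples consecutive precision matrices with a temporal penalty solved by scalable message-passing ADMM, which this illustration omits. `alpha`, `rho`, and the window size are assumed hyperparameters.

```python
# Hedged sketch: per-window sparse inverse covariance estimation as a
# simplified stand-in for time-varying network inference (no temporal
# coupling penalty, unlike TVGL).
import numpy as np

def graphical_lasso_admm(Sigma, alpha=0.1, rho=1.0, n_iter=100):
    """min -logdet(Theta) + tr(Sigma Theta) + alpha*||Z||_1  s.t. Theta = Z."""
    p = Sigma.shape[0]
    Z = np.eye(p); U = np.zeros((p, p))
    for _ in range(n_iter):
        a, Q = np.linalg.eigh(Z - U - Sigma / rho)     # Theta-update eigen-solve
        t = (a + np.sqrt(a**2 + 4.0 / rho)) / 2.0
        Theta = (Q * t) @ Q.T
        M = Theta + U                                   # Z-update: soft threshold
        Z = np.sign(M) * np.maximum(np.abs(M) - alpha / rho, 0.0)
        U = U + Theta - Z                               # dual update
    return Z

def sliding_window_networks(X, window=50):
    """One sparse precision matrix per non-overlapping window of rows of X."""
    return [graphical_lasso_admm(np.cov(X[t:t + window], rowvar=False))
            for t in range(0, len(X) - window + 1, window)]

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 8))       # 200 time steps, 8 entities
nets = sliding_window_networks(X)       # 4 windows of 50 samples each
```

Comparing the sparsity patterns of consecutive entries of `nets` gives a crude picture of how the dependency network evolves; TVGL's temporal penalty makes those transitions smooth or piecewise-constant by design.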
Global Analysis of Gene Expression and Projection Target Correlations in the Mouse Brain
Recent studies have shown that projection targets in the mouse neocortex are correlated with their gene expression patterns. However, a brain-wide quantitative analysis of the relationship between voxel genetic composition and projection targets has been lacking to date. Here we extended those studies to perform a global, integrative analysis of gene expression and projection target correlations in the mouse brain. Using the Allen Brain Atlas data, we analyzed the relationship between gene expression and projection targets. We first visualized and clustered the two data sets separately and showed that they both exhibit strong spatial autocorrelation. Building upon this initial analysis, we conducted an integrative correlation analysis of the two data sets while correcting for their spatial autocorrelation. This resulted in a correlation of 0.19 with a significant p-value. We further identified the top genes responsible for this correlation using two greedy gene ranking techniques. Using only the top genes identified by those techniques, we recomputed the correlation between these two data sets. This led to correlation values of up to 0.49 with significant p-values. Our results illustrate that although the target specificity of neurons is complex and diverse, it is strongly shaped by their genetic and molecular composition.
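The greedy-ranking idea described in this abstract can be sketched as forward selection: repeatedly add the gene whose inclusion most increases the correlation between voxel-voxel similarity computed from expression and from projection targets. The spatial-autocorrelation correction used in the study is omitted here, and all data shapes and function names are illustrative assumptions, not the paper's actual pipeline.

```python
# Hedged sketch of greedy gene ranking by similarity-structure correlation
# (a Mantel-style comparison of two voxel-by-voxel similarity patterns).
import numpy as np

def voxel_similarity(M):
    """Negative pairwise Euclidean distances between voxel profiles (upper triangle)."""
    D = np.sqrt(((M[:, None, :] - M[None, :, :]) ** 2).sum(-1))
    iu = np.triu_indices_from(D, k=1)
    return -D[iu]

def greedy_gene_ranking(expr, proj, k=3):
    """expr: voxels x genes, proj: voxels x targets. Returns top-k gene indices."""
    target = voxel_similarity(proj)
    chosen = []
    for _ in range(k):
        best, best_r = None, -np.inf
        for g in range(expr.shape[1]):
            if g in chosen:
                continue
            sim = voxel_similarity(expr[:, chosen + [g]])
            r = np.corrcoef(sim, target)[0, 1]
            if r > best_r:
                best, best_r = g, r
        chosen.append(best)      # keep the gene that raises the correlation most
    return chosen

rng = np.random.default_rng(2)
proj = rng.standard_normal((30, 5))           # 30 voxels, 5 projection targets
expr = rng.standard_normal((30, 10))          # 30 voxels, 10 genes
expr[:, 0] = proj[:, 0] + 0.1 * rng.standard_normal(30)  # gene 0 tracks projections
ranking = greedy_gene_ranking(expr, proj, k=2)
```

On real atlas data the correlation would additionally need a null model that respects spatial autocorrelation, since nearby voxels are similar in both modalities for trivial spatial reasons.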
Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm
Graphical models have demonstrated their effectiveness in numerous tasks ranging
from biological analysis to recommender systems. However, graphical models with
hub nodes are computationally difficult to fit, particularly when the dimension
of the data is large. To efficiently estimate the hub graphical models, we
introduce a two-phase algorithm. The proposed algorithm first generates a good
initial point via a dual alternating direction method of multipliers (ADMM),
and then warm starts a semismooth Newton (SSN) based augmented Lagrangian
method (ALM) to compute a solution that is accurate enough for practical tasks.
The sparsity structure of the generalized Jacobian ensures that the algorithm
can obtain an accurate solution very efficiently. Comprehensive experiments on both
synthetic data and real data show that it clearly outperforms existing
state-of-the-art algorithms. In particular, in some high-dimensional tasks, it
can save more than 70% of the execution time while still achieving a
high-quality estimation.
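The two-phase pattern described in this abstract, a cheap first-order phase that warm-starts a fast second-order phase, can be illustrated on a simple smooth problem. The toy below uses gradient descent followed by Newton's method on regularized logistic regression; it does not reproduce the paper's actual dual ADMM and SSN-based ALM for the nonsmooth hub graphical lasso, and all step counts are assumptions.

```python
# Hedged toy of a two-phase warm-start scheme: first-order iterations get
# near the solution cheaply, then Newton steps refine it rapidly.
import numpy as np

def grad_hess(x, A, y, lam=1.0):
    p = 1.0 / (1.0 + np.exp(-(A @ x)))                   # sigmoid predictions
    g = A.T @ (p - y) + lam * x                          # regularized gradient
    H = (A.T * (p * (1 - p))) @ A + lam * np.eye(len(x)) # regularized Hessian
    return g, H

def two_phase(A, y, gd_steps=100, newton_steps=5):
    x = np.zeros(A.shape[1])
    lr = 1.0 / (0.25 * np.linalg.norm(A, 2) ** 2 + 1.0)  # 1/L step size
    for _ in range(gd_steps):                            # phase 1: warm start
        g, _ = grad_hess(x, A, y)
        x = x - lr * g
    for _ in range(newton_steps):                        # phase 2: Newton refine
        g, H = grad_hess(x, A, y)
        x = x - np.linalg.solve(H, g)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((100, 5))
y = (A @ rng.standard_normal(5) > 0).astype(float)
x = two_phase(A, y)
g_final, _ = grad_hess(x, A, y)
```

The design rationale mirrors the abstract: the first-order phase is cheap per iteration but slow to reach high accuracy, while the second-order phase converges very fast once started close enough to the solution.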