Sparse permutation invariant covariance estimation
We propose a method for constructing a sparse estimator of the
inverse covariance (concentration) matrix in high-dimensional settings. The
estimator is based on a penalized normal likelihood and induces sparsity
through a lasso-type penalty. We establish a rate of convergence in the Frobenius
norm as both data dimension and sample size are allowed to grow, and
show that the rate depends explicitly on how sparse the true concentration
matrix is. We also show that a correlation-based version of the method exhibits
better rates in the operator norm. We also derive a fast iterative algorithm
for computing the estimator, which relies on the Cholesky decomposition
of the inverse yet produces a permutation-invariant estimate. The method is
compared to other estimators on simulated data and on a real data example of
tumor tissue classification using gene expression data.

Comment: Published in the Electronic Journal of Statistics
(http://www.i-journals.org/ejs/) at http://dx.doi.org/10.1214/08-EJS176
by the Institute of Mathematical Statistics (http://www.imstat.org).
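The penalized-likelihood objective described in the abstract can be sketched in a few lines of numpy. This is a minimal illustration only: it assumes the common convention of an l1 penalty on the off-diagonal entries of the concentration matrix, and it evaluates the objective rather than implementing the paper's Cholesky-based iterative algorithm. The function and parameter names are invented for illustration.

```python
import numpy as np

def penalized_neg_loglik(omega, S, lam):
    """Penalized Gaussian negative log-likelihood, up to additive constants.

    omega : candidate concentration (inverse covariance) matrix
    S     : sample covariance matrix
    lam   : lasso penalty weight (hypothetical name; the paper's exact
            penalty and tuning may differ in detail)
    """
    # log-determinant via slogdet for numerical stability
    _, logdet = np.linalg.slogdet(omega)
    # penalize only off-diagonal entries, a common convention
    off_diag = omega - np.diag(np.diag(omega))
    return np.trace(S @ omega) - logdet + lam * np.abs(off_diag).sum()
```

Minimizing this objective over positive-definite matrices trades likelihood fit against the number and size of off-diagonal (conditional-dependence) entries, which is what drives sparsity in the estimated graph.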
Brain covariance selection: better individual functional connectivity models using population prior
Spontaneous brain activity, as observed in functional neuroimaging, has been
shown to display reproducible structure that expresses brain architecture and
carries markers of brain pathologies. A central view in modern neuroscience
is that such large-scale structure of coherent activity reflects modularity
properties of brain connectivity graphs. However, to date, there has been no
demonstration that the limited and noisy data available in spontaneous activity
observations could be used to learn full-brain probabilistic models that
generalize to new data. Learning such models entails two main challenges: i)
modeling full brain connectivity is a difficult estimation problem that faces
the curse of dimensionality and ii) variability between subjects, coupled with
the variability of functional signals between experimental runs, makes the use
of multiple datasets challenging. We describe subject-level brain functional
connectivity structure as a multivariate Gaussian process and introduce a new
strategy to estimate it from group data, by imposing a common structure on the
graphical model in the population. We show that individual models learned from
functional Magnetic Resonance Imaging (fMRI) data using this population prior
generalize better to unseen data than models based on alternative
regularization schemes. To our knowledge, this is the first report of a
cross-validated model of spontaneous brain activity. Finally, we use the
estimated graphical model to explore the large-scale characteristics of
functional architecture and show for the first time that known cognitive
networks appear as the integrated communities of the functional connectivity
graph.

Comment: In Advances in Neural Information Processing Systems, Vancouver,
Canada (2010).
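One way to read the population prior described above is as a common sparsity pattern shared across subjects' graphical models. The sketch below is purely illustrative and hypothetical: thresholding averaged absolute partial correlations stands in for the paper's actual group-level estimation strategy, and `shared_support` and `threshold` are invented names.

```python
import numpy as np

def shared_support(subject_covs, threshold):
    """Hypothetical sketch: derive one common sparsity pattern from a group.

    Pools subject-level precision matrices, normalizes to (absolute)
    partial-correlation scale, and keeps edges whose pooled strength
    exceeds the threshold. Not the paper's estimator.
    """
    precisions = [np.linalg.inv(c) for c in subject_covs]
    avg = np.mean([np.abs(p) for p in precisions], axis=0)
    # rescale so entries are comparable across variables
    d = np.sqrt(np.diag(avg))
    partial = avg / np.outer(d, d)
    support = partial > threshold
    np.fill_diagonal(support, True)  # self-edges are always kept
    return support
```

A per-subject model would then be refit constrained to this shared support, so each subject keeps individual edge weights while the graph topology is regularized by the population.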
Randomized Dimension Reduction on Massive Data
Scalability of statistical estimators is of increasing importance in modern
applications and dimension reduction is often used to extract relevant
information from data. A variety of popular dimension reduction approaches can
be framed as symmetric generalized eigendecomposition problems. In this paper
we outline how taking into account the low rank structure assumption implicit
in these dimension reduction approaches provides both computational and
statistical advantages. We adapt recent randomized low-rank approximation
algorithms to provide efficient solutions to three dimension reduction methods:
Principal Component Analysis (PCA), Sliced Inverse Regression (SIR), and
Localized Sliced Inverse Regression (LSIR). A key observation in this paper is
that randomization serves a dual role, improving both computational and
statistical performance. This point is highlighted in our experiments on real
and simulated data.

Comment: 31 pages, 6 figures. Key words: dimension reduction, generalized
eigendecomposition, low-rank, supervised, inverse regression, random
projections, randomized algorithms, Krylov subspace method.
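The randomized low-rank machinery the abstract builds on can be illustrated with a standard randomized range-finder followed by an SVD of the projected matrix. This is a generic sketch of that family of algorithms, not the paper's specific adaptation to PCA, SIR, or LSIR; the function name and defaults are assumptions.

```python
import numpy as np

def randomized_svd(A, k, n_oversamples=10, n_iter=2, seed=0):
    """Approximate rank-k SVD of A via a randomized range-finder.

    A random test matrix probes the column space of A; power iterations
    sharpen the approximation when the spectrum decays slowly.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # random Gaussian test matrix, with oversampling for robustness
    Omega = rng.standard_normal((n, k + n_oversamples))
    Y = A @ Omega
    for _ in range(n_iter):
        Y = A @ (A.T @ Y)  # power iteration
    # orthonormal basis for the (approximate) range of A
    Q, _ = np.linalg.qr(Y)
    # project A onto the small subspace and do an exact SVD there
    B = Q.T @ A
    Uh, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Uh
    return U[:, :k], s[:k], Vt[:k]
```

The dominant cost is the matrix products with the thin test matrix, which is what makes these methods attractive at the scales the abstract targets; the same range-finder idea extends to the symmetric generalized eigenproblems that PCA, SIR, and LSIR reduce to.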