Learning a Factor Model via Regularized PCA
We consider the problem of learning a linear factor model. We propose a
regularized form of principal component analysis (PCA) and demonstrate through
experiments with synthetic and real data the superiority of resulting estimates
to those produced by pre-existing factor analysis approaches. We also establish
theoretical results that explain how our algorithm corrects the biases induced
by conventional approaches. An important feature of our algorithm is that its
computational requirements are similar to those of PCA, which enjoys wide use
in large part due to its efficiency.
Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling
Conditional Random Fields (CRFs) constitute a popular and efficient approach
for supervised sequence labelling. CRFs can cope with large description spaces
and can integrate some form of structural dependency between labels. In this
contribution, we address the issue of efficient feature selection for CRFs
based on imposing sparsity through an L1 penalty. We first show how sparsity of
the parameter set can be exploited to significantly speed up training and
labelling. We then introduce coordinate descent parameter update schemes for
CRFs with L1 regularization. We finally provide some empirical comparisons of
the proposed approach with state-of-the-art CRF training strategies. In
particular, we show that the proposed approach can exploit the sparsity to
speed up processing and hence potentially handle higher-dimensional models.
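As a rough illustration of how a coordinate-wise update under an L1 penalty produces exact zeros, the sketch below applies cyclic coordinate descent with soft-thresholding to a plain least-squares problem. It is not the CRF training procedure from the paper above; the function names, the objective scaling, and the toy data are all assumptions made for this example.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; returns exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    Hypothetical helper for illustration: X is a list of rows, y a list of targets.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw, since w starts at zero
    # Mean squared value of each column (the per-coordinate curvature).
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero column never receives weight
            # Correlation of column j with the residual that excludes w[j]'s own contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync so each update costs O(n).
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Toy data (made up): the target depends strongly on feature 0, weakly on feature 1.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [2.0, 0.1, -2.0, -0.1]
w = lasso_coordinate_descent(X, y, lam=0.1)
print(w)
```

On this toy problem the weak second coefficient is set exactly to zero, while the first is shrunk but kept. Coordinates that stay at zero need no residual update, which is the kind of saving that sparsity makes possible.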
Brain covariance selection: better individual functional connectivity models using population prior
Spontaneous brain activity, as observed in functional neuroimaging, has been
shown to display reproducible structure that expresses brain architecture and
carries markers of brain pathologies. An important view of modern neuroscience
is that such large-scale structure of coherent activity reflects modularity
properties of brain connectivity graphs. However, to date, there has been no
demonstration that the limited and noisy data available in spontaneous activity
observations could be used to learn full-brain probabilistic models that
generalize to new data. Learning such models entails two main challenges: i)
modeling full brain connectivity is a difficult estimation problem that faces
the curse of dimensionality and ii) variability between subjects, coupled with
the variability of functional signals between experimental runs, makes the use
of multiple datasets challenging. We describe subject-level brain functional
connectivity structure as a multivariate Gaussian process and introduce a new
strategy to estimate it from group data, by imposing a common structure on the
graphical model in the population. We show that individual models learned from
functional Magnetic Resonance Imaging (fMRI) data using this population prior
generalize better to unseen data than models based on alternative
regularization schemes. To our knowledge, this is the first report of a
cross-validated model of spontaneous brain activity. Finally, we use the
estimated graphical model to explore the large-scale characteristics of
functional architecture and show for the first time that known cognitive
networks appear as the integrated communities of the functional connectivity graph.
Comment: in Advances in Neural Information Processing Systems, Vancouver, Canada (2010)
Discussion: The Dantzig selector: Statistical estimation when $p$ is much larger than $n$
Discussion of ``The Dantzig selector: Statistical estimation when $p$ is much
larger than $n$'' [math/0506081].
Comment: Published at http://dx.doi.org/10.1214/009053607000000442 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Quasi-Likelihood and/or Robust Estimation in High Dimensions
We consider the theory for the high-dimensional generalized linear model with
the Lasso. After a short review of theoretical results in the literature, we
present an extension of the oracle results to the case of quasi-likelihood
loss. We prove bounds for the prediction error and $\ell_1$-error. The results
are derived under fourth moment conditions on the error distribution. The case
of robust loss is also given. We moreover show that under an irrepresentable
condition, the $\ell_1$-penalized quasi-likelihood estimator has no false
positives.
Comment: Published at http://dx.doi.org/10.1214/12-STS397 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)