
    Learning a Factor Model via Regularized PCA

    We consider the problem of learning a linear factor model. We propose a regularized form of principal component analysis (PCA) and demonstrate, through experiments on synthetic and real data, that the resulting estimates are superior to those produced by pre-existing factor analysis approaches. We also establish theoretical results that explain how our algorithm corrects the biases induced by conventional approaches. An important feature of our algorithm is that its computational requirements are similar to those of PCA, which enjoys wide use in large part due to its efficiency.
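    The abstract does not spell out the exact regularization scheme, but the general recipe — shrink the sample covariance before extracting principal components — can be sketched as follows. The shrinkage target and the weight `alpha` are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def regularized_pca(X, k, alpha=0.1):
    """Factor loadings from PCA on a shrunk covariance estimate.

    Generic sketch only: the sample covariance is shrunk toward its
    diagonal before eigendecomposition. The weight `alpha` and the
    diagonal target are illustrative, not the paper's scheme.
    """
    X = X - X.mean(axis=0)                       # center the data
    S = np.cov(X, rowvar=False)                  # sample covariance
    S_reg = (1 - alpha) * S + alpha * np.diag(np.diag(S))
    eigvals, eigvecs = np.linalg.eigh(S_reg)     # ascending order
    top = np.argsort(eigvals)[::-1][:k]          # indices of top-k eigenvalues
    # Scale eigenvectors by sqrt(eigenvalue) to get factor loadings.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Toy usage: 200 observations of 10 variables driven by 3 latent factors.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 10)) \
    + 0.5 * rng.standard_normal((200, 10))
print(regularized_pca(X, k=3).shape)  # (10, 3)
```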

    Efficient Learning of Sparse Conditional Random Fields for Supervised Sequence Labelling

    Conditional Random Fields (CRFs) constitute a popular and efficient approach to supervised sequence labelling. CRFs can cope with large description spaces and can integrate some form of structural dependency between labels. In this contribution, we address the issue of efficient feature selection for CRFs, based on imposing sparsity through an L1 penalty. We first show how sparsity of the parameter set can be exploited to significantly speed up training and labelling. We then introduce coordinate descent parameter update schemes for CRFs with L1 regularization. We finally provide empirical comparisons of the proposed approach with state-of-the-art CRF training strategies. In particular, we show that the proposed approach is able to exploit sparsity to speed up processing and hence potentially handle higher-dimensional models.
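    The coordinate-descent trainer proposed here is not what standard toolkits run, but the effect of the L1 penalty is easy to reproduce: in `sklearn-crfsuite`, the `c1` parameter weights an L1 term and drives most feature weights exactly to zero. A minimal sketch on toy tagging data (the data and feature names are invented for illustration):

```python
import sklearn_crfsuite

# Toy tagging data: each token is a feature dict, each sentence a list.
X_train = [
    [{"w": "the"}, {"w": "cat"}, {"w": "sat"}],
    [{"w": "a"},   {"w": "dog"}, {"w": "ran"}],
]
y_train = [["DET", "NOUN", "VERB"],
           ["DET", "NOUN", "VERB"]]

# c1 weights the L1 term; a nonzero value drives many feature weights
# exactly to zero. Note crfsuite's own optimizer is used here, not the
# coordinate-descent updates proposed in the paper.
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=1.0, c2=0.0,
                           max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict([[{"w": "the"}, {"w": "dog"}, {"w": "sat"}]]))
```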

    Brain covariance selection: better individual functional connectivity models using population prior

    Spontaneous brain activity, as observed in functional neuroimaging, has been shown to display reproducible structure that expresses brain architecture and carries markers of brain pathologies. An important view in modern neuroscience is that such large-scale structure of coherent activity reflects modularity properties of brain connectivity graphs. However, to date, there has been no demonstration that the limited and noisy data available in spontaneous activity observations could be used to learn full-brain probabilistic models that generalize to new data. Learning such models entails two main challenges: i) modeling full-brain connectivity is a difficult estimation problem that faces the curse of dimensionality, and ii) variability between subjects, coupled with the variability of functional signals between experimental runs, makes the use of multiple datasets challenging. We describe subject-level brain functional connectivity structure as a multivariate Gaussian process and introduce a new strategy to estimate it from group data, by imposing a common structure on the graphical model in the population. We show that individual models learned from functional Magnetic Resonance Imaging (fMRI) data using this population prior generalize better to unseen data than models based on alternative regularization schemes. To our knowledge, this is the first report of a cross-validated model of spontaneous brain activity. Finally, we use the estimated graphical model to explore the large-scale characteristics of functional architecture and show for the first time that known cognitive networks appear as the integrated communities of the functional connectivity graph. (Comment: in Advances in Neural Information Processing Systems, Vancouver, Canada, 2010.)
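    The single-subject building block here is sparse inverse covariance estimation (the graphical lasso); the paper's contribution is to tie the sparsity pattern together across subjects. A minimal per-subject baseline with scikit-learn, on synthetic data standing in for fMRI time series (the population prior itself is not applied):

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

# Synthetic stand-in for per-subject fMRI data:
# four subjects, each with 120 time points over 10 brain regions.
rng = np.random.default_rng(0)
subjects = [rng.standard_normal((120, 10)) for _ in range(4)]

# Graphical lasso per subject: a sparse precision (inverse covariance)
# matrix whose nonzero off-diagonal entries are the edges of the
# functional connectivity graph. Each subject is fit independently;
# the paper's population prior would additionally share the sparsity
# pattern across subjects.
precisions = [GraphicalLassoCV().fit(ts).precision_ for ts in subjects]

P = precisions[0].copy()
np.fill_diagonal(P, 0.0)
print("edges in subject 0:", int((np.abs(P) > 1e-8).sum() // 2))
```

    For the shared-support estimator itself, nilearn ships a group-sparse covariance estimator, `GroupSparseCovarianceCV`, along these lines.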

    Discussion: The Dantzig selector: Statistical estimation when p is much larger than n

    Discussion of "The Dantzig selector: Statistical estimation when p is much larger than n" [math/0506081]. (Comment: published at http://dx.doi.org/10.1214/009053607000000442 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).)
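    For reference, the estimator under discussion solves min ||b||_1 subject to ||X^T(y - Xb)||_inf <= lambda, which is a linear program. A sketch via scipy's `linprog`; the lambda scaling below is a common choice for unnormalized columns, assumed here rather than taken from the discussion:

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    """Solve min ||b||_1  s.t.  ||X.T @ (y - X @ b)||_inf <= lam
    as a linear program, writing b = u - v with u, v >= 0."""
    n, p = X.shape
    G = X.T @ X
    Xty = X.T @ y
    c = np.ones(2 * p)                 # objective: sum(u) + sum(v) = ||b||_1
    # Two-sided constraint |X.T(y - Xb)| <= lam, split into <= rows.
    A_ub = np.vstack([np.hstack([-G, G]), np.hstack([G, -G])])
    b_ub = np.concatenate([lam - Xty, lam + Xty])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * p), method="highs")
    return res.x[:p] - res.x[p:]

# Toy usage: sparse truth, mild noise.
rng = np.random.default_rng(1)
n, p, sigma = 50, 20, 0.1
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta + sigma * rng.standard_normal(n)
lam = sigma * np.sqrt(2 * n * np.log(p))   # assumed scaling, not from the text
print(np.round(dantzig_selector(X, y, lam), 2))
```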

    Quasi-Likelihood and/or Robust Estimation in High Dimensions

    We consider the theory for the high-dimensional generalized linear model with the Lasso. After a short review of theoretical results in the literature, we present an extension of the oracle results to the case of quasi-likelihood loss. We prove bounds for the prediction error and the ℓ1-error. The results are derived under fourth-moment conditions on the error distribution. The case of robust loss is also treated. We moreover show that under an irrepresentable condition, the ℓ1-penalized quasi-likelihood estimator has no false positives. (Comment: published at http://dx.doi.org/10.1214/12-STS397 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).)
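    Quasi-likelihood and robust losses are not in standard toolkits, but the logistic GLM is a familiar special case of the ℓ1-penalized estimators analyzed here, and the "no false positives" claim is easy to probe empirically. A sketch with scikit-learn; the data and penalty strength are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# l1-penalized logistic regression: a standard GLM special case of the
# penalized estimators analyzed in the paper (quasi-likelihood and
# robust losses are not available in scikit-learn).
rng = np.random.default_rng(2)
n, p = 400, 30
beta = np.zeros(p)
beta[:4] = [1.5, -2.0, 1.0, 2.5]           # sparse ground truth
X = rng.standard_normal((n, p))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta))))

# C is the inverse penalty strength: smaller C means a stronger l1 penalty.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)

# Under the paper's irrepresentable condition, the selected support
# should contain no false positives; here we simply inspect it.
print("selected variables:", np.flatnonzero(model.coef_.ravel()))
```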