The Graphical Lasso: New Insights and Alternatives
The graphical lasso \citep{FHT2007a} is an algorithm for learning the
structure in an undirected Gaussian graphical model, using $\ell_1$
regularization to control the number of zeros in the precision matrix
$\mathbf{\Theta}=\mathbf{\Sigma}^{-1}$ \citep{BGA2008,yuan_lin_07}. The
\texttt{R} package \texttt{glasso} \citep{FHT2007a} is popular, fast, and
allows one to efficiently build a path of models for different values of the
tuning parameter. Convergence of \texttt{glasso} can be tricky; the converged
precision matrix might not be the inverse of the estimated covariance, and
occasionally it fails to converge with warm starts. In this paper we explain
this behavior and propose new algorithms that appear to outperform
\texttt{glasso}.
By studying the "normal equations" we see that, \GL\ is solving the {\em
dual} of the graphical lasso penalized likelihood, by block coordinate ascent;
a result which can also be found in \cite{BGA2008}.
In this dual, the target of estimation is \B\Sigma, the covariance matrix,
rather than the precision matrix \B\Theta. We propose similar primal
algorithms \PGL\ and \DPGL, that also operate by block-coordinate descent,
where \B\Theta is the optimization target. We study all of these algorithms,
and in particular different approaches to solving their coordinate
sub-problems. We conclude that \DPGL\ is superior from several points of view.Comment: This is a revised version of our previous manuscript with the same
name ArXiv id: http://arxiv.org/abs/1111.547
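
The convergence caveat above is easy to probe numerically. Below is a minimal
sketch using scikit-learn's GraphicalLasso, a different implementation from
the \texttt{R} \texttt{glasso} package that stands in for it here purely as an
illustration: fit the $\ell_1$-penalized model, then check how far the
returned precision matrix is from the inverse of the returned covariance
estimate.

import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))   # toy data with no real graph structure

model = GraphicalLasso(alpha=0.1).fit(X)
Theta, Sigma = model.precision_, model.covariance_

# At an exact solution Theta and Sigma are mutual inverses; a large gap
# here is the kind of convergence artifact the abstract describes.
gap = np.abs(Theta @ Sigma - np.eye(10)).max()
print(f"max |Theta @ Sigma - I| = {gap:.2e}")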
Local case-control sampling: Efficient subsampling in imbalanced data sets
For classification problems with significant class imbalance, subsampling can
reduce computational costs at the price of inflated variance in estimating
model parameters. We propose a method for subsampling efficiently for logistic
regression by adjusting the class balance locally in feature space via an
accept-reject scheme. Our method generalizes standard case-control sampling,
using a pilot estimate to preferentially select examples whose responses are
conditionally rare given their features. The biased subsampling is corrected by
a post-hoc analytic adjustment to the parameters. The method is simple and
requires one parallelizable scan over the full data set. Standard case-control
sampling is inconsistent under model misspecification for the population
risk-minimizing coefficients $\theta^*$. By contrast, our estimator is
consistent for $\theta^*$ provided that the pilot estimate is. Moreover, under
correct specification and with a consistent, independent pilot estimate, our
estimator has exactly twice the asymptotic variance of the full-sample MLE,
even if the selected subsample comprises a minuscule fraction of the full data
set, as happens when the original data are severely imbalanced. The factor of
two improves to $1 + 1/c$ if we multiply the baseline acceptance probabilities
by some $c > 1$ (and weight points with acceptance probability greater than
1), taking roughly $1 + c$ times as many data points into the
subsample. Experiments on simulated and real data show that our method can
substantially outperform standard case-control subsampling.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1220 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
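
The accept-reject scheme and the post-hoc adjustment are short enough to
sketch. Assuming a linear-logistic pilot (helper names below such as p_pilot
are ours, not the paper's), accepted points follow a logistic model whose
linear predictor is the true one minus the pilot's, so adding the pilot's
coefficients back to the subsample fit corrects the bias:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, d = 200_000, 5
X = rng.standard_normal((n, d))
theta = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
p = 1.0 / (1.0 + np.exp(-(X @ theta - 6.0)))   # intercept -6: severe imbalance
y = rng.binomial(1, p)

# Pilot estimate; for illustration we fit it on a random 5% of the data.
idx = rng.choice(n, n // 20, replace=False)
pilot = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
f_pilot = X @ pilot.coef_.ravel() + pilot.intercept_[0]
p_pilot = 1.0 / (1.0 + np.exp(-f_pilot))

# One scan: accept each point with probability |y - p_pilot(x)|, so examples
# whose responses are conditionally rare given their features are preferred.
accept = rng.uniform(size=n) < np.abs(y - p_pilot)
print("subsample fraction:", accept.mean())

# On accepted points, y | x is logistic with linear predictor f(x) - f_pilot(x),
# so the post-hoc analytic adjustment adds the pilot parameters back.
sub = LogisticRegression(C=1e6, max_iter=1000).fit(X[accept], y[accept])
theta_hat = sub.coef_.ravel() + pilot.coef_.ravel()
b_hat = sub.intercept_[0] + pilot.intercept_[0]
print("corrected coefficients:", np.round(theta_hat, 2), "intercept:", round(b_hat, 2))

With severe imbalance the accepted fraction is tiny, yet, per the theory
above, the corrected estimate pays only a bounded variance penalty relative to
the full-sample MLE.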
Sparse inverse covariance estimation with the lasso
We consider the problem of estimating sparse graphs by a lasso penalty
applied to the inverse covariance matrix. Using a coordinate descent procedure
for the lasso, we develop a simple algorithm that is remarkably fast: in the
worst cases, it solves a 1000 node problem (~500,000 parameters) in about a
minute, and is 50 to 2000 times faster than competing methods. It also provides
a conceptual link between the exact problem and the approximation suggested by
Meinshausen and Bühlmann (2006). We illustrate the method on some cell-signaling
data from proteomics.
Comment: Submitted
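
The coordinate descent procedure the abstract leans on is a soft-thresholding
update applied cyclically. Here is a minimal sketch of that inner lasso
solver, in the quadratic form the graphical lasso feeds it one row/column of
the covariance at a time (function names are ours):

import numpy as np

def soft_threshold(z, t):
    # S(z, t) = sign(z) * max(|z| - t, 0)
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(A, b, lam, n_sweeps=100):
    # Minimize 0.5 * beta' A beta - b' beta + lam * ||beta||_1 by cyclic
    # coordinate descent; A must be positive definite. In the graphical
    # lasso, A is the current W_11 block and b the off-diagonal s_12.
    beta = np.zeros_like(b)
    for _ in range(n_sweeps):
        for j in range(len(b)):
            r = b[j] - A[j] @ beta + A[j, j] * beta[j]   # partial residual
            beta[j] = soft_threshold(r, lam) / A[j, j]
    return beta

# Tiny demo on a random positive definite system.
rng = np.random.default_rng(2)
M = rng.standard_normal((20, 3))
A = M.T @ M / 20 + 0.1 * np.eye(3)
print(lasso_cd(A, rng.standard_normal(3), lam=0.1))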