The Graphical Lasso: New Insights and Alternatives
The graphical lasso \citep{FHT2007a} is an algorithm for learning the
structure in an undirected Gaussian graphical model, using $\ell_1$
regularization to control the number of zeros in the precision matrix
$\mathbf{\Theta} = \mathbf{\Sigma}^{-1}$ \citep{BGA2008,yuan_lin_07}. The \texttt{R}
package \texttt{glasso} \citep{FHT2007a} is popular, fast, and allows one to efficiently
build a path of models for different values of the tuning parameter.
Convergence of \texttt{glasso} can be tricky; the converged precision matrix might not be
the inverse of the estimated covariance, and occasionally it fails to converge
with warm starts. In this paper we explain this behavior and propose new
algorithms that appear to outperform \texttt{glasso}.
By studying the "normal equations" we see that \texttt{glasso} is solving the
\emph{dual} of the graphical lasso penalized likelihood, by block coordinate ascent;
a result which can also be found in \cite{BGA2008}.
In this dual, the target of estimation is $\mathbf{\Sigma}$, the covariance matrix,
rather than the precision matrix $\mathbf{\Theta}$. We propose similar primal
algorithms \texttt{p-glasso} and \texttt{dp-glasso} that also operate by block-coordinate
descent, where $\mathbf{\Theta}$ is the optimization target. We study all of these algorithms,
and in particular different approaches to solving their coordinate
sub-problems. We conclude that \texttt{dp-glasso} is superior from several points of view.
Comment: This is a revised version of our previous manuscript with the same
name. arXiv id: http://arxiv.org/abs/1111.547
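For reference, the primal criterion being discussed is the $\ell_1$-penalized
Gaussian log-likelihood (a standard statement in our notation, with $\mathbf{S}$
the sample covariance and $\lambda \ge 0$ the tuning parameter):
\[
\underset{\mathbf{\Theta} \succ 0}{\text{maximize}} \;\;
\log\det\mathbf{\Theta} \;-\; \mathrm{tr}(\mathbf{S}\mathbf{\Theta})
\;-\; \lambda\,\|\mathbf{\Theta}\|_1 .
\]
\texttt{glasso} performs block coordinate ascent on the dual of this problem,
sweeping one row/column of $\mathbf{\Sigma}$ at a time, whereas \texttt{dp-glasso}
performs the analogous block-coordinate sweeps directly on $\mathbf{\Theta}$.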
Least quantile regression via modern optimization
We address the Least Quantile of Squares (LQS) (and in particular the Least
Median of Squares) regression problem using modern optimization methods. We
propose a Mixed Integer Optimization (MIO) formulation of the LQS problem which
allows us to find a provably global optimal solution for the LQS problem. Our
MIO framework has the appealing characteristic that if we terminate the
algorithm early, we obtain a solution with a guarantee on its sub-optimality.
We also propose continuous optimization methods based on first-order
subdifferential methods, sequential linear optimization and hybrid combinations
of them to obtain near optimal solutions to the LQS problem. The MIO algorithm
is found to benefit significantly from high quality solutions delivered by our
continuous optimization based methods. We further show that the MIO approach
leads to (a) an optimal solution for any dataset, where the data-points
are not necessarily in general position, (b) a simple
proof of the breakdown point of the LQS objective value that holds for any
dataset and (c) an extension to situations where there are polyhedral
constraints on the regression coefficient vector. We report computational
results with both synthetic and real-world datasets showing that the MIO
algorithm with warm starts from the continuous optimization methods solves
small- and medium-sized problems to provable optimality in under
two hours, and outperforms all publicly available methods for large-scale
($n = 10{,}000$) LQS problems.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1223 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
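As a reading aid (our notation, not quoted from the paper): writing the
residuals as $r_i(\boldsymbol\beta) = y_i - \mathbf{x}_i^T\boldsymbol\beta$ and
ordering their absolute values $|r_{(1)}| \le \cdots \le |r_{(n)}|$, the LQS
estimator is
\[
\hat{\boldsymbol\beta} \in \arg\min_{\boldsymbol\beta} \; \big|r_{(q)}(\boldsymbol\beta)\big| ,
\]
with the Least Median of Squares as the special case $q \approx n/2$. Encoding
which residual is the $q$-th order statistic with binary variables is what turns
this nonconvex problem into an MIO with certifiable sub-optimality gaps.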
The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
We propose a novel high-dimensional linear regression estimator: the Discrete
Dantzig Selector, which minimizes the number of nonzero regression coefficients
subject to a budget on the maximal absolute correlation between the features
and residuals. Motivated by the significant advances in integer optimization
over the past 10-15 years, we present a Mixed Integer Linear Optimization
(MILO) approach to obtain certifiably optimal global solutions to this
nonconvex optimization problem. The current state of algorithmics in integer
optimization makes our proposal substantially more computationally attractive
than the least squares subset selection framework based on integer quadratic
optimization, recently proposed in [8], and the continuous nonconvex quadratic
optimization framework of [33]. We propose new discrete first-order methods,
which when paired with state-of-the-art MILO solvers, lead to good solutions
for the Discrete Dantzig Selector problem for a given computational budget. We
illustrate that our integrated approach provides globally optimal solutions in
significantly shorter computation times, when compared to off-the-shelf MILO
solvers. We demonstrate both theoretically and empirically that in a wide range
of regimes the statistical properties of the Discrete Dantzig Selector are
superior to those of popular $\ell_1$-based approaches. We illustrate that
our approach can handle problem instances with $p = 10{,}000$ features with
certifiable optimality, making it a highly scalable combinatorial variable
selection approach in sparse linear modeling.
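In symbols, the estimator described above is (our paraphrase, with $\delta \ge 0$
the correlation budget):
\[
\min_{\boldsymbol\beta} \; \|\boldsymbol\beta\|_0
\quad \text{subject to} \quad
\|\mathbf{X}^T(\mathbf{y} - \mathbf{X}\boldsymbol\beta)\|_\infty \le \delta ,
\]
i.e., the $\ell_1$ objective of the classical Dantzig Selector is replaced by the
$\ell_0$ pseudo-norm; since the constraint is otherwise linear in
$\boldsymbol\beta$, the problem admits a MILO representation.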
Projected likelihood contrasts for testing homogeneity in finite mixture models with nuisance parameters
This paper develops a test for homogeneity in finite mixture models where the
mixing proportions are known a priori (taken to be 0.5) and a common nuisance
parameter is present. Statistical tests based on the notion of Projected
Likelihood Contrasts (PLC) are considered. The PLC is a slight modification of
the usual likelihood ratio statistic, or Wilks' $\Lambda$, and is similar in
spirit to Rao's score test. Theoretical investigations have been carried
out to understand the large sample statistical properties of these tests.
Simulation studies have been carried out to understand the behavior of the null
distribution of the PLC statistic in the case of Gaussian mixtures with unknown
means (common variance as nuisance parameter) and unknown variances (common
mean as nuisance parameter). The results are in conformity with the theoretical
results obtained. Power functions of these tests have been evaluated based on
simulations from Gaussian mixtures.
Comment: Published at http://dx.doi.org/10.1214/193940307000000194 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
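To fix ideas (our illustration of the setting, for the unknown-means case): with
mixing proportions fixed at $0.5$ and a common variance $\sigma^2$ as the
nuisance parameter, the homogeneity hypothesis is
\[
H_0: \mu_1 = \mu_2 \quad \text{in} \quad
f(x) = \tfrac{1}{2}\,\phi(x;\mu_1,\sigma^2) + \tfrac{1}{2}\,\phi(x;\mu_2,\sigma^2),
\]
where $\phi(\cdot\,;\mu,\sigma^2)$ denotes the normal density; the
unknown-variances case swaps the roles of the means and the variances.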
Best Subset Selection via a Modern Optimization Lens
In the last twenty-five years (1990-2014), algorithmic advances in integer
optimization combined with hardware improvements have resulted in an
astonishing 200 billion factor speedup in solving Mixed Integer Optimization
(MIO) problems. We present a MIO approach for solving the classical best subset
selection problem of choosing $k$ out of $p$ features in linear regression
given $n$ observations. We develop a discrete extension of modern first order
continuous optimization methods to find high quality feasible solutions that we
use as warm starts to a MIO solver that finds provably optimal solutions. The
resulting algorithm (a) provides a solution with a guarantee on its
suboptimality even if we terminate the algorithm early, (b) can accommodate
side constraints on the coefficients of the linear regression and (c) extends
to finding best subset solutions for the least absolute deviation loss
function. Using a wide variety of synthetic and real datasets, we demonstrate
that our approach solves problems with $n$ in the 1000s and $p$ in the 100s in
minutes to provable optimality, and finds near optimal solutions for $n$ in the
100s and $p$ in the 1000s in minutes. We also establish via numerical
experiments that the MIO approach performs better than \texttt{Lasso} and
other popularly used sparse learning procedures, in terms of achieving sparse
solutions with good predictive power.
Comment: This is a revised version (May, 2015) of the first submission in June
201
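A minimal sketch of the MIO formulation behind these results (the big-$M$
encoding is one standard choice; the constant $M$ and this exact presentation
are our assumptions, not quoted from the paper):
\[
\min_{\boldsymbol\beta,\,\mathbf{z}} \;
\tfrac{1}{2}\|\mathbf{y} - \mathbf{X}\boldsymbol\beta\|_2^2
\quad \text{s.t.} \quad
-M z_j \le \beta_j \le M z_j, \quad z_j \in \{0,1\}, \quad \sum_{j=1}^{p} z_j \le k,
\]
so $z_j = 0$ forces $\beta_j = 0$ and at most $k$ coefficients can be nonzero.
The discrete first-order method supplies a good feasible $\boldsymbol\beta$ as a
warm start, and the MIO solver then certifies or improves it.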