
    On the Power of Preconditioning in Sparse Linear Regression

    Sparse linear regression is a fundamental problem in high-dimensional statistics, but strikingly little is known about how to efficiently solve it without restrictive conditions on the design matrix. We consider the (correlated) random design setting, where the covariates are independently drawn from a multivariate Gaussian $N(0,\Sigma)$ with $\Sigma : n \times n$, and seek estimators $\hat{w}$ minimizing $(\hat{w}-w^*)^T\Sigma(\hat{w}-w^*)$, where $w^*$ is the $k$-sparse ground truth. Information-theoretically, one can achieve strong error bounds with $O(k \log n)$ samples for arbitrary $\Sigma$ and $w^*$; however, no efficient algorithms are known to match these guarantees even with $o(n)$ samples, without further assumptions on $\Sigma$ or $w^*$. As for hardness, computational lower bounds are only known for worst-case design matrices. Random-design instances are known which are hard for the Lasso, but these instances can generally be solved by the Lasso after a simple change of basis (i.e., preconditioning). In this work, we give upper and lower bounds clarifying the power of preconditioning in sparse linear regression. First, we show that the preconditioned Lasso can solve a large class of sparse linear regression problems nearly optimally: it succeeds whenever the dependency structure of the covariates, in the sense of the Markov property, has low treewidth -- even if $\Sigma$ is highly ill-conditioned. Second, we construct (for the first time) random-design instances which are provably hard for an optimally preconditioned Lasso. In fact, we complete our treewidth classification by proving that for any treewidth-$t$ graph, there exists a Gaussian Markov Random Field on this graph such that the preconditioned Lasso, with any choice of preconditioner, requires $\Omega(t^{1/20})$ samples to recover $O(\log n)$-sparse signals when covariates are drawn from this model.
    Comment: 73 pages, 5 figures
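    The preconditioning idea described above is easy to sketch: apply a change of basis $S$ to the covariates, run the Lasso in the new basis, and map the coefficients back. The following minimal sketch uses NumPy and scikit-learn with an illustrative whitening preconditioner $S = \Sigma^{-1/2}$ and arbitrary problem sizes and regularization; it is not the paper's construction, which studies when some preconditioner preserves sparse recovery.

    ```python
    import numpy as np
    from scipy.linalg import sqrtm
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n, m, k = 50, 200, 5          # dimension, samples, sparsity (illustrative sizes)

    # Correlated Gaussian design: rows drawn i.i.d. from N(0, Sigma).
    A = rng.standard_normal((n, n))
    Sigma = A @ A.T / n + 0.1 * np.eye(n)
    X = rng.multivariate_normal(np.zeros(n), Sigma, size=m)

    # k-sparse ground truth and noisy responses.
    w_star = np.zeros(n)
    w_star[rng.choice(n, k, replace=False)] = 1.0
    y = X @ w_star + 0.1 * rng.standard_normal(m)

    # Preconditioned Lasso: change basis with S, run Lasso on X @ S, map back.
    # S = Sigma^{-1/2} is only one illustrative choice of preconditioner.
    S = np.real(sqrtm(np.linalg.inv(Sigma)))
    lasso = Lasso(alpha=0.05).fit(X @ S, y)    # Lasso in the preconditioned basis
    w_hat = S @ lasso.coef_                    # coefficients in the original basis

    # Error in the paper's metric: (w_hat - w*)^T Sigma (w_hat - w*).
    err = (w_hat - w_star) @ Sigma @ (w_hat - w_star)
    print(f"preconditioned-Lasso error: {err:.4f}")
    ```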

    Learning with Structured Sparsity: From Discrete to Convex and Back.

    In modern data-analysis applications, the abundance of data makes extracting meaningful information from it challenging, in terms of computation, storage, and interpretability. In this setting, exploiting sparsity in data has been essential to the development of scalable methods for problems in machine learning, statistics, and signal processing. However, in various applications, the input variables exhibit structure beyond simple sparsity. This motivated the introduction of structured sparsity models, which capture such sophisticated structures, leading to significant performance gains and better interpretability. Structured sparse approaches have been successfully applied in a variety of domains including computer vision, text processing, medical imaging, and bioinformatics. The goal of this thesis is to improve on these methods and expand their success to a wider range of applications. We thus develop novel methods to incorporate general structure a priori in learning problems, balancing computational and statistical efficiency trade-offs. To achieve this, our results bring together tools from the rich areas of discrete and convex optimization. Applying structured sparsity approaches in general is challenging because structures encountered in practice are naturally combinatorial. An effective approach to circumvent this computational challenge is to employ continuous convex relaxations. We thus start by introducing a new class of structured sparsity models, able to capture a large range of structures, which admit tight convex relaxations amenable to efficient optimization. We then present an in-depth study of the geometric and statistical properties of convex relaxations of general combinatorial structures. In particular, we characterize which structure is lost by imposing convexity and which is preserved. We then focus on the optimization of the convex composite problems that result from the convex relaxations of structured sparsity models. We develop efficient algorithmic tools to solve these problems in a non-Euclidean setting, leading to faster convergence in some cases. Finally, to handle structures that do not admit meaningful convex relaxations, we propose to use, as a heuristic, a non-convex proximal gradient method that is efficient for several classes of structured sparsity models. We further extend this method to address a probabilistic structured sparsity model, which we introduce to model approximately sparse signals.
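    The non-convex proximal gradient heuristic mentioned above can be illustrated with the simplest structure, plain $k$-sparsity, where the proximal map of the combinatorial constraint is hard thresholding. The sketch below (iterative hard thresholding for least squares) is only a minimal instance of that pattern; the problem sizes, step size rule, and the top-$k$ structure are illustrative assumptions, not the thesis's models.

    ```python
    import numpy as np

    def hard_threshold(v, k):
        """Proximal map of the k-sparsity constraint: keep the k largest-magnitude entries."""
        out = np.zeros_like(v)
        idx = np.argpartition(np.abs(v), -k)[-k:]
        out[idx] = v[idx]
        return out

    def iht(X, y, k, step=None, iters=200):
        """Non-convex proximal gradient loop (iterative hard thresholding) for least squares."""
        m, n = X.shape
        if step is None:
            # Inverse Lipschitz constant of the gradient of 0.5 * ||Xw - y||^2.
            step = 1.0 / np.linalg.norm(X, 2) ** 2
        w = np.zeros(n)
        for _ in range(iters):
            grad = X.T @ (X @ w - y)                 # gradient step on the smooth loss
            w = hard_threshold(w - step * grad, k)   # combinatorial proximal step
        return w

    # Toy usage with a random design and a 5-sparse signal.
    rng = np.random.default_rng(0)
    m, n, k = 100, 300, 5
    X = rng.standard_normal((m, n)) / np.sqrt(m)
    w_star = np.zeros(n)
    w_star[rng.choice(n, k, replace=False)] = 1.0
    y = X @ w_star + 0.01 * rng.standard_normal(m)
    w_hat = iht(X, y, k)
    print("support recovered:", set(np.flatnonzero(w_hat)) == set(np.flatnonzero(w_star)))
    ```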