We introduce a new estimator for the vector of coefficients $\beta$ in the
linear model $y = X\beta + z$, where $X$ has dimensions $n \times p$ with $p$
possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation,
is the solution to
$$\min_{b \in \mathbb{R}^p} \; \frac{1}{2}\|y - Xb\|_{\ell_2}^2 + \lambda_1 |b|_{(1)} + \lambda_2 |b|_{(2)} + \cdots + \lambda_p |b|_{(p)},$$
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ and
$|b|_{(1)} \ge |b|_{(2)} \ge \cdots \ge |b|_{(p)}$ are the
decreasing absolute values of the entries of $b$. This is a convex program and
we demonstrate a solution algorithm whose computational complexity is roughly
comparable to that of classical $\ell_1$ procedures such as the Lasso.
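The abstract states the solver only at this high level; one standard way to realize such an algorithm is proximal gradient descent, whose key ingredient is the proximal operator of the sorted $\ell_1$ norm. Below is a minimal Python sketch of that operator via a stack-based pool-adjacent-violators pass; the function name and all details are illustrative, not taken from the authors' code.

    import numpy as np

    def prox_sorted_l1(y, lam):
        """Prox of the sorted-l1 norm:
        argmin_x 0.5*||x - y||^2 + sum_i lam[i] * |x|_(i).
        Assumes lam is nonnegative and nonincreasing."""
        y = np.asarray(y, dtype=float)
        sign = np.sign(y)
        order = np.argsort(-np.abs(y))      # sort |y| in decreasing order
        v = np.abs(y)[order] - lam          # differences in the sorted domain
        # Each stack entry is [block_length, block_average].
        blocks = []
        for vi in v:
            blocks.append([1, vi])
            # Merge while block averages violate the nonincreasing constraint.
            while len(blocks) > 1 and blocks[-2][1] <= blocks[-1][1]:
                n2, a2 = blocks.pop()
                n1, a1 = blocks.pop()
                blocks.append([n1 + n2, (n1 * a1 + n2 * a2) / (n1 + n2)])
        # Expand blocks, clip at zero, undo the sort, restore signs.
        x_sorted = np.concatenate([np.full(n, max(a, 0.0)) for n, a in blocks])
        x = np.empty_like(x_sorted)
        x[order] = x_sorted
        return sign * x

As a sanity check, when all $\lambda_i$ equal a common value $t$, the sorted $\ell_1$ norm reduces to $t\|b\|_{\ell_1}$, no blocks need merging, and the sketch reduces to ordinary soft-thresholding, consistent with the Lasso comparison above.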
Here, the regularizer is a sorted $\ell_1$ norm, which penalizes the regression
coefficients according to their rank: the higher the rank (that is, the stronger
the signal), the larger the penalty. This is similar to the Benjamini and
Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which
compares more significant p-values with more stringent thresholds.
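For concreteness, the BH step-up rule that SLOPE mirrors can be written in a few lines; this is a textbook implementation, not code from the paper. The $i$-th smallest of $m$ p-values is compared against the threshold $i \cdot q/m$, and the largest $i$ passing its threshold determines the rejections.

    import numpy as np

    def benjamini_hochberg(pvals, q):
        """BH step-up: reject the k smallest p-values, where k is the
        largest i with p_(i) <= i*q/m. Returns a boolean rejection mask."""
        p = np.asarray(pvals, dtype=float)
        m = p.size
        order = np.argsort(p)
        thresholds = q * np.arange(1, m + 1) / m
        below = p[order] <= thresholds
        k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
        reject = np.zeros(m, dtype=bool)
        reject[order[:k]] = True
        return reject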
One notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical
values $\lambda_{\mathrm{BH}}(i) = z(1 - i \cdot q/(2p))$, where $q \in (0,1)$ and
$z(\alpha)$ is the $\alpha$-th quantile of the standard normal distribution.
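With $z(\alpha)$ the $\alpha$-th standard normal quantile, this sequence is a one-liner; the sketch below assumes scipy's norm.ppf for $z(\cdot)$.

    import numpy as np
    from scipy.stats import norm

    def lambda_bh(p, q):
        """BH critical values lambda_BH(i) = z(1 - i*q/(2p)), i = 1..p,
        with z(alpha) the alpha-th standard normal quantile."""
        i = np.arange(1, p + 1)
        return norm.ppf(1 - i * q / (2 * p))

Note that $1 - i \cdot q/(2p)$ decreases in $i$, so the resulting sequence is nonincreasing, as the definition of SLOPE requires.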
SLOPE aims to provide finite-sample guarantees on the selected model; of special
interest is the false discovery rate (FDR), defined as the expected proportion of
irrelevant regressors among all selected predictors. Under orthogonal designs,
SLOPE with $\lambda_{\mathrm{BH}}$ provably controls the FDR at level $q$.
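That claim can be probed with a small Monte Carlo sketch, reusing prox_sorted_l1 and lambda_bh from the sketches above; the identity design is the simplest orthogonal $X$, and all problem sizes and signal strengths are illustrative choices, not the paper's settings. When $X^\top X = I$, the quadratic term equals $\frac{1}{2}\|b - X^\top y\|^2$ up to a constant, so minimizing the SLOPE objective reduces to applying the prox to $X^\top y$.

    import numpy as np

    rng = np.random.default_rng(0)
    p, q, k, reps = 1000, 0.1, 50, 200       # illustrative sizes: k true signals
    lam = lambda_bh(p, q)                    # from the sketch above

    fdp = []
    for _ in range(reps):
        beta = np.zeros(p)
        beta[:k] = 3 * np.sqrt(2 * np.log(p))  # strong signals (arbitrary choice)
        y = beta + rng.standard_normal(p)      # X = I, so X'y = y
        beta_hat = prox_sorted_l1(y, lam)      # SLOPE solution under X'X = I
        selected = beta_hat != 0
        false_sel = np.count_nonzero(selected[k:])          # nulls selected
        fdp.append(false_sel / max(np.count_nonzero(selected), 1))

    print(f"estimated FDR ~ {np.mean(fdp):.3f} (target q = {q})")

Averaging the false discovery proportion over replicates estimates the FDR defined above.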
Moreover, it also appears to have appreciable inferential properties under more
general designs $X$ while retaining substantial power, as demonstrated in a series
of experiments on both simulated and real data.

Comment: Published at http://dx.doi.org/10.1214/15-AOAS842 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).