
    Clusterpath: An Algorithm for Clustering using Convex Fusion Penalties

    We present a new clustering algorithm by proposing a convex relaxation of hierarchical clustering, which results in a family of objective functions with a natural geometric interpretation. We give efficient algorithms for calculating the continuous regularization path of solutions, and discuss the relative advantages of the parameters. Our method experimentally gives state-of-the-art results similar to spectral clustering for non-convex clusters, and has the added benefit of learning a tree structure from the data.
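
    As a concrete illustration of the fusion-penalty idea, the sketch below minimizes the clusterpath objective $\frac{1}{2}\Vert U - X\Vert_F^2 + \lambda \sum_{i<j} \Vert u_i - u_j\Vert_2$ by plain subgradient descent. This is only a minimal sketch assuming uniform pairwise weights and the $\ell_2$ fusion penalty; the function names and the solver choice are ours, not the paper's path algorithms, which are considerably more efficient.

```python
# Minimal sketch: fusion-penalized clustering by subgradient descent.
# Assumptions: uniform weights w_ij = 1, l2 fusion penalty; the names
# fusion_objective and clusterpath_descent are illustrative, not the paper's.
import numpy as np

def fusion_objective(U, X, lam):
    """(1/2)||U - X||_F^2 + lam * sum_{i<j} ||u_i - u_j||_2."""
    n = X.shape[0]
    fit = 0.5 * np.sum((U - X) ** 2)
    pen = sum(np.linalg.norm(U[i] - U[j])
              for i in range(n) for j in range(i + 1, n))
    return fit + lam * pen

def clusterpath_descent(X, lam, step=1e-2, iters=2000, eps=1e-8):
    """Subgradient descent on the fusion-penalized objective."""
    U = X.astype(float).copy()
    n = X.shape[0]
    for _ in range(iters):
        grad = U - X
        for i in range(n):
            for j in range(i + 1, n):
                d = U[i] - U[j]
                nrm = np.linalg.norm(d)
                if nrm > eps:  # subgradient of ||u_i - u_j||_2 where defined
                    grad[i] += lam * d / nrm
                    grad[j] -= lam * d / nrm
        U -= step * grad
    return U

# As lam grows, rows of U fuse together; fused rows share a cluster,
# and tracing lam from 0 upward recovers a tree over the data points.
X = np.vstack([np.random.randn(5, 2), np.random.randn(5, 2) + 5])
U = clusterpath_descent(X, lam=0.5)
```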

    SLOPE - Adaptive variable selection via convex optimization

    We introduce a new estimator for the vector of coefficients $\beta$ in the linear model $y = X\beta + z$, where $X$ has dimensions $n \times p$ with $p$ possibly larger than $n$. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to $\min_{b \in \mathbb{R}^p} \frac{1}{2}\Vert y - Xb\Vert_{\ell_2}^2 + \lambda_1 \vert b\vert_{(1)} + \lambda_2 \vert b\vert_{(2)} + \cdots + \lambda_p \vert b\vert_{(p)}$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ and $\vert b\vert_{(1)} \ge \vert b\vert_{(2)} \ge \cdots \ge \vert b\vert_{(p)}$ are the decreasing absolute values of the entries of $b$. This is a convex program, and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical $\ell_1$ procedures such as the Lasso. Here, the regularizer is a sorted $\ell_1$ norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty. This is similar to the Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which compares more significant $p$-values with more stringent thresholds. One notable choice of the sequence $\{\lambda_i\}$ is given by the BH critical values $\lambda_{\mathrm{BH}}(i) = z(1 - i \cdot q/2p)$, where $q \in (0,1)$ and $z(\alpha)$ is the $\alpha$th quantile of a standard normal distribution. SLOPE aims to provide finite-sample guarantees on the selected model; of special interest is the false discovery rate (FDR), defined as the expected proportion of irrelevant regressors among all selected predictors. Under orthogonal designs, SLOPE with $\lambda_{\mathrm{BH}}$ provably controls FDR at level $q$. Moreover, it also appears to have appreciable inferential properties under more general designs $X$ while having substantial power, as demonstrated in a series of experiments on both simulated and real data. Published at http://dx.doi.org/10.1214/15-AOAS842 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics.
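
    The two computational ingredients of SLOPE, the BH-style regularization sequence and the proximal operator of the sorted $\ell_1$ norm, are simple enough to sketch. The code below is an illustration, not the authors' reference implementation: it computes $\lambda_{\mathrm{BH}}$, evaluates the sorted-$\ell_1$ prox via an isotonic projection, and wraps both in a basic proximal-gradient (ISTA) loop; the names `lambda_bh`, `prox_sorted_l1`, and `slope_ista` are our own.

```python
# Hedged sketch of SLOPE: BH lambda sequence + sorted-l1 prox + ISTA loop.
# Illustrative only; the helper names are assumptions, not the paper's code.
import numpy as np
from scipy.stats import norm
from sklearn.isotonic import isotonic_regression

def lambda_bh(p, q=0.1):
    """lambda_BH(i) = z(1 - i*q/(2p)), i = 1..p; z = standard normal quantile."""
    i = np.arange(1, p + 1)
    return norm.ppf(1 - i * q / (2 * p))

def prox_sorted_l1(v, lam):
    """argmin_x 0.5*||x - v||^2 + sum_i lam_i*|x|_(i), with lam non-increasing."""
    sign = np.sign(v)
    order = np.argsort(np.abs(v))[::-1]      # sort |v| in decreasing order
    z = np.abs(v)[order] - lam
    # Project onto the non-increasing cone (isotonic regression), clip at 0.
    x = np.clip(isotonic_regression(z, increasing=False), 0.0, None)
    out = np.empty_like(v)
    out[order] = x                           # undo the sort
    return sign * out

def slope_ista(X, y, lam, iters=500):
    """Proximal gradient for 0.5*||y - Xb||^2 + sorted-l1 penalty."""
    L = np.linalg.norm(X, 2) ** 2            # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = prox_sorted_l1(b - X.T @ (X @ b - y) / L, lam / L)
    return b

# Usage: fit SLOPE with the BH critical values at target FDR level q = 0.1.
X = np.random.randn(100, 20)
y = X[:, :3] @ np.array([3.0, -2.0, 1.5]) + np.random.randn(100)
b_hat = slope_ista(X, y, lambda_bh(20, q=0.1))
```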

    Parameter Estimation with the Ordered $\ell_2$ Regularization via an Alternating Direction Method of Multipliers

    Regularization is a popular technique in machine learning for model estimation and for avoiding overfitting. Prior studies have found that modern ordered regularization can be more effective at handling highly correlated, high-dimensional data than traditional regularization, because ordered regularization can reject irrelevant variables and yield an accurate estimate of the parameters. How to scale ordered regularization problems up to large-scale training data remains an open question. This paper explores parameter estimation with ordered $\ell_2$ regularization via the Alternating Direction Method of Multipliers (ADMM), called ADMM-O$\ell_2$. The advantages of ADMM-O$\ell_2$ include (i) scaling the ordered $\ell_2$ problem up to large datasets, (ii) estimating parameters correctly by automatically excluding irrelevant variables, and (iii) a fast convergence rate. Experimental results on both synthetic and real data indicate that ADMM-O$\ell_2$ performs better than, or comparably to, several state-of-the-art baselines.
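
    The splitting behind such methods is easy to see in a generic ADMM skeleton for penalized least squares: minimize $\frac{1}{2}\Vert y - Xb\Vert^2 + g(z)$ subject to $b = z$, alternating a ridge-like $b$-update, a prox step on $z$, and a dual update. The sketch below illustrates that pattern only: the ordered-$\ell_2$ prox is specific to the paper, so `prox` is passed in as a parameter, and the plain ridge prox used in the demo is a stand-in, not the ordered penalty.

```python
# Generic ADMM skeleton for min 0.5*||y - Xb||^2 + g(z) s.t. b = z, of the
# kind ADMM-Ol2 builds on. The penalty prox is a parameter; the ridge prox
# in the demo is a stand-in for the paper-specific ordered-l2 prox.
import numpy as np

def admm_pls(X, y, prox, rho=1.0, iters=200):
    n, p = X.shape
    # Factor (X^T X + rho*I) once with Cholesky; reused at every iteration.
    C = np.linalg.cholesky(X.T @ X + rho * np.eye(p))
    Xty = X.T @ y
    z = np.zeros(p)
    u = np.zeros(p)
    for _ in range(iters):
        rhs = Xty + rho * (z - u)
        b = np.linalg.solve(C.T, np.linalg.solve(C, rhs))  # b-update (ridge-like)
        z = prox(b + u, rho)                               # z-update (penalty prox)
        u = u + b - z                                      # dual ascent step
    return z

# Stand-in prox: plain l2 (ridge) shrinkage, argmin_z (lam/2)||z||^2 + (rho/2)||z - v||^2.
lam = 1.0
ridge_prox = lambda v, rho: rho * v / (rho + lam)
X = np.random.randn(50, 10)
y = X @ np.random.randn(10) + 0.1 * np.random.randn(50)
b_hat = admm_pls(X, y, ridge_prox)
```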