The degrees of freedom of the Lasso for general design matrix
In this paper, we investigate the degrees of freedom (DOF) of l1-penalized
least-squares minimization (also known as the Lasso) for linear regression models.
We give a closed-form expression of the DOF of the Lasso response. Namely,
we show that for any given Lasso regularization parameter and any
observed data belonging to a set of full (Lebesgue) measure, the
cardinality of the support of a particular solution of the Lasso problem is an
unbiased estimator of the degrees of freedom. This is achieved without the need
of uniqueness of the Lasso solution. Thus, our result holds true for both the
underdetermined and the overdetermined case, where the latter was originally
studied in \cite{zou}. We also show, by providing a simple counterexample, that
although the DOF theorem of \cite{zou} is correct, their proof contains a
flaw, since their divergence formula holds on a different set of full measure
than the one they claim. An effective estimator of the number of degrees
of freedom may have several applications, including an objectively guided choice
of the regularization parameter in the Lasso through the SURE (Stein's unbiased
risk estimation) framework. Our theoretical findings are illustrated through
several numerical simulations.
Comment: A short version appeared in SPARS'11, June 2011. Previously entitled
"The degrees of freedom of penalized l1 minimization"
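The abstract's central estimator, the support cardinality of a Lasso solution, plugs directly into SURE for a risk-guided choice of the regularization parameter. Below is a minimal numpy sketch of that idea; the coordinate-descent solver, the function names, and the assumption of a known noise variance `sigma2` are illustrative choices, not the authors' implementation:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent solver for (1/2)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]                    # partial residual
            rho = X[:, j] @ r
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b

def sure(X, y, lam, sigma2):
    """SURE risk estimate ||y - Xb||^2 - n*sigma^2 + 2*sigma^2*df,
    with df estimated by the support size of the Lasso solution,
    as in the result described in the abstract."""
    n = len(y)
    b = lasso_cd(X, y, lam)
    df = int((b != 0).sum())                                   # support cardinality
    resid = y - X @ b
    return resid @ resid - n * sigma2 + 2 * sigma2 * df
```

Minimizing `sure` over a grid of `lam` values then gives an objectively guided penalty choice.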
Sparse logistic principal components analysis for binary data
We develop a new principal components analysis (PCA) type dimension reduction
method for binary data. Different from the standard PCA which is defined on the
observed data, the proposed PCA is defined on the logit transform of the
success probabilities of the binary observations. Sparsity is introduced to the
principal component (PC) loading vectors for enhanced interpretability and more
stable extraction of the principal components. Our sparse PCA is formulated as
solving an optimization problem with a criterion function motivated from a
penalized Bernoulli likelihood. A Majorization--Minimization algorithm is
developed to efficiently solve the optimization problem. The effectiveness of
the proposed sparse logistic PCA method is illustrated by application to a
single nucleotide polymorphism data set and a simulation study.
Comment: Published in at http://dx.doi.org/10.1214/10-AOAS327 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
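The Majorization-Minimization step for a Bernoulli likelihood rests on the logistic loss having curvature at most 1/4: each iteration majorizes the negative log-likelihood by a quadratic, reducing the update to a least-squares (here, low-rank) fit to working data. The sketch below illustrates that bound only; the soft-thresholding of the scaled loadings and all names are simplifying assumptions, not the authors' exact update:

```python
import numpy as np

def mm_working_data(Y, Theta):
    """One MM majorization for a Bernoulli log-likelihood with natural
    parameter Theta: curvature <= 1/4 gives the working data Z."""
    P = 1.0 / (1.0 + np.exp(-Theta))   # success probabilities
    return Theta + 4.0 * (Y - P)       # working data for the quadratic surrogate

def sparse_logistic_pca_step(Y, Theta, k, lam):
    """One illustrative iteration (a sketch, not the paper's exact algorithm):
    majorize, fit a rank-k SVD to the working data, and soft-threshold the
    scaled loadings to induce sparsity."""
    Z = mm_working_data(Y, Theta)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:k].T * s[:k]                                  # scaled loadings
    V = np.sign(V) * np.maximum(np.abs(V) - lam, 0.0)     # soft-threshold
    return U[:, :k] @ V.T                                 # updated Theta
```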
A Path Algorithm for Constrained Estimation
Many least squares problems involve affine equality and inequality
constraints. Although there is a variety of methods for solving such problems,
most statisticians find constrained estimation challenging. The current paper
proposes a new path following algorithm for quadratic programming based on
exact penalization. Similar penalties arise in regularization in model
selection. Classical penalty methods solve a sequence of unconstrained problems
that put greater and greater stress on meeting the constraints. In the limit as
the penalty constant tends to infinity, one recovers the constrained solution.
In the exact penalty method, squared penalties are replaced by absolute value
penalties, and the solution is recovered for a finite value of the penalty
constant. The exact path following method starts at the unconstrained solution
and follows the solution path as the penalty constant increases. In the
process, the solution path hits, slides along, and exits from the various
constraints. Path following in lasso penalized regression, in contrast, starts
with a large value of the penalty constant and works its way downward. In both
settings, inspection of the entire solution path is revealing. Just as with the
lasso and generalized lasso, it is possible to plot the effective degrees of
freedom along the solution path. For a strictly convex quadratic program, the
exact penalty algorithm can be framed entirely in terms of the sweep operator
of regression analysis. A few well chosen examples illustrate the mechanics and
potential of path following.
Comment: 26 pages, 5 figures
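The contrast between classical squared penalties and exact absolute-value penalties shows up already in a one-dimensional toy program: minimize (x - 2)^2 subject to x <= 1, whose constrained optimum is x = 1. A grid-search sketch (names and grid are illustrative):

```python
import numpy as np

# Toy problem: minimize (x - 2)^2 subject to x <= 1; constrained optimum x = 1.
xs = np.linspace(-1.0, 3.0, 40001)

def argmin_penalized(rho, exact):
    viol = np.maximum(xs - 1.0, 0.0)        # constraint violation, zero when feasible
    pen = viol if exact else viol ** 2      # absolute (exact) vs squared (classical)
    obj = (xs - 2.0) ** 2 + rho * pen
    return xs[np.argmin(obj)]
```

With the squared penalty, rho = 2 still yields the infeasible minimizer x = 4/3, and the constrained solution is only approached as rho grows without bound; with the absolute penalty, the finite value rho = 2 already recovers x = 1 exactly, which is the phenomenon the exact path algorithm exploits.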
One-step estimator paths for concave regularization
The statistics literature of the past 15 years has established many favorable
properties for sparse diminishing-bias regularization: techniques which can
roughly be understood as providing estimation under penalty functions spanning
the range of concavity between the l0 and l1 norms. However, the lasso's
l1-regularized estimation remains the standard tool for industrial `Big
Data' applications because of its minimal computational cost and the presence
of easy-to-apply rules for penalty selection. In response, this article
proposes a simple new algorithm framework that requires no more computation
than a lasso path: the path of one-step estimators (POSE) does penalized
regression estimation on a grid of decreasing penalties, but adapts
coefficient-specific weights to decrease as a function of the coefficient
estimated in the previous path step. This provides sparse diminishing-bias
regularization at no extra cost over the fastest lasso algorithms. Moreover,
our `gamma lasso' implementation of POSE is accompanied by a reliable heuristic
for the fit degrees of freedom, so that standard information criteria can be
applied in penalty selection. We also provide novel results on the distance
between weighted-l1 and l0 penalized predictors; this allows us to build
intuition about POSE and other diminishing-bias regularization schemes. The
methods and results are illustrated in extensive simulations and in application
of logistic regression to evaluating the performance of hockey players.
Comment: Data and code are in the gamlr package for R. Supplemental appendix
is at https://github.com/TaddyLab/pose/raw/master/paper/supplemental.pdf
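On an orthonormal design the POSE recursion can be sketched in closed form, since each weighted-lasso step reduces to soft-thresholding the OLS coefficients. The weight formula below follows the gamma-lasso idea of shrinking the penalty on coefficients that were large in the previous path step, but the exact functional form and all names are assumptions, not guaranteed to match the gamlr implementation:

```python
import numpy as np

def gamma_lasso_weights(beta_prev, gamma):
    """Coefficient-specific weights for the next path step: weights decrease
    in the magnitude of the previous estimate (diminishing bias).
    gamma = 0 recovers the plain lasso (all weights equal to 1)."""
    return 1.0 / (1.0 + gamma * np.abs(beta_prev))

def pose_path_orthonormal(b_ols, lambdas, gamma):
    """POSE on an orthonormal design, where each weighted-lasso step is a
    closed-form soft-thresholding of the OLS coefficients b_ols."""
    b = np.zeros_like(b_ols)
    path = []
    for lam in lambdas:                       # decreasing penalty grid
        w = gamma_lasso_weights(b, gamma)     # weights from the previous step
        b = np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam * w, 0.0)
        path.append(b.copy())
    return path
```

Each step costs no more than a lasso step, which is the point of the one-step construction.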
Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA
In genome-wide association studies, the primary task is to detect biomarkers in the form of Single Nucleotide Polymorphisms (SNPs) that have nontrivial associations with a disease phenotype and some other important clinical/environmental factors. However, the number of SNPs, which is extremely large compared with the sample size, inhibits application of classical methods such as multiple logistic regression. Currently the most commonly used approach is still to analyze one SNP at a time. In this paper, we propose to consider the genotypes of the SNPs simultaneously via a logistic analysis of variance (ANOVA) model, which expresses the logit-transformed mean of SNP genotypes as the summation of the SNP effects, effects of the disease phenotype and/or other clinical variables, and the interaction effects. We use a reduced-rank representation of the interaction-effect matrix for dimensionality reduction, and employ the L1-penalty in a penalized likelihood framework to filter out the SNPs that have no associations. We develop a Majorization-Minimization algorithm for computational implementation. In addition, we propose a modified BIC criterion to select the penalty parameters and determine the rank number. The proposed method is applied to a Multiple Sclerosis data set and simulated data sets, and shows promise in biomarker detection.
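The model's natural-parameter decomposition (row and column main effects plus a reduced-rank interaction) can be written compactly. The sketch below shows only that structural piece, with hypothetical names, and omits the penalized-likelihood fitting entirely:

```python
import numpy as np

def logit_anova_mean(a, b, U, V):
    """Natural-parameter matrix for the logistic ANOVA sketch:
    row (SNP) effects a + column (phenotype/clinical) effects b +
    a rank-r interaction term U V'."""
    return a[:, None] + b[None, :] + U @ V.T
```

The reduced-rank factors U and V keep the interaction matrix estimable when the number of SNPs far exceeds the sample size.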
Regularized Multivariate Regression Models with Skew-t Error Distributions
We consider regularization of the parameters in multivariate linear regression models with errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced by penalizing the negative log-likelihood with L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to a penalized normal likelihood and develop a procedure to minimize the ensuing objective function. The performance of the method is assessed in a simulation study, and the methodology is illustrated on a real data set with a 24-dimensional response vector.
Sparse modeling of categorial explanatory variables
Shrinking methods in regression analysis are usually designed for metric
predictors. In this article, however, shrinkage methods for categorial
predictors are proposed. As an application we consider data from the Munich
rent standard, where, for example, urban districts are treated as a categorial
predictor. If independent variables are categorial, some modifications to usual
shrinking procedures are necessary. Two L1-penalty based methods for factor
selection and clustering of categories are presented and investigated. The
first approach is designed for nominal scale levels, the second one for ordinal
predictors. Besides applying them to the Munich rent standard, methods are
illustrated and compared in simulation studies.
Comment: Published in at http://dx.doi.org/10.1214/10-AOAS355 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
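The two penalty types described above can be sketched directly: for a nominal factor, all pairwise absolute differences of category coefficients are penalized (so categories with equal coefficients fuse into clusters), while for an ordinal factor only adjacent categories are compared. A small sketch with hypothetical function names:

```python
import numpy as np
from itertools import combinations

def nominal_fusion_penalty(beta):
    """Penalty for a nominal factor (a sketch): all pairwise absolute
    differences, so categories with equal coefficients merge."""
    return sum(abs(beta[i] - beta[j])
               for i, j in combinations(range(len(beta)), 2))

def ordinal_fusion_penalty(beta):
    """For ordinal predictors, only adjacent categories are compared,
    respecting the ordering of the levels."""
    return float(np.sum(np.abs(np.diff(beta))))
```

Adding either penalty to a regression loss shrinks differences exactly to zero, which is what performs the clustering of categories.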
Robust and sparse factor modelling
Factor construction methods are widely used to summarize a large panel of variables by means of a relatively small number of representative factors. We propose a novel factor construction procedure that enjoys the properties of robustness to outliers and of sparsity, that is, having relatively few nonzero factor loadings. Compared to the traditional factor construction method, we find that this procedure leads to favorable forecasting performance in the presence of outliers and to better interpretable factors. We investigate the performance of the method in a Monte Carlo experiment and in an empirical application to a large macroeconomic data set.
Keywords: Dimension reduction; Forecasting; Outliers; Regularization; Sparsity
The composite absolute penalties family for grouped and hierarchical variable selection
Extracting useful information from high-dimensional data is an important
focus of today's statistical research and practice. Penalized loss function
minimization has been shown to be effective for this task both theoretically
and empirically. With the virtues of both regularization and sparsity, the
l1-penalized squared error minimization method, the Lasso, has been popular in
regression models and beyond. In this paper, we combine different norms,
including the l1, to form an intelligent penalty in order to add side information
to the fitting of a regression or classification model to obtain reasonable
estimates. Specifically, we introduce the Composite Absolute Penalties (CAP)
family, which allows given grouping and hierarchical relationships between the
predictors to be expressed. CAP penalties are built by defining groups and
combining the properties of norm penalties at the across-group and within-group
levels. Grouped selection occurs for nonoverlapping groups. Hierarchical
variable selection is reached by defining groups with particular overlapping
patterns. We propose using the BLASSO and cross-validation to compute CAP
estimates in general. For a subfamily of CAP estimates involving only the
l1 and l∞ norms, we introduce the iCAP algorithm to trace the entire
regularization path for the grouped selection problem. Within this subfamily,
unbiased estimates of the degrees of freedom (df) are derived so that the
regularization parameter is selected without cross-validation. CAP is shown to
improve on the predictive performance of the LASSO in a series of simulated
experiments, including cases with p >> n and possibly mis-specified
groupings. When the complexity of a model is properly calculated, iCAP is seen
to be parsimonious in the experiments.
Comment: Published in at http://dx.doi.org/10.1214/07-AOS584 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
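For intuition, the penalty used by the iCAP subfamily (l∞ within groups, summed l1-style across groups, so whole groups enter or leave the model together) fits in a few lines. The name `icap_penalty` and the plain list-of-index-lists group encoding are assumptions for illustration:

```python
import numpy as np

def icap_penalty(beta, groups):
    """l1/l_inf member of the CAP family: the l_inf norm within each
    group, summed across (non-overlapping) groups."""
    return sum(np.max(np.abs(beta[g])) for g in groups)
```

Hierarchical selection, as the abstract notes, comes from choosing overlapping groups instead of the disjoint ones shown here.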