Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors
Penalized regression is an attractive framework for variable selection
problems. Often, variables possess a grouping structure, and the relevant
selection problem is that of selecting groups, not individual variables. The
group lasso has been proposed as a way of extending the ideas of the lasso to
the problem of group selection. Nonconvex penalties such as SCAD and MCP have
been proposed and shown to have several advantages over the lasso; these
penalties may also be extended to the group selection problem, giving rise to
group SCAD and group MCP methods. Here, we describe algorithms for fitting
these models stably and efficiently. In addition, we present simulation results
and real data examples comparing and contrasting the statistical properties of
these methods.
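At the heart of these group descent algorithms is a multivariate thresholding update applied to one group of coefficients at a time. Below is a minimal NumPy sketch of the group MCP (firm-thresholding) update, assuming each group has been orthonormalized; the function names are illustrative and not taken from the authors' software.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def firm_threshold(z, lam, gamma=3.0):
    """Firm thresholding: the univariate MCP solution."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z

def group_mcp_update(X_j, r, beta_j, lam, gamma=3.0):
    """One group-descent update for group j (illustrative sketch).

    X_j: n-by-k design block, assumed orthonormalized (X_j' X_j / n = I);
    r:   current partial residuals; beta_j: current group coefficients.
    """
    n = X_j.shape[0]
    z = X_j.T @ r / n + beta_j                     # unpenalized group solution
    z_norm = np.linalg.norm(z)
    new_norm = firm_threshold(z_norm, lam, gamma)  # shrink the group norm
    return (new_norm / z_norm) * z if z_norm > 0 else np.zeros_like(z)
```

Replacing firm_threshold with soft_threshold on the group norm recovers the group lasso update; group SCAD substitutes the corresponding SCAD thresholding operator.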
Strong rules for nonconvex penalties and their implications for efficient algorithms in high-dimensional regression
We consider approaches for improving the efficiency of algorithms for fitting
nonconvex penalized regression models such as SCAD and MCP in high dimensions.
In particular, we develop rules for discarding variables during cyclic
coordinate descent. This dimension reduction leads to a substantial improvement
in the speed of these algorithms for high-dimensional problems. The rules we
propose here eliminate a substantial fraction of the variables from the
coordinate descent algorithm. Violations are quite rare, especially in the
locally convex region of the solution path, and furthermore, may be easily
detected and corrected by checking the Karush-Kuhn-Tucker conditions. We extend
these rules to generalized linear models, as well as to other nonconvex
penalties such as the ℓ2-stabilized Mnet penalty, group MCP, and group
SCAD. We explore three variants of the coordinate descent algorithm that
incorporate these rules and study the efficiency of these algorithms in fitting
models to both simulated data and real data from a genome-wide association
study.
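For the lasso, the sequential strong rule keeps feature j at a new penalty value lam_new only if |x_j' r / n| >= 2*lam_new - lam_prev, where r is the residual at the previous solution; the paper adapts this threshold to MCP, SCAD, and their group versions. A minimal NumPy sketch of the lasso-style screen and the KKT check that catches the rare violations (illustrative names, not the authors' implementation):

```python
import numpy as np

def strong_rule_active_set(X, resid, lam_new, lam_prev):
    """Sequential strong rule: keep feature j only if
    |x_j' resid / n| >= 2*lam_new - lam_prev (lasso-style sketch;
    the paper modifies the threshold for MCP and SCAD)."""
    n = X.shape[0]
    scores = np.abs(X.T @ resid) / n
    return np.where(scores >= 2.0 * lam_new - lam_prev)[0]

def kkt_violations(X, resid, beta, lam):
    """KKT check for discarded lasso features: |x_j' resid / n| must be
    <= lam wherever beta_j == 0; violators rejoin the active set."""
    n = X.shape[0]
    scores = np.abs(X.T @ resid) / n
    return np.where((beta == 0) & (scores > lam))[0]
```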
A Selective Review of Group Selection in High-Dimensional Models
Grouping structures arise naturally in many statistical modeling problems.
Several methods have been proposed for variable selection that respect grouping
structure in variables. Examples include the group LASSO and several concave
group selection methods. In this article, we give a selective review of group
selection concerning methodological developments, theoretical properties and
computational algorithms. We pay particular attention to group selection
methods involving concave penalties. We address both group selection and
bi-level selection methods. We describe several applications of these methods
in nonparametric additive models, semiparametric regression, seemingly
unrelated regressions, genomic data analysis and genome-wide association
studies. We also highlight some issues that require further study.
Comment: Published at http://dx.doi.org/10.1214/12-STS392 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
Iterative hard thresholding in genome-wide association studies: Generalized linear models, prior weights, and double sparsity.
Background: Consecutive testing of single nucleotide polymorphisms (SNPs) is usually employed to identify genetic variants associated with complex traits. Ideally one should model all covariates in unison, but most existing analysis methods for genome-wide association studies (GWAS) perform only univariate regression.
Results: We extend and efficiently implement iterative hard thresholding (IHT) for multiple regression, treating all SNPs simultaneously. Our extensions accommodate generalized linear models, prior information on genetic variants, and grouping of variants. In our simulations, IHT recovers up to 30% more true predictors than SNP-by-SNP association testing and exhibits a 2-3 orders of magnitude decrease in false-positive rates compared with lasso regression. We also test IHT on the UK Biobank hypertension phenotypes and the Northern Finland Birth Cohort 1966 cardiovascular phenotypes. We find that IHT scales to the large datasets of contemporary human genetics and recovers the plausible genetic variants identified by previous studies.
Conclusions: Our real data analysis and simulation studies suggest that IHT can (i) recover highly correlated predictors, (ii) avoid over-fitting, (iii) deliver better true-positive and false-positive rates than either marginal testing or lasso regression, (iv) recover unbiased regression coefficients, (v) exploit prior information and group-sparsity, and (vi) be used with biobank-sized datasets. Although these advances are studied in the context of genome-wide association studies, our extensions are pertinent to other regression problems with large numbers of predictors.
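The core IHT iteration is a gradient step on the loss followed by projection onto the set of k-sparse vectors. The following minimal NumPy sketch covers dense least-squares regression only; the paper's extensions (GLMs, prior weights, double sparsity) and its scalable implementation are not reproduced here.

```python
import numpy as np

def iht(X, y, k, step=None, iters=100):
    """Plain iterative hard thresholding for least squares (a sketch;
    the paper's version adds GLMs, prior weights, and group sparsity).

    Keeps the k largest-magnitude coefficients after each gradient step."""
    n, p = X.shape
    beta = np.zeros(p)
    step = step or 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L for the quadratic loss
    for _ in range(iters):
        grad = X.T @ (X @ beta - y)                  # least-squares gradient
        b = beta - step * grad
        keep = np.argpartition(np.abs(b), -k)[-k:]   # indices of k largest entries
        beta = np.zeros(p)
        beta[keep] = b[keep]                         # hard-thresholding projection
    return beta
```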
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted ℓ2-penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
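As a concrete instance of the proximal methods the paper surveys, the classic ISTA iteration for the lasso alternates a gradient step on the smooth squared-error loss with the proximal operator of the ℓ1 penalty, which is soft-thresholding. A minimal NumPy sketch (illustrative, not code from the paper's experiments):

```python
import numpy as np

def ista(X, y, lam, iters=200):
    """Proximal gradient (ISTA) for the lasso: the prox of lam*||.||_1
    is soft-thresholding, applied after each gradient step."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n      # Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ beta - y) / n    # gradient of the smooth part
        z = beta - grad / L                # gradient step
        beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox step
    return beta
```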
Distributed Quantile Regression Analysis and a Group Variable Selection Method
This dissertation develops novel methodologies for distributed quantile regression analysis
for big data by utilizing a distributed optimization algorithm called the alternating direction
method of multipliers (ADMM). Specifically, we first write the penalized quantile regression
into a specific form that can be solved by the ADMM and propose numerical algorithms
for solving the ADMM subproblems. This results in the distributed QR-ADMM
algorithm. Then, to further reduce the computational time, we formulate the penalized
quantile regression into another equivalent ADMM form in which all the subproblems have
exact closed-form solutions and hence avoid iterative numerical methods. This results in the
single-loop QPADM algorithm, which further improves on the computational efficiency of
QR-ADMM. Both QR-ADMM and QPADM enjoy flexible parallelization by enabling data
splitting across both the sample space and the feature space, which makes them especially
appealing when both the sample size n and the feature dimension p are large.
Besides the QR-ADMM and QPADM algorithms for penalized quantile regression, we
also develop a group variable selection method by approximating the Bayesian information
criterion. Unlike existing penalization methods for feature selection, our proposed gMIC
algorithm is free of parameter tuning and hence enjoys greater computational efficiency.
Although the current version of gMIC focuses on the generalized linear model, it can be
naturally extended to quantile regression for feature selection.
We provide theoretical analysis for our proposed methods. Specifically, we conduct numerical
convergence analysis for the QR-ADMM and QPADM algorithms, and provide
asymptotic theory and the oracle property of feature selection for the gMIC method. All
our methods are evaluated with simulation studies and real data analysis.
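To make the ADMM formulation concrete, the sketch below applies ADMM to unpenalized quantile regression with the split z = y - X beta: the beta-update is a least-squares solve and the z-update is the closed-form proximal operator of the check loss. This is a single-machine illustration with names of my own choosing, not the dissertation's distributed QR-ADMM or QPADM algorithms, which additionally handle penalties and data splitting.

```python
import numpy as np

def prox_check(v, tau, alpha):
    """Closed-form prox of the check loss alpha * rho_tau evaluated at v."""
    return np.where(v > alpha * tau, v - alpha * tau,
                    np.where(v < -alpha * (1 - tau), v + alpha * (1 - tau), 0.0))

def qr_admm(X, y, tau=0.5, rho=1.0, iters=500):
    """Unpenalized quantile regression via ADMM with the split z = y - X beta
    (a single-machine sketch in the spirit of QR-ADMM)."""
    n, p = X.shape
    beta = np.zeros(p)
    z = np.zeros(n)
    u = np.zeros(n)                              # scaled dual variable
    ls_factor = np.linalg.solve(X.T @ X, X.T)    # cached (X'X)^{-1} X'
    for _ in range(iters):
        beta = ls_factor @ (y - z + u)           # least-squares beta-update
        z = prox_check(y - X @ beta + u, tau, 1.0 / rho)  # closed-form z-update
        u = u + y - X @ beta - z                 # dual update
    return beta
```

The closed-form z-update is what lets every subproblem avoid inner iterative solvers, which is the property the dissertation exploits in its single-loop QPADM variant.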