A Selective Review of Group Selection in High-Dimensional Models
Grouping structures arise naturally in many statistical modeling problems.
Several methods have been proposed for variable selection that respect grouping
structure in variables. Examples include the group LASSO and several concave
group selection methods. In this article, we give a selective review of group
selection concerning methodological developments, theoretical properties and
computational algorithms. We pay particular attention to group selection
methods involving concave penalties. We address both group selection and
bi-level selection methods. We describe several applications of these methods
in nonparametric additive models, semiparametric regression, seemingly
unrelated regressions, genomic data analysis and genome-wide association
studies. We also highlight some issues that require further study.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) by the
Institute of Mathematical Statistics; DOI: http://dx.doi.org/10.1214/12-STS392.
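As a concrete illustration of the kind of penalty the review covers, here is a
minimal sketch of the group LASSO penalty and the blockwise soft-thresholding
operator that underlies many group-selection algorithms; the function names
and the toy data are our own, not the article's.

```python
import numpy as np

def group_lasso_penalty(beta, groups, lam):
    """Group LASSO penalty: lam * sum_g sqrt(|g|) * ||beta_g||_2."""
    return lam * sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)

def group_soft_threshold(z, t):
    """Blockwise soft-thresholding, the proximal operator of t * ||.||_2:
    shrinks the whole group toward zero and drops it entirely when its
    norm falls below the threshold (hence groupwise selection).
    """
    norm = np.linalg.norm(z)
    if norm <= t:
        return np.zeros_like(z)
    return (1.0 - t / norm) * z

# Toy usage: two groups of sizes 2 and 3.
beta = np.array([0.5, -0.2, 1.5, 0.3, -0.7])
groups = [np.array([0, 1]), np.array([2, 3, 4])]
print(group_lasso_penalty(beta, groups, lam=0.1))
print(group_soft_threshold(beta[groups[1]], t=0.5))
```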
APPLE: Approximate Path for Penalized Likelihood Estimators
In high-dimensional data analysis, penalized likelihood estimators are shown
to provide superior results in both variable selection and parameter
estimation. A new algorithm, APPLE, is proposed for calculating the Approximate
Path for Penalized Likelihood Estimators. Both the convex penalty (such as
LASSO) and the nonconvex penalty (such as SCAD and MCP) cases are considered.
APPLE efficiently computes the solution path for the penalized likelihood
estimator using a hybrid of the modified predictor-corrector method and the
coordinate-descent algorithm. APPLE is compared with several well-known
packages via simulation and analysis of two gene expression data sets.
Comment: 24 pages, 9 figures.
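The corrector stage of such a hybrid typically relies on coordinate descent
with a univariate nonconvex update. Below is a minimal, illustrative sketch of
coordinate descent for MCP-penalized least squares using the firm-thresholding
rule; it is a stand-in under our own assumptions (standardized columns), not
the APPLE implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def mcp_update(z, lam, gamma=3.0):
    """Univariate minimizer for MCP-penalized least squares with a
    standardized covariate: the firm-thresholding rule (gamma > 1).
    """
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # beyond gamma * lam, MCP applies no shrinkage

def cd_mcp(X, y, lam, gamma=3.0, n_iter=100):
    """Plain coordinate descent for MCP-penalized least squares.
    Assumes standardized columns (mean 0, ||x_j||^2 = n); a path
    version would warm-start each lambda from the previous solution.
    """
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()  # residual for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            z = beta[j] + X[:, j] @ r / n
            new = mcp_update(z, lam, gamma)
            r += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta
```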
Regularized Maximum Likelihood Estimation and Feature Selection in Mixtures-of-Experts Models
Mixture-of-Experts (MoE) models are successful for modeling heterogeneous
data in many statistical learning problems, including regression, clustering
and classification. They are generally fitted by maximum likelihood
estimation via the well-known EM algorithm, but their application to
high-dimensional problems remains challenging. We consider the problem of
fitting and feature selection in MoE models, and propose a regularized
maximum likelihood estimation approach that encourages sparse solutions for
heterogeneous regression data models with potentially high-dimensional
predictors. Unlike state-of-the-art regularized MLE for MoE, the proposed
models do not require an approximation of the penalty function. We develop
two hybrid EM algorithms: an Expectation-Majorization-Maximization (EM/MM)
algorithm, and an EM algorithm with coordinate ascent. The proposed
algorithms obtain sparse solutions automatically, without thresholding, and
avoid matrix inversion by using univariate parameter updates. An experimental
study shows the good performance of the algorithms in terms of recovering the
actual sparse solutions, parameter estimation, and clustering of
heterogeneous regression data.
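To illustrate how univariate updates can avoid matrix inversion in a
penalized M-step, here is a schematic, responsibility-weighted coordinate
update with soft-thresholding; the names and the plain $\ell_1$ penalty are
our own assumptions, not the paper's exact algorithm.

```python
import numpy as np

def m_step_coordinate_update(X, y, w, beta, j, lam):
    """One univariate update of a responsibility-weighted, l1-penalized
    least-squares M-step (schematic). w holds the E-step responsibilities;
    the closed-form soft-threshold means no matrix inversion is needed.
    """
    # Partial residual that excludes the contribution of coordinate j.
    r = y - X @ beta + X[:, j] * beta[j]
    num = np.sum(w * X[:, j] * r)      # weighted correlation with x_j
    denom = np.sum(w * X[:, j] ** 2)   # weighted curvature for x_j
    z = num / denom
    return np.sign(z) * max(abs(z) - lam / denom, 0.0)
```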
Estimation and Feature Selection in Mixtures of Generalized Linear Experts Models
Mixtures-of-Experts (MoE) are conditional mixture models that have proven
effective for modeling heterogeneity in data in many statistical learning
approaches for prediction, including regression and classification, as well
as for clustering. Their estimation in high-dimensional problems, however,
remains challenging. We consider the problem of parameter estimation and
feature selection in MoE models with different generalized linear expert
models, and propose a regularized maximum likelihood estimation that
efficiently encourages sparse solutions for heterogeneous data with
high-dimensional predictors. The developed proximal-Newton EM algorithm
includes proximal Newton-type procedures that update the model parameters by
monotonically maximizing the objective function, and it allows efficient
estimation and feature selection. An experimental study shows the good
performance of the algorithm in terms of recovering the actual sparse
solutions, parameter estimation, and clustering of heterogeneous regression
data, compared with the main state-of-the-art competitors.
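As a rough illustration of a proximal Newton-type step, the sketch below uses
a diagonal Hessian surrogate so each update stays univariate; with the exact
Hessian the inner penalized subproblem would itself be solved iteratively.
This is a generic sketch under our own assumptions, not the paper's
procedure.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1, i.e. elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_newton_step(beta, grad, hess_diag, lam):
    """One proximal Newton-type step for an l1-penalized objective,
    using a diagonal Hessian approximation: minimize the local quadratic
    model plus the penalty coordinatewise, in closed form.
    """
    return prox_l1(beta - grad / hess_diag, lam / hess_diag)
```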
Group Lasso for high dimensional sparse quantile regression models
This paper studies the statistical properties of the group Lasso estimator
for high dimensional sparse quantile regression models where the number of
explanatory variables (or the number of groups of explanatory variables) is
possibly much larger than the sample size while the number of variables in
"active" groups is sufficiently small. We establish a non-asymptotic bound on
the $\ell_2$-estimation error of the estimator. This bound explains
situations under which the group Lasso estimator is potentially
superior/inferior to the $\ell_1$-penalized quantile regression estimator in
terms of the estimation error. We also propose a data-dependent choice of the
tuning parameter to make the method more practical, by extending the original
proposal of Belloni and Chernozhukov (2011) for the $\ell_1$-penalized
quantile regression estimator. As an application, we analyze high dimensional
additive quantile regression models. We show that under a set of suitable
regularity conditions, the group Lasso estimator can attain a convergence
rate arbitrarily close to the oracle rate. Finally, we conduct simulation
experiments to examine our theoretical results.
Comment: 37 pages. Some errors are corrected.
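For reference, the estimator minimizes a check-loss fit term plus a group
penalty; a minimal sketch of that objective (our own illustrative code, with
unweighted groups for simplicity) is:

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def group_lasso_qr_objective(beta, X, y, tau, lam, groups):
    """Group-LASSO-penalized quantile regression objective:
    (1/n) * sum_i rho_tau(y_i - x_i' beta) + lam * sum_g ||beta_g||_2.
    """
    fit = np.mean(check_loss(y - X @ beta, tau))
    pen = lam * sum(np.linalg.norm(beta[g]) for g in groups)
    return fit + pen
```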
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted-$\ell_2$ penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
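Among the covered techniques, proximal methods are perhaps the simplest to
sketch. The following minimal ISTA loop for the lasso is our illustration,
not code from the paper; the step size uses the Lipschitz constant of the
least-squares gradient.

```python
import numpy as np

def ista(grad_f, prox, x0, step, n_iter=500):
    """Proximal gradient (ISTA): alternate a gradient step on the smooth
    empirical risk with the proximal operator of the non-smooth penalty.
    """
    x = x0
    for _ in range(n_iter):
        x = prox(x - step * grad_f(x), step)
    return x

# Illustrative lasso example: f(x) = ||Ax - b||^2 / (2n), penalty lam*||x||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = A @ np.array([2.0, -1.0] + [0.0] * 8) + 0.1 * rng.standard_normal(50)
lam = 0.1
grad_f = lambda x: A.T @ (A @ x - b) / len(b)
prox = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
step = len(b) / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of grad_f
print(ista(grad_f, prox, np.zeros(10), step))
```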
A Generic Path Algorithm for Regularized Statistical Estimation
Regularization is widely used in statistics and machine learning to prevent
overfitting and to gear solutions toward prior information. In general, a
regularized estimation problem minimizes the sum of a loss function and a
penalty term. The penalty term is usually weighted by a tuning parameter and
encourages certain constraints on the parameters to be estimated. Particular
choices of constraints lead to the popular lasso, fused-lasso, and other
generalized $\ell_1$-penalized regression methods. Although there has been a lot
of research in this area, developing efficient optimization methods for many
nonseparable penalties remains a challenge. In this article we propose an exact
path solver based on ordinary differential equations (EPSODE) that works for
any convex loss function and can deal with generalized $\ell_1$ penalties as well
as more complicated regularization such as inequality constraints encountered
in shape-restricted regressions and nonparametric density estimation. In the
path following process, the solution path hits, exits, and slides along the
various constraints and vividly illustrates the tradeoffs between goodness of
fit and model parsimony. In practice, the EPSODE can be coupled with AIC, BIC,
or cross-validation to select an optimal tuning parameter. Our
applications to generalized $\ell_1$-regularized generalized linear models,
shape-restricted regressions, Gaussian graphical models, and nonparametric
density estimation showcase the potential of the EPSODE algorithm.
Comment: 28 pages, 5 figures.
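A minimal sketch of the surrounding workflow, approximating the path on a
grid with warm starts and selecting the tuning parameter by BIC, is given
below; this is our own schematic (EPSODE itself follows the path continuously
via ODEs), and the degrees-of-freedom count assumes a lasso-type penalty.

```python
import numpy as np

def approximate_path(fit_at, lambdas, beta0):
    """Approximate a regularization path on a grid of tuning parameters,
    warm-starting each fit from the previous solution. `fit_at(lam, beta)`
    is a hypothetical solver for a single penalty level.
    """
    path, beta = [], beta0
    for lam in lambdas:
        beta = fit_at(lam, beta)
        path.append(beta.copy())
    return path

def bic_select(path, lambdas, X, y):
    """Select the tuning parameter by BIC = n*log(RSS/n) + log(n)*df,
    counting nonzero coefficients as degrees of freedom (valid for the
    lasso; an assumption for other penalties).
    """
    n = len(y)
    scores = [
        n * np.log(np.mean((y - X @ b) ** 2)) + np.log(n) * np.count_nonzero(b)
        for b in path
    ]
    k = int(np.argmin(scores))
    return lambdas[k], path[k]
```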