The composite absolute penalties family for grouped and hierarchical variable selection
Extracting useful information from high-dimensional data is an important
focus of today's statistical research and practice. Penalized loss function
minimization has been shown to be effective for this task both theoretically
and empirically. With the virtues of both regularization and sparsity, the
ℓ1-penalized squared error minimization method Lasso has been popular in
regression models and beyond. In this paper, we combine different norms
including ℓ1 to form an intelligent penalty in order to add side information
to the fitting of a regression or classification model to obtain reasonable
estimates. Specifically, we introduce the Composite Absolute Penalties (CAP)
family, which allows given grouping and hierarchical relationships between the
predictors to be expressed. CAP penalties are built by defining groups and
combining the properties of norm penalties at the across-group and within-group
levels. Grouped selection occurs for nonoverlapping groups. Hierarchical
variable selection is reached by defining groups with particular overlapping
patterns. We propose using the BLASSO and cross-validation to compute CAP
estimates in general. For a subfamily of CAP estimates involving only the ℓ1
and ℓ∞ norms, we introduce the iCAP algorithm to trace the entire
regularization path for the grouped selection problem. Within this subfamily,
unbiased estimates of the degrees of freedom (df) are derived so that the
regularization parameter is selected without cross-validation. CAP is shown to
improve on the predictive performance of the LASSO in a series of simulated
experiments, including cases with p ≫ n and possibly mis-specified
groupings. When the complexity of a model is properly calculated, iCAP is seen
to be parsimonious in the experiments.
Comment: Published at http://dx.doi.org/10.1214/07-AOS584 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
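The iCAP subfamily described above combines an ℓ∞ norm within each group with an ℓ1 norm across groups. A minimal sketch of that composite penalty (with hypothetical coefficients and group definitions chosen for illustration):

```python
import numpy as np

def cap_penalty(beta, groups):
    """iCAP-style composite penalty: l-infinity within each group,
    l1 (a plain sum) across groups. Because a whole group's cost is
    driven by its largest coefficient, groups enter or leave together."""
    return sum(np.max(np.abs(beta[g])) for g in groups)

beta = np.array([0.5, -2.0, 0.0, 1.0, 0.0])
groups = [[0, 1], [2, 3, 4]]  # nonoverlapping groups -> grouped selection
print(cap_penalty(beta, groups))  # max(0.5, 2.0) + max(0, 1.0, 0) = 3.0
```

Hierarchical selection falls out of the same formula by letting the groups overlap in the nested patterns the paper describes.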
An update on statistical boosting in biomedicine
Statistical boosting algorithms have triggered a lot of research during the
last decade. They combine a powerful machine-learning approach with classical
statistical modelling, offering various practical advantages like automated
variable selection and implicit regularization of effect estimates. They are
extremely flexible, as the underlying base-learners (regression functions
defining the type of effect for the explanatory variables) can be combined with
any kind of loss function (target function to be optimized, defining the type
of regression setting). In this review article, we highlight the most recent
methodological developments on statistical boosting regarding variable
selection, functional regression and advanced time-to-event modelling.
Additionally, we provide a short overview on relevant applications of
statistical boosting in biomedicine.
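The automated variable selection and implicit regularization mentioned above can be sketched with componentwise L2 boosting, the simplest member of this family (a minimal sketch; function names and the step size are illustrative):

```python
import numpy as np

def l2_boost(X, y, steps=300, nu=0.1):
    """Componentwise L2 boosting: at each iteration, fit every
    univariate least-squares base-learner to the current residual,
    keep the single best one, and move its coefficient by a small
    step nu. Predictors never selected keep a zero coefficient."""
    beta = np.zeros(X.shape[1])
    intercept = y.mean()
    resid = y - intercept
    for _ in range(steps):
        coefs = X.T @ resid / np.sum(X**2, axis=0)    # univariate LS fits
        sse = [np.sum((resid - c * x) ** 2) for c, x in zip(coefs, X.T)]
        j = int(np.argmin(sse))                       # best base-learner
        beta[j] += nu * coefs[j]
        resid -= nu * coefs[j] * X[:, j]
    return intercept, beta
```

Swapping the squared-error fit for another loss function, or the linear base-learner for splines, yields the other regression settings the review covers.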
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted-ℓ2 penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
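The proximal methods covered above reduce, for the ℓ1 norm, to iterating a gradient step followed by soft thresholding. A minimal ISTA sketch for the lasso (assuming a fixed step size from the spectral norm of X):

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: elementwise soft thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, steps=500):
    """Proximal gradient descent (ISTA) for the lasso:
    minimize 0.5 * ||y - X @ beta||**2 + lam * ||beta||_1."""
    L = np.linalg.norm(X, 2) ** 2     # Lipschitz constant of the smooth part
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ beta - y)   # gradient of the smooth part
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta

# With orthonormal X, the lasso solution is soft thresholding of X.T @ y:
print(ista(np.eye(3), np.array([3.0, 0.5, -2.0]), lam=1.0))  # [ 2.  0. -1.]
```

Other sparsity-inducing norms plug into the same scheme by swapping in their own proximal operators.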
Forward stagewise regression and the monotone lasso
We consider the least angle regression and forward stagewise algorithms for
solving penalized least squares regression problems. In Efron, Hastie,
Johnstone & Tibshirani (2004) it is proved that the least angle regression
algorithm, with a small modification, solves the lasso regression problem. Here
we give an analogous result for incremental forward stagewise regression,
showing that it solves a version of the lasso problem that enforces
monotonicity. One consequence of this is as follows: while lasso makes optimal
progress in terms of reducing the residual sum-of-squares per unit increase in
ℓ1-norm of the coefficient vector β, forward stagewise is optimal per unit ℓ1
arc-length traveled along the coefficient path. We also study a condition
under which the coefficient paths of the lasso are monotone, and hence the
different algorithms coincide. Finally, we compare the lasso and forward
stagewise procedures in a simulation study involving a large number of
correlated predictors.
Comment: Published at http://dx.doi.org/10.1214/07-EJS004 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
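The incremental forward stagewise procedure analyzed above takes tiny steps in the coefficient most correlated with the residual. A minimal sketch (step size and iteration count are illustrative):

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, steps=2000):
    """Incremental forward stagewise regression: repeatedly find the
    predictor most correlated with the current residual and nudge its
    coefficient by +/- eps, tracing out a monotone coefficient path."""
    beta = np.zeros(X.shape[1])
    resid = y.astype(float).copy()
    for _ in range(steps):
        corr = X.T @ resid
        j = int(np.argmax(np.abs(corr)))
        delta = eps * np.sign(corr[j])
        beta[j] += delta
        resid -= delta * X[:, j]
    return beta
```

As eps shrinks toward zero, this path approaches the monotone lasso solution discussed in the abstract.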
A Convex Relaxation for Weakly Supervised Classifiers
This paper introduces a general multi-class approach to weakly supervised
classification. Inferring the labels and learning the parameters of the model
is usually done jointly through a block-coordinate descent algorithm such as
expectation-maximization (EM), which may lead to local minima. To avoid this
problem, we propose a cost function based on a convex relaxation of the
soft-max loss. We then propose an algorithm specifically designed to
efficiently solve the corresponding semidefinite program (SDP). Empirically,
our method compares favorably to standard ones on different datasets for
multiple instance learning and semi-supervised learning as well as on
clustering tasks.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012).
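The jointly non-convex scheme this paper improves on can be sketched as block-coordinate descent ("hard EM") on a simple two-cluster problem; this is a hypothetical minimal example of the baseline, not the paper's SDP relaxation:

```python
import numpy as np

def hard_em_two_means(X, steps=20, seed=0):
    """Block-coordinate descent ('hard EM'): alternate between inferring
    labels given the parameters and re-fitting the two centroids given
    the labels. The joint objective is non-convex, so this can stop at a
    local minimum -- the failure mode a convex relaxation avoids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=2, replace=False)].astype(float)
    for _ in range(steps):
        # E-like step: infer labels given the current parameters
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # M-like step: re-fit parameters given the inferred labels
        for k in range(2):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels
```

The paper's approach instead lifts the label assignment into a single convex semidefinite program, so the solution no longer depends on initialization.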