Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present, from a
general perspective, optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted-l2 penalized techniques, working-set and homotopy methods,
as well as non-convex formulations and extensions, and we provide an extensive
set of experiments comparing the various algorithms from a computational point
of view.
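As a concrete instance of the proximal-method family surveyed above, here is a minimal sketch of proximal gradient descent (ISTA) for the l1-penalized least-squares problem min_w 0.5*||Xw - y||^2 + lam*||w||_1; the function names and toy data are illustrative, not taken from the paper.

import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: coordinate-wise soft-thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iter=500):
    n, d = X.shape
    w = np.zeros(d)
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the loss gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)             # gradient of the least-squares loss
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Toy usage: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.01 * rng.standard_normal(100)
print(np.round(ista(X, y, lam=1.0), 2))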
Smoothing Proximal Gradient Method for General Structured Sparse Learning
We study the problem of learning high dimensional regression models
regularized by a structured-sparsity-inducing penalty that encodes prior
structural information on either input or output sides. We consider two widely
adopted types of such penalties as our motivating examples: 1) the overlapping
group-lasso penalty, based on the l1/l2 mixed norm, and 2) the graph-guided
fusion penalty. For both types of penalties, due to their non-separability,
developing an efficient optimization method has remained a challenging problem.
In this paper, we propose a general optimization approach, called smoothing
proximal gradient method, which can solve the structured sparse regression
problems with a smooth convex loss and a wide spectrum of
structured-sparsity-inducing penalties. Our approach is based on a general
smoothing technique of Nesterov. It achieves a convergence rate faster than
that of the standard first-order method, the subgradient method, and is much
more scalable than the most widely used interior-point method. Numerical
results are reported to demonstrate the efficiency and scalability of the
proposed method.
Comment: arXiv admin note: substantial text overlap with arXiv:1005.471
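To make the smoothing idea concrete, the sketch below illustrates Nesterov-style smoothing of the overlapping group-lasso penalty: writing sum_g ||beta_g||_2 as a maximization over auxiliary variables and subtracting a (mu/2)*||alpha||^2 term yields a differentiable surrogate whose gradient is a group-wise projection. Function and variable names are illustrative assumptions, not the authors' implementation.

import numpy as np

def smoothed_group_penalty_grad(beta, groups, mu):
    # Value and gradient of the mu-smoothed overlapping group-lasso penalty
    # sum_g max_{||a_g||_2 <= 1} (a_g . beta_g - (mu/2) * ||a_g||^2).
    value = 0.0
    grad = np.zeros_like(beta)
    for g in groups:                          # index lists; groups may overlap
        a = beta[g] / mu                      # unconstrained maximizer
        norm_a = np.linalg.norm(a)
        if norm_a > 1.0:                      # project back onto the unit l2 ball
            a = a / norm_a
        value += a @ beta[g] - 0.5 * mu * (a @ a)
        grad[g] += a                          # overlapping groups accumulate
    return value, grad

def spg_step(beta, X, y, lam, groups, mu, step):
    # One (un-accelerated) smoothing proximal gradient step for
    # 0.5 * ||X beta - y||^2 + lam * Omega_mu(beta).
    _, pen_grad = smoothed_group_penalty_grad(beta, groups, mu)
    grad = X.T @ (X @ beta - y) + lam * pen_grad
    return beta - step * grad                 # apply a prox here if an extra separable l1 term is used

In the accelerated variant described in the paper, the smoothing parameter mu is tied to the desired accuracy, which is what gives the improved convergence rate over plain subgradient methods.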
Smoothing proximal gradient method for general structured sparse regression
We study the problem of estimating high-dimensional regression models
regularized by a structured sparsity-inducing penalty that encodes prior
structural information on either the input or output variables. We consider two
widely adopted types of penalties of this kind as motivating examples: (1) the
general overlapping-group-lasso penalty, generalized from the group-lasso
penalty; and (2) the graph-guided-fused-lasso penalty, generalized from the
fused-lasso penalty. For both types of penalties, due to their nonseparability
and nonsmoothness, developing an efficient optimization method remains a
challenging problem. In this paper we propose a general optimization approach,
the smoothing proximal gradient (SPG) method, which can solve structured sparse
regression problems with any smooth convex loss under a wide spectrum of
structured sparsity-inducing penalties. Our approach combines a smoothing
technique with an effective proximal gradient method. It achieves a convergence
rate significantly faster than that of standard first-order methods such as the
subgradient method, and it is much more scalable than the most widely used
interior-point methods. The efficiency and scalability of our method are demonstrated on both
simulation experiments and real genetic data sets.
Comment: Published at http://dx.doi.org/10.1214/11-AOAS514 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
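For the second motivating penalty, a small hedged sketch of the graph-guided fused-lasso construction may help: given an edge set E with correlation weights r_ml, the penalty sum over (m,l) in E of |r_ml| * |beta_m - sign(r_ml) * beta_l| can be written as ||C beta||_1 for a sparse edge-incidence matrix C, which is then smoothed exactly as in the group-lasso case. The helper name and toy edges are assumptions for illustration.

import numpy as np

def edge_incidence(d, edges):
    # Build C so the graph-guided fusion penalty equals ||C @ beta||_1.
    # `edges` is a list of (m, l, r_ml) tuples.
    C = np.zeros((len(edges), d))
    for row, (m, l, r) in enumerate(edges):
        C[row, m] = abs(r)
        C[row, l] = -abs(r) * np.sign(r)
    return C

# Toy usage: one positively and one negatively correlated edge.
beta = np.array([1.0, 1.1, -2.0])
edges = [(0, 1, 0.9), (1, 2, -0.7)]
C = edge_incidence(beta.size, edges)
print(np.abs(C @ beta).sum())                 # penalty value ||C beta||_1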
The Convergence Guarantees of a Non-convex Approach for Sparse Recovery
In the area of sparse recovery, numerous studies suggest that non-convex
penalties may induce better sparsity than convex ones, but until now the
corresponding non-convex algorithms have lacked convergence guarantees from the
initial solution to the global optimum. This paper aims to provide performance
guarantees of a non-convex approach for sparse recovery. Specifically, the
concept of weak convexity is incorporated into a class of sparsity-inducing
penalties to characterize the non-convexity. Borrowing the idea of the
projected subgradient method, an algorithm is proposed to solve the non-convex
optimization problem. In addition, a uniform approximate projection is adopted
in the projection step to make this algorithm computationally tractable for
large scale problems. The convergence analysis is provided in the noisy
scenario. It is shown that if the non-convexity of the penalty is below a
threshold (which is inversely proportional to the distance between the initial
solution and the sparse signal), the recovery error of the solution is linear
in both the step size and the noise term. Numerical simulations are carried out
to test the performance of the proposed approach and to verify the theoretical
analysis.
Comment: 33 pages, 7 figures
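As a rough illustration of the algorithmic idea (not the paper's exact method), the sketch below alternates a subgradient step on a weakly convex sparsity penalty with an approximate projection back onto the measurement-consistency set {x : ||Ax - y||_2 <= eps}; the specific penalty, step size, and projection used here are assumptions for illustration only.

import numpy as np

def weak_convex_subgrad(x, a):
    # Subgradient of the weakly convex penalty sum_i (1/a) * log(1 + a*|x_i|).
    return np.sign(x) / (1.0 + a * np.abs(x))

def approx_project(x, A, A_pinv, y, eps):
    # Pull x toward {x : ||A x - y||_2 <= eps} (exact when A has orthonormal rows).
    r = A @ x - y
    nr = np.linalg.norm(r)
    if nr <= eps:
        return x
    return x - A_pinv @ (r * (1.0 - eps / nr))

def projected_subgradient(A, y, eps, a=5.0, step=0.05, n_iter=300):
    A_pinv = np.linalg.pinv(A)
    x = A_pinv @ y                            # least-norm feasible starting point
    for _ in range(n_iter):
        x = x - step * weak_convex_subgrad(x, a)
        x = approx_project(x, A, A_pinv, y, eps)
    return x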