1,269 research outputs found
A Fast Active Set Block Coordinate Descent Algorithm for -regularized least squares
The problem of finding sparse solutions to underdetermined systems of linear
equations arises in several applications (e.g. signal and image processing,
compressive sensing, statistical inference). A standard tool for dealing with
sparse recovery is the -regularized least-squares approach that has
been recently attracting the attention of many researchers. In this paper, we
describe an active set estimate (i.e. an estimate of the indices of the zero
variables in the optimal solution) for the considered problem that tries to
quickly identify as many active variables as possible at a given point, while
guaranteeing that some approximate optimality conditions are satisfied. A
relevant feature of the estimate is that it gives a significant reduction of
the objective function when setting to zero all those variables estimated
active. This enables to easily embed it into a given globally converging
algorithmic framework. In particular, we include our estimate into a block
coordinate descent algorithm for -regularized least squares, analyze
the convergence properties of this new active set method, and prove that its
basic version converges with linear rate. Finally, we report some numerical
results showing the effectiveness of the approach.Comment: 28 pages, 5 figure
Super-Linear Convergence of Dual Augmented-Lagrangian Algorithm for Sparsity Regularized Estimation
We analyze the convergence behaviour of a recently proposed algorithm for
regularized estimation called Dual Augmented Lagrangian (DAL). Our analysis is
based on a new interpretation of DAL as a proximal minimization algorithm. We
theoretically show under some conditions that DAL converges super-linearly in a
non-asymptotic and global sense. Due to a special modelling of sparse
estimation problems in the context of machine learning, the assumptions we make
are milder and more natural than those made in conventional analysis of
augmented Lagrangian algorithms. In addition, the new interpretation enables us
to generalize DAL to wide varieties of sparse estimation problems. We
experimentally confirm our analysis in a large scale -regularized
logistic regression problem and extensively compare the efficiency of DAL
algorithm to previously proposed algorithms on both synthetic and benchmark
datasets.Comment: 51 pages, 9 figure
A second derivative SQP method: theoretical issues
Sequential quadratic programming (SQP) methods form a class of highly efficient algorithms for solving nonlinearly constrained optimization problems. Although second derivative information may often be calculated, there is little practical theory that justifies exact-Hessian SQP methods. In particular, the resulting quadratic programming (QP) subproblems are often nonconvex, and thus finding their global solutions may be computationally nonviable. This paper presents a second-derivative SQP method based on quadratic subproblems that are either convex, and thus may be solved efficiently, or need not be solved globally. Additionally, an explicit descent-constraint is imposed on certain QP subproblems, which “guides” the iterates through areas in which nonconvexity is a concern. Global convergence of the resulting algorithm is established
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted -penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view
Low Complexity Regularization of Linear Inverse Problems
Inverse problems and regularization theory is a central theme in contemporary
signal processing, where the goal is to reconstruct an unknown signal from
partial indirect, and possibly noisy, measurements of it. A now standard method
for recovering the unknown signal is to solve a convex optimization problem
that enforces some prior knowledge about its structure. This has proved
efficient in many problems routinely encountered in imaging sciences,
statistics and machine learning. This chapter delivers a review of recent
advances in the field where the regularization prior promotes solutions
conforming to some notion of simplicity/low-complexity. These priors encompass
as popular examples sparsity and group sparsity (to capture the compressibility
of natural signals and images), total variation and analysis sparsity (to
promote piecewise regularity), and low-rank (as natural extension of sparsity
to matrix-valued data). Our aim is to provide a unified treatment of all these
regularizations under a single umbrella, namely the theory of partial
smoothness. This framework is very general and accommodates all low-complexity
regularizers just mentioned, as well as many others. Partial smoothness turns
out to be the canonical way to encode low-dimensional models that can be linear
spaces or more general smooth manifolds. This review is intended to serve as a
one stop shop toward the understanding of the theoretical properties of the
so-regularized solutions. It covers a large spectrum including: (i) recovery
guarantees and stability to noise, both in terms of -stability and
model (manifold) identification; (ii) sensitivity analysis to perturbations of
the parameters involved (in particular the observations), with applications to
unbiased risk estimation ; (iii) convergence properties of the forward-backward
proximal splitting scheme, that is particularly well suited to solve the
corresponding large-scale regularized optimization problem
A non-adapted sparse approximation of PDEs with stochastic inputs
We propose a method for the approximation of solutions of PDEs with
stochastic coefficients based on the direct, i.e., non-adapted, sampling of
solutions. This sampling can be done by using any legacy code for the
deterministic problem as a black box. The method converges in probability (with
probabilistic error bounds) as a consequence of sparsity and a concentration of
measure phenomenon on the empirical correlation between samples. We show that
the method is well suited for truly high-dimensional problems (with slow decay
in the spectrum)
- …