283 research outputs found
A Proximal-Gradient Homotopy Method for the Sparse Least-Squares Problem
We consider solving the -regularized least-squares (-LS)
problem in the context of sparse recovery, for applications such as compressed
sensing. The standard proximal gradient method, also known as iterative
soft-thresholding when applied to this problem, has low computational cost per
iteration but a rather slow convergence rate. Nevertheless, when the solution
is sparse, it often exhibits fast linear convergence in the final stage. We
exploit the local linear convergence using a homotopy continuation strategy,
i.e., we solve the -LS problem for a sequence of decreasing values of
the regularization parameter, and use an approximate solution at the end of
each stage to warm start the next stage. Although similar strategies have been
studied in the literature, there have been no theoretical analysis of their
global iteration complexity. This paper shows that under suitable assumptions
for sparse recovery, the proposed homotopy strategy ensures that all iterates
along the homotopy solution path are sparse. Therefore the objective function
is effectively strongly convex along the solution path, and geometric
convergence at each stage can be established. As a result, the overall
iteration complexity of our method is for finding an
-optimal solution, which can be interpreted as global geometric rate
of convergence. We also present empirical results to support our theoretical
analysis
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted -penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view
Iterative Soft/Hard Thresholding with Homotopy Continuation for Sparse Recovery
In this note, we analyze an iterative soft / hard thresholding algorithm with
homotopy continuation for recovering a sparse signal from noisy data
of a noise level . Under suitable regularity and sparsity conditions,
we design a path along which the algorithm can find a solution which
admits a sharp reconstruction error with an iteration complexity , where and are problem dimensionality and
controls the length of the path. Numerical examples are given to illustrate its
performance.Comment: 5 pages, 4 figure
Let's Make Block Coordinate Descent Go Fast: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence
Block coordinate descent (BCD) methods are widely-used for large-scale
numerical optimization because of their cheap iteration costs, low memory
requirements, amenability to parallelization, and ability to exploit problem
structure. Three main algorithmic choices influence the performance of BCD
methods: the block partitioning strategy, the block selection rule, and the
block update rule. In this paper we explore all three of these building blocks
and propose variations for each that can lead to significantly faster BCD
methods. We (i) propose new greedy block-selection strategies that guarantee
more progress per iteration than the Gauss-Southwell rule; (ii) explore
practical issues like how to implement the new rules when using "variable"
blocks; (iii) explore the use of message-passing to compute matrix or Newton
updates efficiently on huge blocks for problems with a sparse dependency
between variables; and (iv) consider optimal active manifold identification,
which leads to bounds on the "active set complexity" of BCD methods and leads
to superlinear convergence for certain problems with sparse solutions (and in
some cases finite termination at an optimal solution). We support all of our
findings with numerical results for the classic machine learning problems of
least squares, logistic regression, multi-class logistic regression, label
propagation, and L1-regularization
Low Complexity Regularization of Linear Inverse Problems
Inverse problems and regularization theory is a central theme in contemporary
signal processing, where the goal is to reconstruct an unknown signal from
partial indirect, and possibly noisy, measurements of it. A now standard method
for recovering the unknown signal is to solve a convex optimization problem
that enforces some prior knowledge about its structure. This has proved
efficient in many problems routinely encountered in imaging sciences,
statistics and machine learning. This chapter delivers a review of recent
advances in the field where the regularization prior promotes solutions
conforming to some notion of simplicity/low-complexity. These priors encompass
as popular examples sparsity and group sparsity (to capture the compressibility
of natural signals and images), total variation and analysis sparsity (to
promote piecewise regularity), and low-rank (as natural extension of sparsity
to matrix-valued data). Our aim is to provide a unified treatment of all these
regularizations under a single umbrella, namely the theory of partial
smoothness. This framework is very general and accommodates all low-complexity
regularizers just mentioned, as well as many others. Partial smoothness turns
out to be the canonical way to encode low-dimensional models that can be linear
spaces or more general smooth manifolds. This review is intended to serve as a
one stop shop toward the understanding of the theoretical properties of the
so-regularized solutions. It covers a large spectrum including: (i) recovery
guarantees and stability to noise, both in terms of -stability and
model (manifold) identification; (ii) sensitivity analysis to perturbations of
the parameters involved (in particular the observations), with applications to
unbiased risk estimation ; (iii) convergence properties of the forward-backward
proximal splitting scheme, that is particularly well suited to solve the
corresponding large-scale regularized optimization problem
Computational Methods for Sparse Solution of Linear Inverse Problems
The goal of the sparse approximation problem is to approximate a target signal using a linear combination of a few elementary signals drawn from a fixed collection. This paper surveys the major practical algorithms for sparse approximation. Specific attention is paid to computational issues, to the circumstances in which individual methods tend to perform well, and to the theoretical guarantees available. Many fundamental questions in electrical engineering, statistics, and applied mathematics can be posed as sparse approximation problems, making these algorithms versatile and relevant to a plethora of applications
- …