Successive Convex Approximation Algorithms for Sparse Signal Estimation with Nonconvex Regularizations
In this paper, we propose a successive convex approximation framework for
sparse optimization where the nonsmooth regularization function in the
objective function is nonconvex and it can be written as the difference of two
convex functions. The proposed framework is based on a nontrivial combination
of the majorization-minimization framework and the successive convex
approximation framework proposed in the literature for a convex regularization
function. The proposed framework has several attractive features, namely, i)
flexibility, as different choices of the approximate function lead to different
types of algorithms; ii) fast convergence, as the problem structure can be
better exploited by a proper choice of the approximate function and the
stepsize is calculated by the line search; iii) low complexity, as the
approximate function is convex and the line search scheme is carried out over a
differentiable function; iv) guaranteed convergence to a stationary point. We
demonstrate these features by two example applications in subspace learning,
namely, the network anomaly detection problem and the sparse subspace
clustering problem. Customizing the proposed framework by adopting the
best-response type approximation, we obtain soft-thresholding with exact line
search algorithms for which all elements of the unknown parameter are updated
in parallel according to closed-form expressions. The attractive features of
the proposed algorithms are illustrated numerically.
Comment: submitted to IEEE Journal of Selected Topics in Signal Processing,
special issue on Robust Subspace Learning.
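The "soft-thresholding with exact line search" update mentioned in this abstract can be sketched for the simpler convex case, minimizing 0.5*||Ax - b||^2 + mu*||x||_1. This is a minimal illustration under stated assumptions, not the paper's algorithm for nonconvex regularizers: the function names are hypothetical, the regularizer is the plain l_1 norm, and every column of A is assumed to be nonzero.

```python
def soft_threshold(v, t):
    """Scalar soft-thresholding: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(A, x):
    """Dense matrix-vector product A x, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def objective(A, b, x, mu):
    """0.5 * ||Ax - b||^2 + mu * ||x||_1."""
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return 0.5 * sum(ri * ri for ri in r) + mu * sum(abs(xi) for xi in x)


def parallel_soft_threshold_step(A, b, x, mu):
    """One iteration: parallel best response, then a closed-form step size.

    Hypothetical helper for the convex l_1 case; assumes nonzero columns in A.
    """
    m, n = len(A), len(x)
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]               # residual Ax - b
    grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]  # A^T r
    d = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]      # squared column norms
    # Best response: every coordinate is updated in parallel, in closed form.
    bx = [soft_threshold(dj * xj - gj, mu) / dj for dj, xj, gj in zip(d, x, grad)]
    direction = [bj - xj for bj, xj in zip(bx, x)]
    a_dir = matvec(A, direction)
    # Exact line search over gamma in [0, 1] on a differentiable upper bound
    # in which the l_1 term is replaced by its linear interpolation.
    l1_change = sum(abs(v) for v in bx) - sum(abs(v) for v in x)
    numer = -(sum(ri * vi for ri, vi in zip(r, a_dir)) + mu * l1_change)
    denom = sum(v * v for v in a_dir)
    if denom == 0.0:
        gamma = 1.0 if numer > 0.0 else 0.0
    else:
        gamma = min(1.0, max(0.0, numer / denom))
    return [xj + gamma * sj for xj, sj in zip(x, direction)]
```

Because the step size minimizes an upper bound that equals the current objective at gamma = 0, the objective value cannot increase from one iteration to the next in this convex setting.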
Inexact Block Coordinate Descent Algorithms for Nonsmooth Nonconvex Optimization
In this paper, we propose an inexact block coordinate descent algorithm for
large-scale nonsmooth nonconvex optimization problems. At each iteration, a
particular block variable is selected and updated by inexactly solving the
original optimization problem with respect to that block variable. More
precisely, a local approximation of the original optimization problem is
solved. The proposed algorithm has several attractive features, namely, i) high
flexibility, as the approximation function only needs to be strictly convex and
it does not have to be a global upper bound of the original function; ii) fast
convergence, as the approximation function can be designed to exploit the
problem structure at hand and the stepsize is calculated by the line search;
iii) low complexity, as the approximation subproblems are much easier to solve
and the line search scheme is carried out over a properly constructed
differentiable function; iv) guaranteed convergence of a subsequence to a
stationary point, even when the objective function does not have a Lipschitz
continuous gradient. Interestingly, when the approximation subproblem is solved
by a descent algorithm, convergence of a subsequence to a stationary point is
still guaranteed even if the approximation subproblem is solved inexactly by
terminating the descent algorithm after a finite number of iterations. These
features make the proposed algorithm suitable for large-scale problems where
the dimension exceeds the memory and/or the processing capability of the
existing hardware. These features are also illustrated by several applications
in signal processing and machine learning, for instance, network anomaly
detection and phase retrieval.
Successive Concave Sparsity Approximation for Compressed Sensing
In this paper, based on a successively accuracy-increasing approximation of
the l_0 norm, we propose a new algorithm for recovery of sparse vectors
from underdetermined measurements. The approximations are realized with a
certain class of concave functions that aggressively induce sparsity and whose
closeness to the l_0 norm can be controlled. We prove that the series of
the approximations asymptotically coincides with the l_1 and l_0
norms when the approximation accuracy changes from the worst fitting to the
best fitting. When measurements are noise-free, an optimization scheme is
proposed which leads to a number of weighted l_1 minimization programs,
whereas, in the presence of noise, we propose two iterative thresholding
methods that are computationally appealing. A convergence guarantee for the
iterative thresholding method is provided, and, for a particular function in
the class of the approximating functions, we derive the closed-form
thresholding operator. We further present some theoretical analyses via the
restricted isometry, null space, and spherical section properties. Our
extensive numerical simulations indicate that the proposed algorithm closely
follows the performance of the oracle estimator for a range of sparsity levels
wider than those of the state-of-the-art algorithms.
Comment: Submitted to IEEE Trans. on Signal Processing.
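One standard way a concave sparsity surrogate leads to a sequence of weighted l_1 programs is by linearizing the concave function at the current iterate. The derivation below is a generic sketch, not necessarily the scheme of this paper: the symbols f_sigma, w_i^k, A, and b are assumed notation, and f_sigma is taken to be concave, nondecreasing, and differentiable on (0, infinity).

```latex
% Surrogate for the l_0 norm with accuracy parameter \sigma (assumed notation):
\|x\|_0 \approx \sum_{i=1}^{n} f_\sigma(|x_i|)

% Concavity gives a linear upper bound at the current iterate x^k:
f_\sigma(|x_i|) \le f_\sigma(|x_i^k|) + f_\sigma'(|x_i^k|)\,\bigl(|x_i| - |x_i^k|\bigr)

% Minimizing the bound subject to the noise-free measurements Ax = b
% is a weighted l_1 program:
x^{k+1} = \arg\min_{x} \sum_{i=1}^{n} w_i^k\,|x_i|
\quad \text{s.t.} \quad Ax = b,
\qquad w_i^k = f_\sigma'(|x_i^k|)
```

Since f_sigma is concave, its derivative is nonincreasing, so entries that are already large receive small weights and are penalized less in the next program.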
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted l_2-penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
Convex and Network Flow Optimization for Structured Sparsity
We consider a class of learning problems regularized by a structured
sparsity-inducing norm defined as the sum of l_2- or l_infinity-norms over
groups of variables. Whereas much effort has been put in developing fast
optimization techniques when the groups are disjoint or embedded in a
hierarchy, we address here the case of general overlapping groups. To this end,
we present two different strategies: On the one hand, we show that the proximal
operator associated with a sum of l_infinity-norms can be computed exactly in
polynomial time by solving a quadratic min-cost flow problem, allowing the use
of accelerated proximal gradient methods. On the other hand, we use proximal
splitting techniques, and address an equivalent formulation with
non-overlapping groups, but in higher dimension and with additional
constraints. We propose efficient and scalable algorithms exploiting these two
strategies, which are significantly faster than alternative approaches. We
illustrate these methods with several problems such as CUR matrix
factorization, multi-task learning of tree-structured dictionaries, background
subtraction in video sequences, image denoising with wavelets, and topographic
dictionary learning of natural image patches.
Comment: to appear in the Journal of Machine Learning Research (JMLR).
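For context, when the groups are disjoint, the proximal operator of a sum of l_2-norms has a well-known closed form (block soft-thresholding). The sketch below illustrates only that easy case, which the abstract contrasts with general overlapping groups; it is not the min-cost flow method of the paper, and the function names are hypothetical.

```python
import math


def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_2 for a single group v."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm <= lam:
        # The whole group is set to zero at once.
        return [0.0] * len(v)
    scale = 1.0 - lam / norm
    return [scale * vi for vi in v]


def prox_group_l2(x, groups, lam):
    """Prox of lam * sum_g ||x_g||_2, assuming the index groups are disjoint."""
    out = list(x)
    for g in groups:
        shrunk = group_soft_threshold([x[i] for i in g], lam)
        for i, val in zip(g, shrunk):
            out[i] = val
    return out
```

For example, `prox_group_l2([3.0, 4.0, 0.1, 0.1], [[0, 1], [2, 3]], 1.0)` scales the first group (norm 5) by 0.8 and zeroes the second group, whose norm is below the threshold. With overlapping groups the per-group updates interact, which is why no such separable formula applies there.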