364 research outputs found
Low Complexity Regularization of Linear Inverse Problems
Inverse problems and regularization theory is a central theme in contemporary
signal processing, where the goal is to reconstruct an unknown signal from
partial indirect, and possibly noisy, measurements of it. A now standard method
for recovering the unknown signal is to solve a convex optimization problem
that enforces some prior knowledge about its structure. This has proved
efficient in many problems routinely encountered in imaging sciences,
statistics and machine learning. This chapter delivers a review of recent
advances in the field where the regularization prior promotes solutions
conforming to some notion of simplicity/low-complexity. These priors encompass
as popular examples sparsity and group sparsity (to capture the compressibility
of natural signals and images), total variation and analysis sparsity (to
promote piecewise regularity), and low-rank (as natural extension of sparsity
to matrix-valued data). Our aim is to provide a unified treatment of all these
regularizations under a single umbrella, namely the theory of partial
smoothness. This framework is very general and accommodates all low-complexity
regularizers just mentioned, as well as many others. Partial smoothness turns
out to be the canonical way to encode low-dimensional models that can be linear
spaces or more general smooth manifolds. This review is intended to serve as a
one stop shop toward the understanding of the theoretical properties of the
so-regularized solutions. It covers a large spectrum including: (i) recovery
guarantees and stability to noise, both in terms of -stability and
model (manifold) identification; (ii) sensitivity analysis to perturbations of
the parameters involved (in particular the observations), with applications to
unbiased risk estimation ; (iii) convergence properties of the forward-backward
proximal splitting scheme, that is particularly well suited to solve the
corresponding large-scale regularized optimization problem
A D.C. Programming Approach to the Sparse Generalized Eigenvalue Problem
In this paper, we consider the sparse eigenvalue problem wherein the goal is
to obtain a sparse solution to the generalized eigenvalue problem. We achieve
this by constraining the cardinality of the solution to the generalized
eigenvalue problem and obtain sparse principal component analysis (PCA), sparse
canonical correlation analysis (CCA) and sparse Fisher discriminant analysis
(FDA) as special cases. Unlike the -norm approximation to the
cardinality constraint, which previous methods have used in the context of
sparse PCA, we propose a tighter approximation that is related to the negative
log-likelihood of a Student's t-distribution. The problem is then framed as a
d.c. (difference of convex functions) program and is solved as a sequence of
convex programs by invoking the majorization-minimization method. The resulting
algorithm is proved to exhibit \emph{global convergence} behavior, i.e., for
any random initialization, the sequence (subsequence) of iterates generated by
the algorithm converges to a stationary point of the d.c. program. The
performance of the algorithm is empirically demonstrated on both sparse PCA
(finding few relevant genes that explain as much variance as possible in a
high-dimensional gene dataset) and sparse CCA (cross-language document
retrieval and vocabulary selection for music retrieval) applications.Comment: 40 page
Fast Low-rank Representation based Spatial Pyramid Matching for Image Classification
Spatial Pyramid Matching (SPM) and its variants have achieved a lot of
success in image classification. The main difference among them is their
encoding schemes. For example, ScSPM incorporates Sparse Code (SC) instead of
Vector Quantization (VQ) into the framework of SPM. Although the methods
achieve a higher recognition rate than the traditional SPM, they consume more
time to encode the local descriptors extracted from the image. In this paper,
we propose using Low Rank Representation (LRR) to encode the descriptors under
the framework of SPM. Different from SC, LRR considers the group effect among
data points instead of sparsity. Benefiting from this property, the proposed
method (i.e., LrrSPM) can offer a better performance. To further improve the
generalizability and robustness, we reformulate the rank-minimization problem
as a truncated projection problem. Extensive experimental studies show that
LrrSPM is more efficient than its counterparts (e.g., ScSPM) while achieving
competitive recognition rates on nine image data sets.Comment: accepted into knowledge based systems, 201
Kernel Truncated Regression Representation for Robust Subspace Clustering
Subspace clustering aims to group data points into multiple clusters of which
each corresponds to one subspace. Most existing subspace clustering approaches
assume that input data lie on linear subspaces. In practice, however, this
assumption usually does not hold. To achieve nonlinear subspace clustering, we
propose a novel method, called kernel truncated regression representation. Our
method consists of the following four steps: 1) projecting the input data into
a hidden space, where each data point can be linearly represented by other data
points; 2) calculating the linear representation coefficients of the data
representations in the hidden space; 3) truncating the trivial coefficients to
achieve robustness and block-diagonality; and 4) executing the graph cutting
operation on the coefficient matrix by solving a graph Laplacian problem. Our
method has the advantages of a closed-form solution and the capacity of
clustering data points that lie on nonlinear subspaces. The first advantage
makes our method efficient in handling large-scale datasets, and the second one
enables the proposed method to conquer the nonlinear subspace clustering
challenge. Extensive experiments on six benchmarks demonstrate the
effectiveness and the efficiency of the proposed method in comparison with
current state-of-the-art approaches.Comment: 14 page
DC Proximal Newton for Non-Convex Optimization Problems
We introduce a novel algorithm for solving learning problems where both the
loss function and the regularizer are non-convex but belong to the class of
difference of convex (DC) functions. Our contribution is a new general purpose
proximal Newton algorithm that is able to deal with such a situation. The
algorithm consists in obtaining a descent direction from an approximation of
the loss function and then in performing a line search to ensure sufficient
descent. A theoretical analysis is provided showing that the iterates of the
proposed algorithm {admit} as limit points stationary points of the DC
objective function. Numerical experiments show that our approach is more
efficient than current state of the art for a problem with a convex loss
functions and non-convex regularizer. We have also illustrated the benefit of
our algorithm in high-dimensional transductive learning problem where both loss
function and regularizers are non-convex
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted -penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view
- …