Global and Quadratic Convergence of Newton Hard-Thresholding Pursuit
Algorithms based on the hard-thresholding principle have been well studied,
with sound theoretical guarantees, in compressed sensing and in more general
sparsity-constrained optimization. Empirical studies widely observe that when
a restricted Newton step is used as the debiasing step, hard-thresholding
algorithms meet their halting conditions in significantly fewer iterations and
are very efficient. The resulting Newton hard-thresholding algorithms
therefore call for stronger theoretical guarantees than their simple
hard-thresholding counterparts. This paper
provides a theoretical justification for the use of the restricted Newton step.
We build our theory and algorithm, Newton Hard-Thresholding Pursuit (NHTP),
for sparsity-constrained optimization. Our main result shows that NHTP is
quadratically convergent under the standard assumption of restricted strong
convexity and smoothness. We also establish its global convergence to a
stationary point under a weaker assumption. In the special case of
compressive sensing, NHTP effectively reduces to some existing
hard-thresholding algorithms with a Newton step. Consequently, our fast
convergence result explains why those algorithms perform better with the
Newton step than without it. The efficiency of NHTP is demonstrated on both
synthetic and real data in compressed sensing and sparse logistic regression.
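To make the hard-thresholding-plus-Newton pattern concrete in the compressed
sensing case, the following NumPy sketch alternates a gradient step, hard
thresholding to the s largest entries, and a restricted Newton step; for a
quadratic loss, the Newton step on the selected support is exactly restricted
least squares. The function name, step size and stopping rule are illustrative
assumptions, not the paper's reference implementation.

import numpy as np

def nhtp_sketch(A, b, s, max_iter=100, step=1.0, tol=1e-10):
    # Hard-thresholding pursuit with a restricted Newton (debiasing) step
    # for min ||Ax - b||^2 subject to ||x||_0 <= s. Illustrative sketch only.
    n = A.shape[1]
    x = np.zeros(n)
    for _ in range(max_iter):
        grad = A.T @ (A @ x - b)              # gradient of the quadratic loss
        u = x - step * grad                   # gradient (forward) step
        support = np.argsort(np.abs(u))[-s:]  # hard-threshold: keep s largest
        x_new = np.zeros(n)
        # Restricted Newton step: for a quadratic loss this is exact
        # least squares on the selected support.
        x_new[support] = np.linalg.lstsq(A[:, support], b, rcond=None)[0]
        if np.linalg.norm(x_new - x) < tol:   # halting condition
            return x_new
        x = x_new
    return x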
Activity Identification and Local Linear Convergence of Forward-Backward-type methods
In this paper, we consider a class of Forward-Backward (FB) splitting
methods that includes several variants (e.g. inertial schemes, FISTA) for
minimizing the sum of two proper, convex and lower semi-continuous functions,
one of which has a Lipschitz continuous gradient and the other of which is
partly smooth relative to a smooth active manifold. We propose a unified
framework under which we show that this class of FB-type algorithms
(i) correctly identifies the active manifolds in a finite number of iterations
(finite activity identification), and (ii) then enters a local linear
convergence regime, which we characterize precisely in terms of the structure
of the underlying active manifolds. For simpler problems involving polyhedral
functions, we show finite termination. We also establish and explain why FISTA
(with convergent sequences) locally oscillates and can be slower than FB. These
results may have numerous applications, including in signal/image processing,
sparse recovery and machine learning. Indeed, the obtained results explain the
typical behaviour that has been observed numerically for many problems in
these fields, such as the Lasso, the group Lasso, the fused Lasso and nuclear
norm regularization, to name only a few.
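To illustrate finite activity identification for the Lasso, where the active
manifold of the ℓ_1 norm is the set of vectors with a fixed support, the
sketch below runs FISTA and records the last iteration at which the support of
the iterate changed; under the framework above, this index stabilizes after
finitely many iterations. The function names and the fixed iteration budget
are assumptions made for the example.

import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (the backward step).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, max_iter=500):
    # FISTA for min 0.5*||Ax - b||^2 + lam*||x||_1, tracking when the
    # support (the active manifold of the l1 norm) stops changing.
    n = A.shape[1]
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = z = np.zeros(n)
    t = 1.0
    prev_support, identified_at = None, None
    for k in range(max_iter):
        grad = A.T @ (A @ z - b)                       # forward step
        x_new = soft_threshold(z - grad / L, lam / L)  # backward step
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)    # inertial extrapolation
        x, t = x_new, t_new
        support = tuple(np.flatnonzero(x))
        if support != prev_support:
            identified_at = k              # last iteration the support changed
        prev_support = support
    return x, identified_at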
Gaussian Mixtures Based IRLS for Sparse Recovery With Quadratic Convergence
In this paper, we propose a new class of iteratively
re-weighted least squares (IRLS) algorithms for sparse recovery problems.
The proposed methods are inspired by constrained maximum-likelihood
estimation under a Gaussian scale mixture (GSM) distribution
assumption. In the noise-free setting, we provide sufficient conditions
ensuring the convergence of the sequences generated by these algorithms to the
set of fixed points of the maps that govern their dynamics, and we derive
a-posteriori verifiable conditions for convergence to a sparse solution. We
further prove that these algorithms converge quadratically in a neighborhood
of a sparse solution.
We show through numerical experiments that the proposed methods outperform
classical IRLS for ℓ_p-minimization with p in (0, 1] in terms of both speed
and the sparsity-undersampling tradeoff, and are robust even in the presence
of noise. The simplicity and theoretical guarantees of this class of
algorithms make it an attractive solution for sparse recovery problems.
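For reference, the classical IRLS baseline for ℓ_p-minimization that the
proposed GSM-based methods are compared against can be sketched as follows:
each iteration solves a weighted least-squares problem subject to the
noise-free measurement constraint Ax = b, with weights derived from the
current iterate. This is a sketch of the baseline, not the paper's GSM-derived
algorithm; the function name and the smoothing schedule for eps are
assumptions.

import numpy as np

def irls_lp(A, b, p=0.5, eps=1.0, max_iter=50):
    # Classical IRLS for min sum_i |x_i|^p subject to Ax = b: each step is a
    # weighted least-squares solve with weights from the previous iterate.
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # least-norm initial point
    for _ in range(max_iter):
        q = (x ** 2 + eps) ** (1 - p / 2)      # inverse weights, diag of Q
        AQ = A * q                             # A @ diag(q), via broadcasting
        x = q * (A.T @ np.linalg.solve(AQ @ A.T, b))  # x = Q A^T (A Q A^T)^{-1} b
        eps = max(eps / 10.0, 1e-12)           # shrink the smoothing parameter
    return x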
A D.C. Programming Approach to the Sparse Generalized Eigenvalue Problem
In this paper, we consider the sparse eigenvalue problem wherein the goal is
to obtain a sparse solution to the generalized eigenvalue problem. We achieve
this by constraining the cardinality of the solution to the generalized
eigenvalue problem and obtain sparse principal component analysis (PCA), sparse
canonical correlation analysis (CCA) and sparse Fisher discriminant analysis
(FDA) as special cases. Unlike the ℓ_1-norm approximation to the
cardinality constraint, which previous methods have used in the context of
sparse PCA, we propose a tighter approximation that is related to the negative
log-likelihood of a Student's t-distribution. The problem is then framed as a
d.c. (difference of convex functions) program and is solved as a sequence of
convex programs by invoking the majorization-minimization method. The resulting
algorithm is proved to exhibit global convergence behavior, i.e., for
any random initialization, the sequence (subsequence) of iterates generated by
the algorithm converges to a stationary point of the d.c. program. The
performance of the algorithm is empirically demonstrated on both sparse PCA
(finding few relevant genes that explain as much variance as possible in a
high-dimensional gene dataset) and sparse CCA (cross-language document
retrieval and vocabulary selection for music retrieval) applications.
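As a hedged illustration of why the log-based surrogate is tighter than the
ℓ_1 relaxation, and of the majorization-minimization mechanics, the sketch
below compares the two approximations of the cardinality and computes the
weights produced by linearizing the concave surrogate at the current iterate,
which is what turns each MM step into a convex (reweighted-ℓ_1) program. The
normalization by log(1 + 1/eps) and the choice of eps are assumptions for the
example; the paper's exact surrogate may be parameterized differently.

import numpy as np

def cardinality_surrogates(x, eps=0.1):
    # Compare ||x||_0 with its l1 relaxation and with the log surrogate
    # sum_i log(1 + |x_i|/eps) / log(1 + 1/eps), tighter for small eps.
    l0 = np.count_nonzero(x)
    l1 = np.sum(np.abs(x))
    log_surr = np.sum(np.log1p(np.abs(x) / eps)) / np.log1p(1.0 / eps)
    return l0, l1, log_surr

def mm_weights(x, eps=0.1):
    # One majorization step: linearizing the concave log surrogate at the
    # current iterate x yields a weighted-l1 term with these weights (up to a
    # constant factor folded into the regularization parameter), so each MM
    # iteration reduces to a convex program.
    return 1.0 / (eps + np.abs(x))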