Parallel Selective Algorithms for Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable (possibly nonconvex) function and a (block) separable
nonsmooth, convex one. The latter term is usually employed to enforce structure
in the solution, typically sparsity. Our framework is very flexible and
includes both fully parallel Jacobi schemes and Gauss-Seidel (i.e.,
sequential) ones, as well as virtually all possibilities "in between" with only
a subset of variables updated at each iteration. Our theoretical convergence
results improve on existing ones, and numerical results on LASSO, logistic
regression, and some nonconvex quadratic problems show that the new method
consistently outperforms existing algorithms.
Comment: This work is an extended version of the conference paper presented at
IEEE ICASSP'14. The first and the second author contributed
equally to the paper. This revised version contains new numerical results on
non convex quadratic problem
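The "in between" idea described in this abstract can be sketched on a small LASSO problem. This is a minimal illustration, not the authors' algorithm: the function names, the greedy selection rule with threshold `rho`, and the use of the squared Frobenius norm as a step-size bound are all assumptions made for this example. Every coordinate gets a candidate update computed from the same iterate (Jacobi style), but only the coordinates whose candidate moves the most are actually changed.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lasso_objective(A, b, x, lam):
    """0.5*||Ax - b||^2 + lam*||x||_1 for a dense list-of-rows matrix A."""
    r = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def selective_jacobi_lasso(A, b, lam, rho=0.5, iters=200):
    """Update only the coordinates whose candidate step is at least
    rho times the largest candidate step (hypothetical selection rule)."""
    m, n = len(A), len(A[0])
    # Squared Frobenius norm upper-bounds the gradient Lipschitz constant.
    # Assumes A is not the zero matrix.
    L = sum(a * a for row in A for a in row)
    x = [0.0] * n
    history = [lasso_objective(A, b, x, lam)]
    for _ in range(iters):
        r = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Jacobi step: every candidate is computed from the same iterate x.
        x_hat = [soft_threshold(x[j] - grad[j] / L, lam / L) for j in range(n)]
        err = [abs(x_hat[j] - x[j]) for j in range(n)]
        max_err = max(err)
        if max_err == 0.0:
            break  # x is a fixed point of the proximal-gradient map
        # Selective step: only the coordinates with a large candidate move change.
        for j in range(n):
            if err[j] >= rho * max_err:
                x[j] = x_hat[j]
        history.append(lasso_objective(A, b, x, lam))
    return x, history
```

Setting `rho=0` updates every coordinate (a fully parallel scheme), while `rho=1` updates only the coordinates attaining the largest move; values in between give the partial updates the abstract refers to.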
Euclid in a Taxicab: Sparse Blind Deconvolution with Smoothed l1/l2 Regularization
The l1/l2 ratio regularization function has shown good performance for
retrieving sparse signals in a number of recent works, in the context of blind
deconvolution. Indeed, it benefits from a scale invariance property that is
highly desirable in the blind context. However, the l1/l2 function raises some
difficulties when solving the nonconvex and nonsmooth minimization problems
resulting from the use of such a penalty term in current restoration methods.
In this paper, we propose a new penalty based on a smooth approximation to the
l1/l2 function. In addition, we develop a proximal-based algorithm to solve
variational problems involving this function and we derive theoretical
convergence results. We demonstrate the effectiveness of our method through a
comparison with a recent alternating optimization strategy dealing with the
exact l1/l2 term, on an application to seismic data blind deconvolution.
Comment: 5 page
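The scale invariance mentioned in this abstract is easy to check numerically. The sketch below computes the exact l1/l2 ratio and one possible smooth surrogate; the surrogate, its parameters `alpha` and `eta`, and the function names are assumptions chosen for illustration and are not claimed to match the penalty proposed in the paper.

```python
import math


def l1_over_l2(x):
    """Exact l1/l2 ratio; undefined (division by zero) at x = 0."""
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    return l1 / l2


def smoothed_l1_over_l2(x, alpha=1e-3, eta=1e-3):
    """Illustrative smooth surrogate (an assumption, not the paper's penalty).

    |v| is replaced by sqrt(v^2 + alpha^2) - alpha, and the l2 norm by
    sqrt(||x||^2 + eta^2), so the ratio is differentiable everywhere,
    including at x = 0.
    """
    l1 = sum(math.sqrt(v * v + alpha * alpha) - alpha for v in x)
    l2 = math.sqrt(sum(v * v for v in x) + eta * eta)
    return l1 / l2
```

Rescaling a signal by any nonzero constant leaves `l1_over_l2` unchanged, and a sparser vector yields a smaller ratio, which is why the ratio is attractive as a sparsity measure when the signal scale is unknown. The surrogate trades exact scale invariance for smoothness.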
Flexible Parallel Algorithms for Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable function and a (block) separable nonsmooth, convex one. The
latter term is typically used to enforce structure in the solution as, for
example, in Lasso problems. Our framework is very flexible and includes both
fully parallel Jacobi schemes and Gauss-Seidel (Southwell-type) ones, as well
as virtually all possibilities in between (e.g., gradient- or Newton-type
methods) with only a subset of variables updated at each iteration. Our
theoretical convergence results improve on existing ones, and numerical results
show that the new method compares favorably to existing algorithms.
Comment: submitted to IEEE ICASSP 201
Hybrid Random/Deterministic Parallel Algorithms for Nonconvex Big Data Optimization
We propose a decomposition framework for the parallel optimization of the sum
of a differentiable (possibly nonconvex) function and a nonsmooth (possibly
nonseparable), convex one. The latter term is usually employed to enforce
structure in the solution, typically sparsity. The main contribution of this
work is a novel parallel, hybrid random/deterministic decomposition
scheme wherein, at each iteration, a subset of (block) variables is updated at
the same time by minimizing local convex approximations of the original
nonconvex function. To tackle huge-scale problems, the (block) variables
to be updated are chosen according to a mixed random and deterministic
procedure, which captures the advantages of both pure deterministic and random
update-based schemes. Almost sure convergence of the proposed scheme is
established. Numerical results show that on huge-scale problems the proposed
hybrid random/deterministic algorithm outperforms both random and deterministic
schemes.
Comment: The order of the authors is alphabetica
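A mixed random and deterministic choice of blocks could look like the following sketch. It is one plausible reading of the abstract, not the paper's procedure: the function `hybrid_select`, its parameters, and the use of a nonnegative per-block score (for example, the size of each block's candidate update) are assumptions for illustration. A random subset of blocks is drawn first, and a greedy rule then keeps only the most promising blocks within that subset.

```python
import random


def hybrid_select(scores, sample_size, rho=0.5, rng=None):
    """Pick blocks to update in two stages (hypothetical rule).

    scores      -- nonnegative score per block, larger means more promising
    sample_size -- number of blocks drawn uniformly at random (random stage)
    rho         -- keep sampled blocks scoring at least rho * best sampled score
                   (deterministic, greedy stage)
    """
    rng = rng or random.Random()
    # Random stage: only the sampled blocks need their scores inspected.
    sampled = rng.sample(range(len(scores)), sample_size)
    # Deterministic stage: greedy filter inside the sampled set.
    best = max(scores[i] for i in sampled)
    return sorted(i for i in sampled if scores[i] >= rho * best)
```

In a real implementation the scores would be computed only for the sampled blocks, which is where the saving on huge-scale problems would come from; here the full list is passed in for simplicity.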
Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization
Nonconvex optimization is central in solving many machine learning problems,
in which block-wise structure is commonly encountered. In this work, we propose
cyclic block coordinate methods for nonconvex optimization problems with
non-asymptotic gradient norm guarantees. Our convergence analysis is based on a
gradient Lipschitz condition with respect to a Mahalanobis norm, inspired by
recent progress on cyclic block coordinate methods. In deterministic settings,
our convergence guarantee matches the guarantee of (full-gradient) gradient
descent, but with the gradient Lipschitz constant being defined w.r.t. the
Mahalanobis norm. In stochastic settings, we use recursive variance reduction
to decrease the per-iteration cost and match the arithmetic operation
complexity of current optimal stochastic full-gradient methods, with a unified
analysis for both finite-sum and infinite-sum cases. We further prove the
faster, linear convergence of our methods when a Polyak-Łojasiewicz (PŁ)
condition holds for the objective function. To the best of our knowledge, our
work is the first to provide variance-reduced convergence guarantees for a
cyclic block coordinate method. Our experimental results demonstrate the
efficacy of the proposed variance-reduced cyclic scheme in training deep neural
nets
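The cyclic update order referred to in this abstract can be sketched in its plain deterministic form. The example below omits variance reduction entirely and uses a fixed step size; the function `cyclic_bcd`, its signature, and the small convex quadratic used to exercise it are assumptions for illustration, not the method analyzed in the paper.

```python
def cyclic_bcd(grad, x0, blocks, step, epochs):
    """Cyclic block coordinate descent with full-gradient block steps.

    grad   -- function returning the full gradient (list) at a point
    blocks -- list of index lists, visited in the same fixed order every epoch
    step   -- fixed step size (assumed small enough for the problem at hand)
    """
    x = list(x0)
    for _ in range(epochs):
        for block in blocks:
            # The gradient is re-evaluated at the current point, so each
            # block sees the updates already made to earlier blocks.
            g = grad(x)
            for j in block:
                x[j] -= step * g[j]
    return x


def quadratic_grad(x):
    """Gradient of 0.5*x^T Q x - c^T x with Q = [[2, 1], [1, 2]], c = [1, 1]."""
    return [2.0 * x[0] + x[1] - 1.0, x[0] + 2.0 * x[1] - 1.0]


solution = cyclic_bcd(quadratic_grad, [0.0, 0.0], blocks=[[0], [1]], step=0.5, epochs=50)
```

With single-coordinate blocks and `step=0.5`, each update minimizes this quadratic exactly along its coordinate, so the loop reduces to a Gauss-Seidel sweep that approaches the minimizer `[1/3, 1/3]`.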