Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice
We introduce a generic scheme for accelerating gradient-based optimization
methods in the sense of Nesterov. The approach, called Catalyst, builds upon
the inexact accelerated proximal point algorithm for minimizing a convex
objective function, and consists of approximately solving a sequence of
well-chosen auxiliary problems, leading to faster convergence. One of the keys
to achieving acceleration in theory and in practice is to solve these
sub-problems with appropriate accuracy by using the right stopping criterion
and the right warm-start strategy. We give practical guidelines to use Catalyst
and present a comprehensive analysis of its global complexity. We show that
Catalyst applies to a large class of algorithms, including gradient descent,
block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG,
MISO/Finito, and their proximal variants. For all of these methods, we
establish faster rates using the Catalyst acceleration, for strongly convex and
non-strongly convex objectives. We conclude with extensive experiments showing
that acceleration is useful in practice, especially for ill-conditioned
problems.
Comment: link to publisher website: http://jmlr.org/papers/volume18/17-748/17-748.pd
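The outer loop described above can be sketched in a few lines. The sketch below is a hedged illustration, not the authors' implementation: the regularization weight `kappa`, the fixed extrapolation coefficient `beta`, the plain-gradient inner solver, and the fixed inner iteration budget are all simplifying assumptions (Catalyst's analysis prescribes problem-dependent parameter choices and adaptive stopping criteria for the sub-problems).

```python
def catalyst_sketch(grad, x0, kappa=1.0, beta=0.5, n_outer=100,
                    inner_steps=25, inner_lr=0.15):
    """Illustrative Catalyst-style outer loop (not the paper's exact method).

    Each outer iteration approximately minimizes the auxiliary problem
        h(z) = f(z) + (kappa / 2) * ||z - y||^2
    with a few warm-started gradient steps, then extrapolates a la Nesterov.
    """
    x, y = list(x0), list(x0)
    for _ in range(n_outer):
        z = list(x)  # warm start the inner solver at the previous outer iterate
        for _ in range(inner_steps):
            g = grad(z)
            z = [zi - inner_lr * (gi + kappa * (zi - yi))
                 for zi, gi, yi in zip(z, g, y)]
        x_prev, x = x, z
        # extrapolation step driving the acceleration
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
    return x
```

On an ill-conditioned quadratic such as f(x) = 0.5(x1^2 + 10 x2^2) - x1 - x2, the sketch converges to the minimizer (1, 0.1) even though the inner solver is plain gradient descent.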
Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators
Proximal operations are among the most common primitives appearing in both
practical and theoretical (or high-level) optimization methods. This basic
operation typically consists in solving an intermediary (hopefully simpler)
optimization problem. In this work, we survey notions of inaccuracies that can
be used when solving those intermediary optimization problems. Then, we show
that worst-case guarantees for algorithms relying on such inexact proximal
operations can be systematically obtained through a generic procedure based on
semidefinite programming. This methodology is primarily based on the approach
introduced by Drori and Teboulle (Mathematical Programming, 2014) and on convex
interpolation results, and allows producing non-improvable worst-case analyses.
In other words, for a given algorithm, the methodology generates both
worst-case certificates (i.e., proofs) and problem instances on which those
bounds are achieved.
Relying on this methodology, we provide three new methods with conceptually
simple proofs: (i) an optimized relatively inexact proximal point method, (ii)
an extension of the hybrid proximal extragradient method of Monteiro and
Svaiter (SIAM Journal on Optimization, 2013), and (iii) an inexact accelerated
forward-backward splitting supporting backtracking line-search, and both (ii)
and (iii) supporting possibly strongly convex objectives. Finally, we use the
methodology for studying a recent inexact variant of the Douglas-Rachford
splitting due to Eckstein and Yao (Mathematical Programming, 2018).
We showcase and compare the different variants of the accelerated inexact
forward-backward method on a factorization and a total variation problem.
Comment: Minor modifications including acknowledgments and references. Code available at https://github.com/mathbarre/InexactProximalOperator
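Method (i) above accepts an approximate proximal step once its error is small relative to the step actually taken. A minimal sketch of that relative-error idea follows; the gradient-descent inner solver and all parameter values are illustrative assumptions, not the paper's optimized method.

```python
def relatively_inexact_prox_point(grad, x0, lam=1.0, sigma=0.2,
                                  n_outer=60, inner_lr=0.15, max_inner=500):
    """Proximal point method with a relative-error acceptance rule (sketch).

    The step z ~= argmin_z f(z) + ||z - x||^2 / (2 * lam) is computed by
    gradient descent on the subproblem, stopped once the subproblem residual
        r = grad(z) + (z - x) / lam
    satisfies ||r|| <= (sigma / lam) * ||z - x||.
    """
    x = list(x0)
    for _ in range(n_outer):
        z = list(x)
        for _ in range(max_inner):
            r = [gi + (zi - xi) / lam for gi, zi, xi in zip(grad(z), z, x)]
            err = sum(ri * ri for ri in r) ** 0.5
            dist = sum((zi - xi) ** 2 for zi, xi in zip(z, x)) ** 0.5
            if err <= (sigma / lam) * dist:
                break  # accurate enough relative to the step being taken
            z = [zi - inner_lr * ri for zi, ri in zip(z, r)]
        x = z
    return x
```

For sigma < 1 this kind of relative criterion is known to preserve the convergence of the exact proximal point method while letting early sub-problems be solved very coarsely.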
Inexact Proximal-Gradient Methods with Support Identification
We consider the proximal-gradient method for minimizing an objective function
that is the sum of a smooth function and a non-smooth convex function. A
feature that distinguishes our work from most in the literature is that we
assume that the associated proximal operator does not admit a closed-form
solution. To address this challenge, we study two adaptive and implementable
termination conditions that dictate how accurately the proximal-gradient
subproblem is solved. We prove that the number of iterations required for the
inexact proximal-gradient method to reach an $\epsilon$-approximate first-order
stationary point is $O(\epsilon^{-2})$, which matches the result that holds
when exact subproblem solutions are computed. Also, by focusing on
the overlapping group $\ell_1$ regularizer, we propose an algorithm for
approximately solving the proximal-gradient subproblem, and then prove that its
iterates identify (asymptotically) the support of an optimal solution. If one
imposes additional control over the accuracy to which each subproblem is
solved, we give an upper bound on the maximum number of iterations before the
support of an optimal solution is obtained.
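A toy version of this setup is sketched below. Everything specific here is an illustrative assumption: the regularizer psi(z) = sum_i sqrt(1 + z_i^2) stands in for the paper's overlapping group regularizer (it is chosen only because its prox has no closed form), the coordinate-wise Newton inner solver and the geometrically shrinking tolerance are not the paper's termination conditions.

```python
import math

def inexact_prox_grad(grad_f, x0, step=0.5, n_iter=200,
                      tol0=1e-1, shrink=0.9, tol_floor=1e-10):
    """Proximal-gradient sketch where each prox subproblem is solved inexactly.

    Regularizer: psi(z) = sum_i sqrt(1 + z_i^2) (no closed-form prox).
    Each prox subproblem  min_z psi(z) + ||z - u||^2 / (2 * step)  is solved
    coordinate-wise by Newton's method, stopped once the subproblem gradient
    falls below a tolerance that shrinks geometrically across outer iterations.
    """
    def prox_coord(u, tol):
        z = u  # warm start at the gradient-step point
        for _ in range(50):
            g = z / math.sqrt(1.0 + z * z) + (z - u) / step
            if abs(g) <= tol:
                break  # subproblem solved to the current accuracy
            hess = (1.0 + z * z) ** (-1.5) + 1.0 / step
            z -= g / hess
        return z

    x = list(x0)
    tol = tol0
    for _ in range(n_iter):
        u = [xi - step * gi for xi, gi in zip(x, grad_f(x))]
        x = [prox_coord(ui, tol) for ui in u]
        tol = max(tol * shrink, tol_floor)
    return x
```

Tightening the tolerance over time mirrors the idea in the abstract: coarse early prox solves are cheap, while later accurate solves secure the final accuracy.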
On the Proximal Gradient Algorithm with Alternated Inertia
In this paper, we investigate the attractive properties of the proximal gradient algorithm with inertia. Notably, we show that using alternated inertia yields monotonically decreasing functional values, which contrasts with usual accelerated proximal gradient methods. We also provide convergence rates for the algorithm with alternated inertia based on local geometric properties of the objective function. The results are put into perspective by discussions of several extensions and illustrations on common regularized problems.
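The alternated-inertia idea, applying the inertial extrapolation only on every other iteration, can be sketched on a simple $\ell_1$-regularized problem. The fixed inertial coefficient and step size below are illustrative choices, not the paper's prescriptions.

```python
def soft_threshold(v, a):
    """Closed-form prox of a * ||.||_1 (soft-thresholding)."""
    return [max(abs(vi) - a, 0.0) * (1.0 if vi >= 0 else -1.0) for vi in v]

def alternated_inertia_prox_grad(grad_f, x0, step=0.5, reg=1.0,
                                 inertia=0.5, n_iter=100):
    """Proximal gradient with inertia applied only on odd iterations (sketch)."""
    x = list(x0)
    x_prev = list(x0)
    for k in range(n_iter):
        if k % 2 == 0:
            y = x  # plain, non-inertial step
        else:
            # inertial extrapolation, used only every other iteration
            y = [xi + inertia * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = grad_f(y)
        x_prev, x = x, soft_threshold(
            [yi - step * gi for yi, gi in zip(y, g)], step * reg)
    return x
```

The non-inertial steps are ordinary proximal-gradient (hence descent) steps, which is the mechanism behind the monotone decrease of functional values mentioned in the abstract.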
Descentwise inexact proximal algorithms for smooth optimization
The proximal method is a standard regularization approach in optimization. Practical implementations of this algorithm require (i) an algorithm to compute the proximal point, (ii) a rule to stop this algorithm, and (iii) an update formula for the proximal parameter. In this work we focus on (ii) when smoothness is present, so that Newton-like methods can be used for (i): we aim at giving adequate stopping rules to reach overall efficiency of the method. Roughly speaking, usual rules consist in stopping inner iterations when the current iterate is close to the proximal point. By contrast, we use the standard paradigm of numerical optimization: the basis for our stopping test is a "sufficient" decrease of the objective function, namely a fraction of the ideal decrease. We establish convergence of the algorithm thus obtained and illustrate it on some ill-conditioned functions. The experiments show that combining a standard smooth optimization algorithm with the proposed inexact proximal scheme improves numerical behaviour for those problems.
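A sufficient-decrease stopping rule of this flavour can be sketched as follows. The specific test below, accepting z once f(x) - f(z) >= m/(2*lam) * ||z - x||^2 (a fraction of the decrease the exact proximal point is guaranteed to achieve), is an illustrative stand-in for the paper's rule, and the gradient-descent inner solver replaces the Newton-like methods the authors have in mind.

```python
def descentwise_inexact_prox(f, grad, x0, lam=1.0, m=0.5,
                             n_outer=150, inner_lr=0.15, max_inner=500):
    """Inexact proximal iteration stopped by a sufficient-decrease test (sketch).

    The inner loop minimizes h(z) = f(z) + ||z - x||^2 / (2 * lam) by gradient
    descent, and stops as soon as the achieved decrease satisfies
        f(x) - f(z) >= (m / (2 * lam)) * ||z - x||^2,
    i.e. a fraction m of the decrease the exact proximal point would guarantee.
    """
    x = list(x0)
    for _ in range(n_outer):
        fx = f(x)
        z = list(x)
        for _ in range(max_inner):
            g = [gi + (zi - xi) / lam for gi, zi, xi in zip(grad(z), z, x)]
            z = [zi - inner_lr * gi for zi, gi in zip(z, g)]
            dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
            if fx - f(z) >= (m / (2.0 * lam)) * dist2:
                break  # sufficient decrease achieved: stop inner iterations
        x = z
    return x
```

Because the test only asks for a fraction of the ideal decrease, early inner loops may stop after very few steps, which is precisely the efficiency argument made in the abstract.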