
    Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice

    We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective function, and consists of approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. One of the keys to achieving acceleration in theory and in practice is to solve these sub-problems with appropriate accuracy, using the right stopping criterion and the right warm-start strategy. We give practical guidelines for using Catalyst and present a comprehensive analysis of its global complexity. We show that Catalyst applies to a large class of algorithms, including gradient descent, block coordinate descent, and incremental algorithms such as SAG, SAGA, SDCA, SVRG, and MISO/Finito, together with their proximal variants. For all of these methods, we establish faster rates using Catalyst acceleration, for both strongly convex and non-strongly convex objectives. We conclude with extensive experiments showing that acceleration is useful in practice, especially for ill-conditioned problems.
    Comment: link to publisher website: http://jmlr.org/papers/volume18/17-748/17-748.pd
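    The outer loop of such an inexact accelerated proximal point scheme is short; below is a minimal sketch, not the paper's full method (which also prescribes the inner stopping criterion and parameter schedules), assuming a mu-strongly convex objective and a generic, interchangeable inner solver. All names and parameter values are illustrative.

```python
import numpy as np

def inner_gd(grad, z0, lr, n_steps=100):
    """Plain gradient descent, used here as the interchangeable inner solver."""
    z = z0.copy()
    for _ in range(n_steps):
        z = z - lr * grad(z)
    return z

def catalyst(f_grad, inner_solver, x0, kappa, mu, n_outer=50):
    """Minimal sketch of the Catalyst outer loop (strongly convex case).

    Each outer step approximately minimizes the auxiliary problem
        h_k(z) = f(z) + (kappa / 2) * ||z - y_k||^2
    (an inexact proximal point step), warm-started at the previous iterate,
    then applies a Nesterov-style extrapolation between outer iterates.
    """
    q = mu / (mu + kappa)                       # inverse condition number of h_k
    beta = (1 - np.sqrt(q)) / (1 + np.sqrt(q))  # extrapolation parameter
    x_prev = x0.copy()
    y = x0.copy()
    for _ in range(n_outer):
        grad_aux = lambda z, y=y: f_grad(z) + kappa * (z - y)  # gradient of h_k
        x = inner_solver(grad_aux, x_prev)      # approximate prox, warm-started
        y = x + beta * (x - x_prev)             # extrapolation step
        x_prev = x
    return x_prev

# Illustrative use on a least-squares objective f(x) = 0.5 * ||A x - b||^2
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
f_grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A, 2) ** 2                   # Lipschitz constant of f_grad
solver = lambda g, z0: inner_gd(g, z0, lr=1.0 / (L + 1.0))
x_hat = catalyst(f_grad, solver, np.zeros(10), kappa=1.0, mu=0.01)
```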

    Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators

    Proximal operations are among the most common primitives appearing in both practical and theoretical (or high-level) optimization methods. This basic operation typically consists in solving an intermediary (hopefully simpler) optimization problem. In this work, we survey notions of inaccuracy that can be used when solving those intermediary optimization problems. Then, we show that worst-case guarantees for algorithms relying on such inexact proximal operations can be systematically obtained through a generic procedure based on semidefinite programming. This methodology is primarily based on the approach introduced by Drori and Teboulle (Mathematical Programming, 2014) and on convex interpolation results, and allows producing non-improvable worst-case analyses. In other words, for a given algorithm, the methodology generates both worst-case certificates (i.e., proofs) and problem instances on which those bounds are achieved. Relying on this methodology, we provide three new methods with conceptually simple proofs: (i) an optimized relatively inexact proximal point method, (ii) an extension of the hybrid proximal extragradient method of Monteiro and Svaiter (SIAM Journal on Optimization, 2013), and (iii) an inexact accelerated forward-backward splitting supporting backtracking line-search, with both (ii) and (iii) also handling possibly strongly convex objectives. Finally, we use the methodology to study a recent inexact variant of the Douglas-Rachford splitting due to Eckstein and Yao (Mathematical Programming, 2018). We showcase and compare the different variants of the accelerated inexact forward-backward method on a factorization problem and a total variation problem.
    Comment: Minor modifications including acknowledgments and references. Code available at https://github.com/mathbarre/InexactProximalOperator
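    As a concrete illustration of one notion of inexactness (a relative criterion on the subproblem's optimality residual, in the spirit of relatively inexact proximal point methods; this is not one of the paper's optimized methods), here is a minimal sketch for a smooth convex objective. Step sizes and tolerances are assumptions, not values from the paper.

```python
import numpy as np

def inexact_prox_point(f_grad, x0, lam=1.0, sigma=0.5, inner_lr=0.05,
                       n_outer=30, max_inner=1000):
    """Proximal point iterations with a relative inexactness criterion.

    Each subproblem  min_z f(z) + ||z - y||^2 / (2 * lam)  is solved by
    gradient steps on its objective until the residual
        r(z) = f_grad(z) + (z - y) / lam
    satisfies  ||r(z)|| <= (sigma / lam) * ||z - y||,  and the approximate
    proximal point is then accepted as the next iterate.
    Choose inner_lr <= 1 / (L + 1/lam), with L the Lipschitz constant of f_grad.
    """
    y = x0.copy()
    for _ in range(n_outer):
        z = y.copy()
        for _ in range(max_inner):
            r = f_grad(z) + (z - y) / lam        # subproblem gradient / residual
            if np.linalg.norm(r) <= (sigma / lam) * np.linalg.norm(z - y):
                break
            z = z - inner_lr * r                 # inner gradient step
        y = z                                    # accept the approximate prox
    return y
```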

    Inexact Proximal-Gradient Methods with Support Identification

    We consider the proximal-gradient method for minimizing an objective function that is the sum of a smooth function and a non-smooth convex function. A feature that distinguishes our work from most in the literature is that we assume that the associated proximal operator does not admit a closed-form solution. To address this challenge, we study two adaptive and implementable termination conditions that dictate how accurately the proximal-gradient subproblem is solved. We prove that the number of iterations required for the inexact proximal-gradient method to reach a τ > 0 approximate first-order stationary point is O(τ⁻²), which matches the result that holds when exact subproblem solutions are computed. Also, by focusing on the overlapping group ℓ1 regularizer, we propose an algorithm for approximately solving the proximal-gradient subproblem, and then prove that its iterates identify (asymptotically) the support of an optimal solution. If one imposes additional control over the accuracy to which each subproblem is solved, we give an upper bound on the maximum number of iterations before the support of an optimal solution is obtained.
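    The overall loop structure is easy to state; the sketch below shows an inexact proximal-gradient iteration in which the subproblem tolerance shrinks with the most recent step length. The callable approx_prox, the tolerance rule, and the constant c are hypothetical stand-ins for the paper's adaptive, implementable termination conditions.

```python
import numpy as np

def inexact_prox_grad(f_grad, approx_prox, x0, alpha, n_iter=200, c=0.1):
    """Sketch of an inexact proximal-gradient loop with an adaptive tolerance.

    approx_prox(u, alpha, tol) is assumed to return an approximate solution of
        min_z  g(z) + ||z - u||^2 / (2 * alpha)
    up to accuracy `tol`; this is needed when, as for the overlapping group
    l1 norm, the proximal operator has no closed form.
    """
    x = x0.copy()
    step = 1.0
    for _ in range(n_iter):
        u = x - alpha * f_grad(x)                # forward (gradient) step
        z = approx_prox(u, alpha, tol=c * step)  # inexact backward (prox) step
        step = np.linalg.norm(z - x)             # progress made this iteration
        x = z
    return x
```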

    On the Proximal Gradient Algorithm with Alternated Inertia

    In this paper, we investigate the attractive properties of the proximal gradient algorithm with inertia. Notably, we show that using alternated inertia yields monotonically decreasing functional values, which contrasts with usual accelerated proximal gradient methods. We also provide convergence rates for the algorithm with alternated inertia based on local geometric properties of the objective function. The results are put into perspective by discussions on several extensions and illustrations on common regularized problems.
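    To make the alternation concrete, here is a minimal sketch on an ℓ1-regularized (lasso-type) problem: inertia is applied on odd iterations only, while even iterations take a plain forward-backward step. The fixed inertia parameter beta and the choice of ℓ1 regularizer are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def soft_threshold(u, t):
    """Proximal operator of t * ||.||_1, used here as the non-smooth part."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def alternated_inertia_pg(f_grad, x0, alpha, lam, beta=0.9, n_iter=200):
    """Proximal-gradient with inertia applied only every other iteration.

    Even iterations take a plain forward-backward step; odd iterations first
    extrapolate between the two most recent iterates.  Admissible values of
    beta and the resulting rates depend on the local geometry of the
    objective, as analysed in the paper; beta = 0.9 is only a placeholder.
    """
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(n_iter):
        y = x + beta * (x - x_prev) if k % 2 == 1 else x   # alternated inertia
        x_prev = x
        x = soft_threshold(y - alpha * f_grad(y), alpha * lam)
    return x
```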

    Descentwise inexact proximal algorithms for smooth optimization

    The proximal method is a standard regularization approach in optimization. Practical implementations of this algorithm require (i) an algorithm to compute the proximal point, (ii) a rule to stop this algorithm, and (iii) an update formula for the proximal parameter. In this work we focus on (ii) in the smooth setting, so that Newton-like methods can be used for (i): we aim to give adequate stopping rules that achieve overall efficiency of the method. Roughly speaking, usual rules consist in stopping inner iterations when the current iterate is close to the proximal point. By contrast, we use the standard paradigm of numerical optimization: the basis for our stopping test is a "sufficient" decrease of the objective function, namely a fraction of the ideal decrease. We establish convergence of the algorithm thus obtained and illustrate it on some ill-conditioned functions. The experiments show that combining a standard smooth optimization algorithm with the proposed inexact proximal scheme improves the numerical behaviour on those problems.
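    One concrete instance of a "fraction of the ideal decrease" test is sketched below; it uses the strong convexity of the prox subproblem to get a computable over-estimate of the ideal decrease, and is only an illustration in the spirit of the paper, not necessarily its exact rule. The inner solver, step size, and fraction m are assumptions.

```python
import numpy as np

def descentwise_prox(f, f_grad, x0, lam=1.0, m=0.5, inner_lr=0.05,
                     n_outer=30, max_inner=500):
    """Inexact proximal iterations stopped by a sufficient-decrease test.

    The subproblem objective  h(z) = f(z) + ||z - x||^2 / (2 * lam)  is
    (1 / lam)-strongly convex, hence  min h >= h(z) - (lam / 2) * ||grad h(z)||^2.
    The inner loop stops once the achieved decrease f(x) - h(z) reaches a
    fraction m of this computable over-estimate of the ideal decrease.
    Choose inner_lr <= 1 / (L + 1/lam), with L the Lipschitz constant of f_grad.
    """
    x = x0.copy()
    for _ in range(n_outer):
        fx = f(x)                                             # h(x) = f(x)
        h = lambda z, x=x: f(z) + np.dot(z - x, z - x) / (2 * lam)
        h_grad = lambda z, x=x: f_grad(z) + (z - x) / lam
        z = x.copy()
        for _ in range(max_inner):
            g = h_grad(z)
            ideal_est = fx - h(z) + (lam / 2) * np.dot(g, g)  # >= ideal decrease
            if fx - h(z) >= m * ideal_est:                    # sufficient decrease
                break
            z = z - inner_lr * g    # inner step (a Newton-like method in the paper)
        x = z
    return x
```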