Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates
Despite the impressive numerical performance of quasi-Newton and
Anderson/nonlinear acceleration methods, their global convergence rates have
remained elusive for over 50 years. This paper addresses this long-standing
question by introducing a framework that derives novel and adaptive
quasi-Newton or nonlinear/Anderson acceleration schemes. Under mild
assumptions, the proposed iterative methods exhibit explicit, non-asymptotic
convergence rates that blend those of gradient descent and Cubic Regularized
Newton's method. Notably, these rates are achieved adaptively, as the method
autonomously determines the optimal step size using a simple backtracking
strategy. The proposed approach also includes an accelerated version that
improves the convergence rate on convex functions. Numerical experiments
demonstrate the efficiency of the proposed framework, even compared to a
fine-tuned BFGS algorithm with line search.
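For readers unfamiliar with the Anderson step underlying such schemes, the following is a minimal sketch of classical Anderson acceleration for a fixed-point map g, not the paper's adaptive variant; the memory handling and the regularization parameter `reg` are illustrative assumptions.

```python
import numpy as np

def anderson_step(xs, gs, reg=1e-10):
    """One classical Anderson acceleration step from stored pairs.

    xs, gs: lists of the m most recent iterates x_i and their
    fixed-point maps g(x_i). Computes mixing weights summing to one
    that minimize the norm of the combined residual, via a small
    regularized linear system, and returns the extrapolated point.
    """
    F = np.array([g - x for x, g in zip(xs, gs)])  # residuals f_i = g(x_i) - x_i
    M = F @ F.T + reg * np.eye(len(xs))            # regularized Gram matrix
    w = np.linalg.solve(M, np.ones(len(xs)))
    w /= w.sum()                                   # enforce sum(w) = 1
    return sum(wi * gi for wi, gi in zip(w, gs))   # extrapolated iterate
```

The paper's step-size backtracking is a separate ingredient layered on top of such an update.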
Regularized Nonlinear Acceleration
We describe a convergence acceleration technique for unconstrained
optimization problems. Our scheme computes estimates of the optimum from a
nonlinear average of the iterates produced by any optimization method. The
weights in this average are computed via a simple linear system, whose solution
can be updated online. This acceleration scheme runs in parallel to the base
algorithm, providing improved estimates of the solution on the fly, while the
original optimization method is running. Numerical experiments are detailed on
classical classification problems.
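As a rough illustration of the scheme described above, here is a minimal sketch of the weight computation, assuming iterates collected from an arbitrary base method; the regularization parameter `lam` and function name are illustrative.

```python
import numpy as np

def rna(xs, lam=1e-8):
    """Regularized nonlinear acceleration of a sequence of iterates.

    xs: array of shape (k+2, d) holding iterates x_0, ..., x_{k+1}
    produced by any base optimization method. Returns the nonlinear
    average sum_i c_i x_i, where the weights c solve a small
    regularized linear system built from successive differences.
    """
    X = np.asarray(xs)
    R = np.diff(X, axis=0)            # residuals r_i = x_{i+1} - x_i
    RR = R @ R.T
    RR /= np.linalg.norm(RR, 2)       # normalize so lam is scale-free
    k = RR.shape[0]
    z = np.linalg.solve(RR + lam * np.eye(k), np.ones(k))
    c = z / z.sum()                   # weights sum to one
    return c @ X[:-1]                 # extrapolated estimate of the optimum
```

Because the extrapolation only reads the stored iterates, it can be queried at any time while the base method keeps running, matching the online, in-parallel use described above.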
Average-case Acceleration Through Spectral Density Estimation
We develop a framework for the average-case analysis of random quadratic
problems and derive algorithms that are optimal under this analysis. This
yields a new class of methods that achieve acceleration given a model of the
Hessian's eigenvalue distribution. We develop explicit algorithms for the
uniform, Marchenko-Pastur, and exponential distributions. These methods are
momentum-based algorithms, whose hyper-parameters can be estimated without
knowledge of the Hessian's smallest singular value, in contrast with classical
accelerated methods like Nesterov acceleration and Polyak momentum. Through
empirical benchmarks on quadratic and logistic regression problems, we identify
regimes in which the proposed methods improve over classical (worst-case)
accelerated methods.
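For contrast, here is a sketch of the classical worst-case baseline mentioned above: Polyak's heavy-ball method on a quadratic, whose standard tuning requires the smallest eigenvalue mu. The average-case methods replace this (mu, L) choice with hyper-parameters derived from a spectral density model; the function and parameter names below are illustrative.

```python
import numpy as np

def heavy_ball(A, b, x0, mu, L, iters=200):
    """Polyak momentum on f(x) = 0.5 x^T A x - b^T x.

    The classical step size and momentum below require mu, the smallest
    eigenvalue of A; the average-case framework derives these
    hyper-parameters from a model of A's eigenvalue distribution instead.
    """
    step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    momentum = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
    x_prev, x = x0, x0
    for _ in range(iters):
        grad = A @ x - b
        x, x_prev = x - step * grad + momentum * (x - x_prev), x
    return x
```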
Acceleration Methods
This monograph covers some recent advances in a range of acceleration
techniques frequently used in convex optimization. We first use quadratic
optimization problems to introduce two key families of methods, namely momentum
and nested optimization schemes. They coincide in the quadratic case to form
the Chebyshev method. We discuss momentum methods in detail, starting with the
seminal work of Nesterov, and structure convergence proofs using a few master
templates, such as that for optimized gradient methods, which provide the key
benefit of showing how momentum methods optimize convergence guarantees. We
further cover proximal acceleration, at the heart of the Catalyst and
Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic
patterns. Common acceleration techniques rely directly on the knowledge of some
of the regularity parameters in the problem at hand. We conclude by discussing
restart schemes, a set of simple techniques for reaching nearly optimal
convergence rates while adapting to unobserved regularity parameters. Published
in Foundations and Trends in Optimization (see
https://www.nowpublishers.com/article/Details/OPT-036).
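As a concrete instance of the restart idea closing the monograph, here is a minimal sketch of Nesterov's accelerated gradient method with the adaptive gradient restart of O'Donoghue and Candes, assuming a gradient oracle `grad` and a step size of at most 1/L; the function name is illustrative.

```python
import numpy as np

def agd_with_restart(grad, x0, step, iters=1000):
    """Nesterov's method with adaptive gradient restart.

    The momentum is reset whenever the last step opposes the gradient
    direction, which lets the scheme adapt to unobserved strong
    convexity without knowing the regularity parameters in advance.
    """
    x, y, t = x0, x0, 1.0
    for _ in range(iters):
        g = grad(y)
        x_new = y - step * g                 # gradient step at extrapolated point
        if g @ (x_new - x) > 0:              # momentum opposes descent: restart
            t, y = 1.0, x_new
        else:
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # Nesterov extrapolation
            t = t_new
        x = x_new
    return x
```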