1,447 research outputs found

Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates

Author: Scieur Damien
Publication date: 30/05/2023

Despite the impressive numerical performance of quasi-Newton and Anderson/nonlinear acceleration methods, their global convergence rates have remained elusive for over 50 years. This paper addresses this long-standing question by introducing a framework that derives novel and adaptive quasi-Newton or nonlinear/Anderson acceleration schemes. Under mild assumptions, the proposed iterative methods exhibit explicit, non-asymptotic convergence rates that blend those of gradient descent and Cubic Regularized Newton's method. Notably, these rates are achieved adaptively, as the method autonomously determines the optimal step size using a simple backtracking strategy. The proposed approach also includes an accelerated version that improves the convergence rate on convex functions. Numerical experiments demonstrate the efficiency of the proposed framework, even compared to a fine-tuned BFGS algorithm with line search.
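For orientation, a backtracking rule of the general kind the abstract alludes to can be sketched on plain gradient descent. This is a generic Armijo-style illustration and not the paper's scheme; the function name `backtracking_gd`, the constants `shrink` and `c`, and the quadratic test problem are all assumptions chosen for the example.

```python
def backtracking_gd(f, grad, x, step=1.0, shrink=0.5, c=1e-4, iters=100):
    """Gradient descent on a list of floats with Armijo backtracking (illustrative)."""
    for _ in range(iters):
        g = grad(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq == 0.0:
            break  # stationary point reached
        t = step
        fx = f(x)
        # Shrink the trial step until the sufficient-decrease condition holds.
        while f([xi - t * gi for xi, gi in zip(x, g)]) > fx - c * t * g_sq:
            t *= shrink
        x = [xi - t * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: f(x) = x0^2 + 10 * x1^2, minimized at the origin.
def quad(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def quad_grad(x):
    return [2.0 * x[0], 20.0 * x[1]]


x_min = backtracking_gd(quad, quad_grad, [3.0, -2.0])
```

The step size is never supplied by the user beyond an initial guess; each iteration halves it until the decrease test passes, which is the sense in which such a rule is adaptive.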

arXiv.org e-Print Archive

Stochastic Trust Region Methods with Trust Region Radius Depending on Probabilistic Models

Author: Wang Xiaoyu
Yuan Ya-xiang
Publication date: 13/09/2019

We present a stochastic trust-region model-based framework in which the radius is tied to the probabilistic models. In particular, we propose a specific algorithm, termed STRME, in which the trust-region radius depends linearly on the latest model gradient. The complexity of the STRME method is analyzed in the non-convex, convex, and strongly convex settings, and it matches that of existing algorithms based on probabilistic properties. In addition, several numerical experiments are carried out to reveal the benefits of the proposed methods compared to existing stochastic trust-region methods and other relevant stochastic gradient methods.
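To make the radius rule concrete, the following is a deterministic toy sketch in which the trust-region radius is set to `mu * ||g||` and `mu` is enlarged or shrunk according to the ratio of actual to predicted decrease. It uses exact gradients and a linear model, so it is not STRME itself; the names `tr_linear_radius`, `mu`, `eta`, `gamma` and `tol`, and the quadratic test problem, are assumptions for illustration.

```python
import math


def tr_linear_radius(f, grad, x, mu=1.0, eta=0.1, gamma=2.0, tol=1e-10, iters=500):
    """Toy trust-region loop whose radius is mu times the gradient norm."""
    for _ in range(iters):
        g = grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm <= tol:
            break  # gradient small enough, stop
        delta = mu * g_norm  # radius depends linearly on the gradient norm
        # Minimizer of the linear model f(x) + g.s over the ball ||s|| <= delta.
        s = [-delta * gi / g_norm for gi in g]
        predicted = delta * g_norm  # decrease promised by the linear model
        trial = [xi + si for xi, si in zip(x, s)]
        rho = (f(x) - f(trial)) / predicted
        if rho >= eta:
            x, mu = trial, mu * gamma  # successful step: accept and enlarge
        else:
            mu = mu / gamma  # unsuccessful step: reject and shrink
    return x


# Hypothetical test problem: f(x) = x0^2 + 3 * x1^2, minimized at the origin.
def bowl(x):
    return x[0] ** 2 + 3.0 * x[1] ** 2


def bowl_grad(x):
    return [2.0 * x[0], 6.0 * x[1]]


x_tr = tr_linear_radius(bowl, bowl_grad, [3.0, -2.0])
```

Because the radius shrinks automatically as the gradient shrinks, only the scalar `mu` has to be adapted between iterations.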

arXiv.org e-Print Archive

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information

Author: Mahoney Michael W.
Roosta Fred
Xu Peng
Publication date: 14/05/2019

We consider variants of trust-region and cubic regularization methods for non-convex optimization, in which the Hessian matrix is approximated. Under mild conditions on the inexact Hessian, and using approximate solutions of the corresponding sub-problems, we provide iteration complexity to achieve $\epsilon$-approximate second-order optimality, which has been shown to be tight. Our Hessian approximation conditions constitute a major relaxation over the existing ones in the literature. Consequently, we are able to show that such mild conditions allow for the construction of the approximate Hessian through various random sampling methods. In this light, we consider the canonical problem of finite-sum minimization, provide appropriate uniform and non-uniform sub-sampling strategies to construct such Hessian approximations, and obtain optimal iteration complexity for the corresponding sub-sampled trust-region and cubic regularization methods. Comment: 32 pages
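As a reading aid, the finite-sum setting and a uniformly sub-sampled Hessian can be written as follows. The notation ($F$, $f_i$, $n$, the index set $\mathcal{S}$, and $H(x)$) is assumed here for illustration and is not taken from the abstract; non-uniform schemes would additionally reweight the sampled terms.

```latex
\min_{x \in \mathbb{R}^d} F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x),
\qquad
H(x) = \frac{1}{|\mathcal{S}|} \sum_{j \in \mathcal{S}} \nabla^2 f_j(x),
\quad \mathcal{S} \subseteq \{1, \dots, n\} \text{ drawn uniformly at random.}
```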

arXiv.org e-Print Archive

University of Queensland eSpace

Cubic Regularization is the Key! The First Accelerated Quasi-Newton Method with a Global Convergence Rate of $O(k^{-2})$ for Convex Functions

Author: Agafonov Artem
Kamzolov Dmitry
Takáč Martin
Ziu Klea
Publication date: 28/05/2023

In this paper, we propose the first Quasi-Newton method with a global convergence rate of $O(k^{-1})$ for general convex functions. Quasi-Newton methods, such as BFGS, SR-1, are well-known for their impressive practical performance. However, they may be slower than gradient descent for general convex functions, with the best theoretical rate of $O(k^{-1/3})$. This gap between impressive practical performance and poor theoretical guarantees was an open question for a long period of time. In this paper, we make a significant step to close this gap. We improve upon the existing rate and propose the Cubic Regularized Quasi-Newton Method with a convergence rate of $O(k^{-1})$. The key to achieving this improvement is to use the Cubic Regularized Newton Method over the Damped Newton Method as an outer method, where the Quasi-Newton update is an inexact Hessian approximation. Using this approach, we propose the first Accelerated Quasi-Newton method with a global convergence rate of $O(k^{-2})$ for general convex functions. In special cases where we can improve the precision of the approximation, we achieve a global convergence rate of $O(k^{-3})$, which is faster than any first-order method. To make these methods practical, we introduce the Adaptive Inexact Cubic Regularized Newton Method and its accelerated version, which provide real-time control of the approximation error. We show that the proposed methods have impressive practical performance and outperform both first and second-order methods.
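To illustrate the cubic-regularized step that the abstract builds on, here is a one-dimensional sketch with an exact second derivative. It is plain cubic-regularized Newton, not the quasi-Newton or accelerated variants of the paper; the names `cubic_newton_step_1d` and `cubic_newton_1d`, the constant `M`, and the test function $f(x) = \sqrt{1 + x^2}$ are assumptions for the example. In one dimension the model $g s + \tfrac{1}{2} h s^2 + \tfrac{M}{6} |s|^3$ has a closed-form minimizer, which the code uses.

```python
import math


def cubic_newton_step_1d(g, h, M):
    """Minimizer of s -> g*s + 0.5*h*s**2 + (M/6)*abs(s)**3 for g != 0, M > 0.

    Returns 0.0 when g == 0, i.e. when the iterate is already stationary.
    """
    if g == 0.0:
        return 0.0
    # The step length r >= 0 solves (M/2) r^2 + h r - |g| = 0.
    r = (-h + math.sqrt(h * h + 2.0 * M * abs(g))) / M
    return -math.copysign(r, g)  # move against the sign of the gradient


def cubic_newton_1d(grad, hess, x, M, iters=20):
    """Repeat the cubic-regularized step from the starting point x."""
    for _ in range(iters):
        x += cubic_newton_step_1d(grad(x), hess(x), M)
    return x


# Hypothetical test function f(x) = sqrt(1 + x^2), minimized at x = 0.
# Its third derivative is bounded by 1 in absolute value, so M = 1.0 is admissible.
def df(x):
    return x / math.sqrt(1.0 + x * x)


def d2f(x):
    return (1.0 + x * x) ** -1.5


x_star = cubic_newton_1d(df, d2f, 3.0, M=1.0)
```

On this function an undamped Newton iteration started at 3.0 moves away from the minimizer, whereas the cubic term keeps every step bounded.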

arXiv.org e-Print Archive

Newton-MR: Inexact Newton Method With Minimum Residual Sub-problem Solver

Author: Liu Yang
Mahoney Michael W.
Roosta Fred
Xu Peng
Publication date: 15/10/2021

We consider a variant of the inexact Newton method, called Newton-MR, in which the least-squares sub-problems are solved approximately using the Minimum Residual method. By construction, Newton-MR can be readily applied for unconstrained optimization of a class of non-convex problems known as invex, which subsumes convexity as a sub-class. For invex optimization, instead of the classical Lipschitz continuity assumptions on gradient and Hessian, Newton-MR's global convergence can be guaranteed under a weaker notion of joint regularity of Hessian and gradient. We also obtain Newton-MR's problem-independent local convergence to the set of minima. We show that fast local/global convergence can be guaranteed under a novel inexactness condition, which, to our knowledge, is much weaker than those in prior related works. Numerical results demonstrate the performance of Newton-MR as compared with several other Newton-type alternatives on a few machine learning problems. Comment: 35 pages
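For reference, the least-squares sub-problem mentioned in the abstract can be written as below. The symbols $g_k$ (gradient), $H_k$ (Hessian) and $p_k$ (search direction) at the current iterate are assumed notation, and how the resulting direction is scaled or safeguarded is left out.

```latex
p_k \approx \arg\min_{p \in \mathbb{R}^d} \tfrac{1}{2} \left\| H_k p + g_k \right\|^2
```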

arXiv.org e-Print Archive