633 research outputs found
Limited-Memory Greedy Quasi-Newton Method with Non-asymptotic Superlinear Convergence Rate
Non-asymptotic convergence analysis of quasi-Newton methods has gained
attention with a landmark result establishing an explicit local superlinear
rate of $\mathcal{O}\big((1/\sqrt{t})^t\big)$. The methods that obtain this rate, however, exhibit a
well-known drawback: they require the storage of the previous Hessian
approximation matrix or instead storing all past curvature information to form
the current Hessian inverse approximation. Limited-memory variants of
quasi-Newton methods such as the celebrated L-BFGS alleviate this issue by
leveraging a limited window of past curvature information to construct the
Hessian inverse approximation. As a result, their per-iteration complexity and
storage requirement is $\mathcal{O}(\tau d)$, where $\tau$ is the size of the window
and $d$ is the problem dimension, reducing the $\mathcal{O}(d^2)$ computational cost and
memory requirement of standard quasi-Newton methods. However, to the best of
our knowledge, there is no result showing a non-asymptotic superlinear
convergence rate for any limited-memory quasi-Newton method. In this work, we
close this gap by presenting a limited-memory greedy BFGS (LG-BFGS) method that
achieves an explicit non-asymptotic superlinear rate. We incorporate
displacement aggregation, i.e., a decorrelating projection, to post-process the
gradient variations, together with a basis-vector selection scheme on the
variable variations that greedily maximizes a progress measure of the Hessian
estimate toward the true Hessian. Their combination allows past curvature information to
remain in a sparse subspace while yielding a valid representation of the full
history. Interestingly, our established non-asymptotic superlinear convergence
rate demonstrates a trade-off between the convergence speed and memory
requirement, which, to our knowledge, is the first of its kind. Numerical
results corroborate our theoretical findings and demonstrate the effectiveness
of our method.
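For context on the $\mathcal{O}(\tau d)$ cost cited above, here is a minimal NumPy sketch of the standard L-BFGS two-loop recursion, which applies the inverse-Hessian approximation to a gradient using only the stored window of $(s, y)$ pairs. This is the classical recursion, not the paper's LG-BFGS variant, whose greedy basis selection and displacement aggregation are additional steps.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion (classical, not LG-BFGS).

    Computes d = -H @ grad, where H is the inverse-Hessian approximation
    implicitly defined by the stored curvature pairs (s_i, y_i). With a
    window of size tau, the cost is O(tau * d): only inner products and
    vector updates, and no d x d matrix is ever formed.
    """
    if not s_list:                      # no curvature info yet: gradient step
        return -grad
    q = grad.copy()
    alphas = []
    # First loop: walk the window from the newest pair to the oldest.
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / np.dot(y, s)
        alpha = rho * np.dot(s, q)
        q -= alpha * y
        alphas.append((rho, alpha))
    # Initial scaling H_0 = gamma * I, the usual choice in practice.
    s, y = s_list[-1], y_list[-1]
    r = (np.dot(s, y) / np.dot(y, y)) * q
    # Second loop: walk back from the oldest pair to the newest.
    for (s, y), (rho, alpha) in zip(zip(s_list, y_list), reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r
```

Each call touches every stored pair exactly once, so both time and memory scale linearly in the window size $\tau$, which is the quantity the abstract's speed-versus-memory trade-off is stated in.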
Online Learning Guided Curvature Approximation: A Quasi-Newton Method with Global Non-Asymptotic Superlinear Convergence
Quasi-Newton algorithms are among the most popular iterative methods for
solving unconstrained minimization problems, largely due to their favorable
superlinear convergence property. However, existing results for these
algorithms are limited as they provide either (i) a global convergence
guarantee with an asymptotic superlinear convergence rate, or (ii) a local
non-asymptotic superlinear rate for the case that the initial point and the
initial Hessian approximation are chosen properly. In particular, no current
analysis for quasi-Newton methods guarantees global convergence with an
explicit superlinear convergence rate. In this paper, we close this gap and
present the first globally convergent quasi-Newton method with an explicit
non-asymptotic superlinear convergence rate. Unlike classical quasi-Newton
methods, we build our algorithm upon the hybrid proximal extragradient method
and propose a novel online learning framework for updating the Hessian
approximation matrices. Specifically, guided by the convergence analysis, we
formulate the Hessian approximation update as an online convex optimization
problem in the space of matrices, and we relate the bounded regret of the
online problem to the superlinear convergence of our method.
Comment: 33 pages, 1 figure, accepted to COLT 2023
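To make the online-learning framing concrete, the following is a generic illustrative sketch (my construction, not the paper's actual loss or update rule): treat each iteration as a round of online convex optimization in which the matrix $B_t$ suffers a loss measuring how poorly it matches an observed Hessian-vector product, and update it by online gradient descent; bounding the cumulative regret then controls how well the iterates $B_t$ track the true curvature along the probed directions.

```python
import numpy as np

def online_hessian_step(B, u, Hu, eta=0.1):
    """One round of a generic online-gradient update for a Hessian
    approximation B. Illustrative only: the loss below is a hypothetical
    stand-in, not the loss used in the paper.

    Round-t loss:  l_t(B) = 0.5 * ||B @ u - Hu||^2,
    where Hu is a measured Hessian-vector product along direction u.
    l_t is convex in B, so online gradient descent gives bounded regret.
    """
    r = B @ u - Hu                 # residual along the probed direction
    G = np.outer(r, u)             # gradient of l_t with respect to B
    B = B - eta * (G + G.T) / 2.0  # symmetrize the step to keep B symmetric
    return B
```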
Symmetric Rank-$k$ Methods
This paper proposes a novel class of block quasi-Newton methods for convex
optimization, which we call symmetric rank-$k$ (SR-$k$) methods. Each iteration
of SR-$k$ incorporates curvature information through $k$ Hessian-vector
products obtained from a greedy or random strategy. We prove that SR-$k$ methods
have a local superlinear convergence rate of
$\mathcal{O}\big((1-k/d)^{t(t-1)/2}\big)$ for minimizing smooth and strongly
self-concordant functions, where $d$ is the problem dimension and $t$ is the
iteration counter. This is the first explicit superlinear convergence rate for
block quasi-Newton methods, and it successfully explains why block quasi-Newton
methods converge faster than standard quasi-Newton methods in practice.
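The update behind SR-$k$ methods is, at least in spirit, a block generalization of the classical SR1 formula. Below is a minimal NumPy sketch of that block update, reconstructed from the standard block-SR1 formula (details may differ from the paper's exact scheme); the `hvp` callable and the greedy/random choice of the direction block `U` are assumptions of this sketch.

```python
import numpy as np

def srk_update(B, hvp, U):
    """Block symmetric rank-k (SR1-style) quasi-Newton update: a sketch.

    B   : current d x d Hessian approximation (assumed B >= true Hessian A)
    hvp : callable returning A @ V for a d x k block V, i.e. k
          Hessian-vector products per call
    U   : d x k block of directions (greedy or random strategy), k << d

    Returns B_new = B - (B - A) U [U^T (B - A) U]^{-1} U^T (B - A),
    which agrees with A exactly on the subspace spanned by U.
    """
    W = B @ U - hvp(U)   # (B - A) U, costs only k Hessian-vector products
    M = U.T @ W          # U^T (B - A) U, a small k x k system
    return B - W @ np.linalg.solve(M, W.T)
```

After the update, `B_new @ U` equals `hvp(U)`, so the approximation matches the true Hessian on the $k$ probed directions, which is intuitively why a larger $k$ accelerates the $(1-k/d)^{t(t-1)/2}$ rate.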
- …