Search CORE

2 research outputs found

Adaptive Step Sizes in Variance Reduction via Regularization

Author: Giannakis Georgios B.
Li Bingcong
Publication venue
Publication date: 15/10/2019
Field of study

The main goal of this work is equipping convex and nonconvex problems with Barzilai-Borwein (BB) step size. With the adaptivity of BB step sizes granted, they can fail when the objective function is not strongly convex. To overcome this challenge, the key idea here is to bridge (non)convex problems and strongly convex ones via regularization. The proposed regularization schemes are \textit{simple} yet effective. Wedding the BB step size with a variance reduction method, known as SARAH, offers a free lunch compared with vanilla SARAH in convex problems. The convergence of BB step sizes in nonconvex problems is also established and its complexity is no worse than other adaptive step sizes such as AdaGrad. As a byproduct, our regularized SARAH methods for convex functions ensure that the complexity to find

\mathbb{E}[\| \nabla f(\mathbf{x}) \|^2]\leq \epsilon

{\cal O}\big( (n+\frac{1}{\sqrt{\epsilon}})\ln{\frac{1}{\epsilon}}\big)

, improving

\epsilon

dependence over existing results. Numerical tests further validate the merits of proposed approaches

arXiv.org e-Print Archive

Almost Tune-Free Variance Reduction

Author: Giannakis Georgios B.
Li Bingcong
Wang Lingda
Publication venue
Publication date: 10/06/2020
Field of study

The variance reduction class of algorithms including the representative ones, SVRG and SARAH, have well documented merits for empirical risk minimization problems. However, they require grid search to tune parameters (step size and the number of iterations per inner loop) for optimal performance. This work introduces `almost tune-free' SVRG and SARAH schemes equipped with i) Barzilai-Borwein (BB) step sizes; ii) averaging; and, iii) the inner loop length adjusted to the BB step sizes. In particular, SVRG, SARAH, and their BB variants are first reexamined through an `estimate sequence' lens to enable new averaging methods that tighten their convergence rates theoretically, and improve their performance empirically when the step size or the inner loop length is chosen large. Then a simple yet effective means to adjust the number of iterations per inner loop is developed to enhance the merits of the proposed averaging schemes and BB step sizes. Numerical tests corroborate the proposed methods

arXiv.org e-Print Archive