Adaptive Step Sizes in Variance Reduction via Regularization
The main goal of this work is to equip convex and nonconvex problems with
Barzilai-Borwein (BB) step sizes. Although BB step sizes are adaptive,
they can fail when the objective function is not strongly convex. To overcome
this challenge, the key idea here is to bridge (non)convex problems and
strongly convex ones via regularization. The proposed regularization schemes
are simple yet effective. Wedding the BB step size with a variance
reduction method, known as SARAH, offers a free lunch compared with vanilla
SARAH in convex problems. The convergence of BB step sizes in nonconvex
problems is also established and its complexity is no worse than other adaptive
step sizes such as AdaGrad. As a byproduct, our regularized SARAH methods for
convex functions ensure a complexity for finding an approximate solution that
improves the dependence on the target accuracy over existing results. Numerical
tests further validate the merits of the proposed approaches.
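
To illustrate why a BB step size can fail without strong convexity, and how a regularizer restores it, here is a minimal sketch. It uses the classical BB rule eta = ||s||^2 / <s, y>, with s the difference of two iterates and y the difference of their gradients. The linear test function, the quadratic regularizer (mu/2)||x||^2, and all function names are assumptions chosen for illustration; they are not the regularization schemes proposed in the paper.

```python
import numpy as np

def bb_step_size(x_prev, x_curr, g_prev, g_curr):
    """Classical BB step size ||s||^2 / <s, y>.

    Returns inf when the curvature estimate <s, y> is not positive,
    which is the failure mode for functions that are not strongly convex.
    """
    s = x_curr - x_prev
    y = g_curr - g_prev
    curvature = float(s @ y)
    if curvature <= 0.0:
        return float("inf")
    return float(s @ s) / curvature

def grad_linear(x, c):
    # Gradient of f(x) = <c, x>: convex, but not strongly convex.
    return c

def grad_regularized(x, c, mu):
    # Gradient of f(x) + (mu / 2) * ||x||^2 (illustrative regularizer).
    return c + mu * x

c = np.array([1.0, -2.0])
x0 = np.array([0.0, 0.0])
x1 = np.array([0.5, 1.0])
mu = 0.1

# Without regularization the gradients coincide, so <s, y> = 0.
eta_plain = bb_step_size(x0, x1, grad_linear(x0, c), grad_linear(x1, c))

# With regularization y = mu * s, so the step size is capped at 1 / mu.
eta_reg = bb_step_size(
    x0, x1, grad_regularized(x0, c, mu), grad_regularized(x1, c, mu)
)
```

In this toy case the unregularized step size is unbounded, while the regularized one is about 1/mu, since strong convexity with modulus mu guarantees <s, y> >= mu ||s||^2.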
Almost Tune-Free Variance Reduction
Variance reduction algorithms, including the representative SVRG and SARAH,
have well-documented merits for empirical risk minimization
problems. However, they require grid search to tune parameters (step size and
the number of iterations per inner loop) for optimal performance. This work
introduces 'almost tune-free' SVRG and SARAH schemes equipped with i)
Barzilai-Borwein (BB) step sizes; ii) averaging; and iii) the inner loop
length adjusted to the BB step sizes. In particular, SVRG, SARAH, and their BB
variants are first reexamined through an 'estimate sequence' lens to enable new
averaging methods that tighten their convergence rates theoretically, and
improve their performance empirically when the step size or the inner loop
length is chosen large. Then a simple yet effective means to adjust the number
of iterations per inner loop is developed to enhance the merits of the proposed
averaging schemes and BB step sizes. Numerical tests corroborate the proposed
methods.
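
As a rough illustration of how a BB step size can replace a hand-tuned one in SARAH, the sketch below runs SARAH on a small ridge regression problem and recomputes the step size at every outer loop from consecutive snapshots, scaled by the inner-loop length m. Everything here is an assumption made for illustration: the problem data, the fixed choice m = 2n, the initial step size eta0, and returning the last inner iterate. It does not implement the averaging schemes or the inner-loop length adjustment described in the abstract.

```python
import numpy as np

def sarah_bb(A, b, lam=0.1, outer_loops=20, eta0=0.01, seed=0):
    """SARAH with a BB step size recomputed once per outer loop (sketch)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = 2 * n  # inner-loop length: a fixed, illustrative choice

    def full_grad(x):
        return A.T @ (A @ x - b) / n + lam * x

    def sample_grad(x, i):
        return A[i] * (A[i] @ x - b[i]) + lam * x

    x_tilde = np.zeros(d)
    x_old, g_old = None, None
    eta = eta0  # used only until two snapshots are available
    for _ in range(outer_loops):
        g_full = full_grad(x_tilde)
        if x_old is not None:
            s = x_tilde - x_old
            y = g_full - g_old
            if s @ y > 0:
                # BB step size, scaled down by the inner-loop length.
                eta = (s @ s) / (m * (s @ y))
        x_old, g_old = x_tilde.copy(), g_full.copy()

        # SARAH inner loop: recursive gradient estimate v.
        x_prev = x_tilde.copy()
        v = g_full.copy()
        x = x_prev - eta * v
        for _ in range(m - 1):
            i = rng.integers(n)
            v = sample_grad(x, i) - sample_grad(x_prev, i) + v
            x_prev, x = x, x - eta * v
        x_tilde = x  # last iterate kept as the next snapshot
    return x_tilde, float(np.linalg.norm(full_grad(x_tilde)))

# Hypothetical synthetic data for a quick check.
data_rng = np.random.default_rng(1)
A = data_rng.standard_normal((50, 5))
b = A @ np.ones(5) + 0.1 * data_rng.standard_normal(50)

# Gradient norm at the starting point x = 0, for comparison.
initial_norm = float(np.linalg.norm(A.T @ (-b) / 50))
x_hat, final_norm = sarah_bb(A, b)
```

Only eta0 and m remain to be chosen by hand in this sketch; the abstract's point is that tying the inner-loop length to the BB step size removes most of that remaining tuning.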