Unified analysis of SGD-type methods
This note focuses on a simple approach to the unified analysis of SGD-type
methods from (Gorbunov et al., 2020) for strongly convex smooth optimization
problems. The similarities in the analyses of different stochastic first-order
methods are discussed along with the existing extensions of the framework. The
limitations of the analysis and several alternative approaches are mentioned as
well.
Comment: Part of the Encyclopedia of Optimization. 8 pages.
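To make the setting concrete, below is a minimal sketch of one method covered by such a unified analysis: plain SGD with an unbiased stochastic gradient on a strongly convex smooth quadratic. The helper names (`sgd`, `grad_estimate`, `noisy_grad`) and the test problem are illustrative and not taken from the cited note.

```python
import numpy as np

def sgd(grad_estimate, x0, gamma, n_iters, rng=None):
    """Plain SGD: x_{k+1} = x_k - gamma * g(x_k), where E[g(x)] = grad f(x)."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iters):
        x = x - gamma * grad_estimate(x, rng)
    return x

# Illustrative strongly convex smooth objective f(x) = 0.5 * ||A x - b||^2
# with an unbiased noisy gradient oracle (Gaussian perturbation).
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])

def noisy_grad(x, rng):
    return A.T @ (A @ x - b) + 0.01 * rng.standard_normal(x.shape)

x_hat = sgd(noisy_grad, x0=np.zeros(2), gamma=0.1, n_iters=1000)
```

With a constant step size, SGD of this form converges linearly to a noise-dominated neighborhood of the minimizer, which is the prototypical conclusion of the unified analysis for strongly convex smooth problems.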
Single-Call Stochastic Extragradient Methods for Structured Non-monotone Variational Inequalities: Improved Analysis under Weaker Conditions
Single-call stochastic extragradient methods, like stochastic past
extragradient (SPEG) and stochastic optimistic gradient (SOG), have gained a
lot of interest in recent years and are among the most efficient algorithms
for solving large-scale min-max optimization and variational inequality
problems (VIPs) appearing in various machine learning tasks. However, despite
their undoubted popularity, current convergence analyses of SPEG and SOG
require a bounded variance assumption. In addition, several important questions
regarding the convergence properties of these methods are still open, including
mini-batching, efficient step-size selection, and convergence guarantees under
different sampling strategies. In this work, we address these questions and
provide convergence guarantees for two large classes of structured non-monotone
VIPs: (i) quasi-strongly monotone problems (a generalization of strongly
monotone problems) and (ii) weak Minty variational inequalities (a
generalization of monotone and Minty VIPs). We introduce the expected residual
condition, explain its benefits, and show how it can be used to obtain a
strictly weaker bound than previously used growth conditions, expected
co-coercivity, or bounded variance assumptions. Equipped with this condition,
we provide theoretical guarantees for the convergence of single-call
extragradient methods for different step-size selections, including constant,
decreasing, and step-size-switching rules. Furthermore, our convergence
analysis holds under the arbitrary sampling paradigm, which includes importance
sampling and various mini-batching strategies as special cases.
Comment: 37th Conference on Neural Information Processing Systems (NeurIPS 2023).
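For intuition, here is a minimal sketch of the past-extragradient (Popov-style) update that SPEG builds on, with a constant step size. The names (`speg`, `operator`, `noisy_field`) and the bilinear test problem are illustrative; this is not the paper's exact algorithm and omits its sampling and step-size-switching machinery.

```python
import numpy as np

def speg(operator, z0, gamma, n_iters, rng=None):
    """Stochastic past extragradient (Popov-style) sketch.

    `operator(z, rng)` returns an unbiased stochastic estimate of F(z).
    Only one fresh operator evaluation is made per iteration ("single-call"):
    the extrapolation step reuses the estimate from the previous iteration.
    """
    rng = rng or np.random.default_rng(0)
    z = np.asarray(z0, dtype=float).copy()
    g_prev = operator(z, rng)          # warm-up evaluation at z0
    for _ in range(n_iters):
        z_bar = z - gamma * g_prev     # extrapolate with the *past* estimate
        g_prev = operator(z_bar, rng)  # the single fresh call this iteration
        z = z - gamma * g_prev         # update with the new estimate
    return z

# Illustrative noisy bilinear game min_x max_y x*y, i.e. F(x, y) = (y, -x),
# a classic monotone VIP on which plain gradient descent-ascent diverges.
def noisy_field(z, rng):
    x, y = z
    return np.array([y, -x]) + 0.01 * rng.standard_normal(2)

z_hat = speg(noisy_field, z0=np.array([1.0, 1.0]), gamma=0.2, n_iters=2000)
```

A decreasing or switching step-size rule, as analyzed in the paper, would replace the constant `gamma` inside the loop; with a constant step the iterates converge only to a noise-dominated neighborhood of the solution.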