The optimistic gradient method is useful in addressing minimax optimization
problems. Motivated by the observation that the conventional stochastic version
suffers from the need for a large batch size on the order of
O(ε^{-2}) to achieve an ε-stationary
solution, we introduce and analyze a new formulation termed Diffusion
Stochastic Same-Sample Optimistic Gradient (DSS-OG). We prove its convergence
and resolve the large batch issue by establishing a tighter upper bound, under
the more general setting of nonconvex-Polyak-Łojasiewicz (PL) risk functions.
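As a rough single-agent illustration of the same-sample optimistic gradient idea (the exact DSS-OG recursion is developed in the body of the paper; the oracle names grad_w, grad_theta, the step size mu, and the update form shown here are illustrative placeholders), one step may be sketched as follows:

```python
def same_sample_og_step(w, theta, w_prev, theta_prev,
                        grad_w, grad_theta, sample, mu=0.01):
    """One same-sample stochastic optimistic-gradient step for
    min_w max_theta E[f(w, theta; xi)].

    grad_w / grad_theta are stochastic gradient oracles. The gradients at the
    current AND previous iterates are both evaluated on the same minibatch
    `sample` (the 'same-sample' evaluation); the two oracle queries can be
    issued in parallel at the cost of storing the previous iterate.
    """
    gw_cur = grad_w(w, theta, sample)
    gt_cur = grad_theta(w, theta, sample)
    gw_prev = grad_w(w_prev, theta_prev, sample)
    gt_prev = grad_theta(w_prev, theta_prev, sample)

    # Optimistic extrapolation: 2 * g(current) - g(previous).
    w_new = w - mu * (2.0 * gw_cur - gw_prev)          # descent in w
    theta_new = theta + mu * (2.0 * gt_cur - gt_prev)  # ascent in theta
    return w_new, theta_new
```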
We also extend the applicability of the proposed method to the distributed
scenario, where agents communicate with their neighbors via a left-stochastic
protocol. To implement DSS-OG, we can query the stochastic gradient oracles in
parallel with some extra memory overhead, resulting in a complexity comparable
to its conventional counterpart. To demonstrate the efficacy of the proposed
algorithm, we conduct tests by training generative adversarial networks.
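To illustrate the left-stochastic combination protocol mentioned above, the following is a minimal adapt-then-combine diffusion sketch (not the exact DSS-OG recursion); the names W, A, grads, and mu are hypothetical placeholders, and left-stochastic means each column of A sums to one.

```python
import numpy as np

def diffusion_round(W, A, grads, mu=0.01):
    """One adapt-then-combine diffusion round over K agents.

    W:     (K, d) array, row k is agent k's current iterate.
    A:     (K, K) left-stochastic combination matrix (each column sums to one);
           A[l, k] is the weight agent k assigns to neighbor l.
    grads: callable mapping (k, w_k) to agent k's stochastic gradient at w_k.
    """
    K = W.shape[0]
    # Adapt: each agent takes a local stochastic-gradient step.
    Psi = np.stack([W[k] - mu * grads(k, W[k]) for k in range(K)])
    # Combine: agent k averages its neighbors' intermediate iterates,
    # w_k <- sum_l A[l, k] * psi_l.
    return A.T @ Psi
```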