On the Infimal Sub-differential Size of Primal-Dual Hybrid Gradient Method and Beyond
The primal-dual hybrid gradient method (PDHG, a.k.a. the Chambolle-Pock method)
is a well-studied algorithm for minimax optimization problems with a bilinear
interaction term. Recently, PDHG has been used as the base algorithm for PDLP,
a new LP solver that aims to solve large LP instances by taking advantage of
modern computing resources such as GPUs and distributed systems. Most previous
convergence results for PDHG bound either the duality gap or the distance to
the optimal solution set, both of which are usually hard to compute during the
solving process. In this paper, we propose a new progress metric for analyzing PDHG,
which we dub infimal sub-differential size (IDS), by utilizing the geometry of
PDHG iterates. IDS is a natural extension of the gradient norm from smooth
problems to non-smooth problems, and it is tied to the KKT error in the case of
LP. Compared to traditional progress metrics for PDHG, IDS always has a finite
value and can be computed using only information from the current solution. We
show that IDS decays monotonically at a sublinear rate for solving
convex-concave primal-dual problems, and that it attains a
linear convergence rate if the problem further satisfies a regularity condition,
which holds for applications such as linear programming, quadratic programming,
and the TV-denoising model. The simplicity of our analysis and the monotonic
decay of IDS suggest that IDS is a natural progress metric for analyzing PDHG.
As a by-product of our analysis, we show that the primal-dual gap of the last
iterate of PDHG also converges for convex-concave problems. The analysis and
results on PDHG directly generalize to other primal-dual algorithms, for
example, the proximal point method (PPM), the alternating direction method of
multipliers (ADMM), and the linearized alternating direction method of
multipliers (l-ADMM).
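As a concrete illustration of the base algorithm (not of the paper's IDS analysis), here is a minimal PDHG sketch on a toy equality-constrained LP. The problem data, step sizes, and the KKT-style residual at the end are our own illustrative choices:

```python
import numpy as np

# Toy LP:  minimize c^T x  s.t.  Ax = b, x >= 0,
# solved via the saddle point of L(x, y) = c^T x + y^T (b - Ax).
c = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

tau = sigma = 0.4                  # step sizes; need tau * sigma * ||A||^2 < 1
x, y = np.zeros(2), np.zeros(1)
for _ in range(2000):
    x_next = np.maximum(0.0, x - tau * (c - A.T @ y))  # projected primal step
    y = y + sigma * (b - A @ (2 * x_next - x))         # dual step with extrapolation
    x = x_next

# A computable KKT-style residual in the spirit of an IDS-like progress metric
# (the paper's exact IDS formula is not reproduced here): primal feasibility
# plus complementarity between x and the reduced costs.
kkt = np.linalg.norm(b - A @ x) + np.linalg.norm(np.minimum(x, c - A.T @ y))
```

Note that this residual, like IDS, needs only the current iterate, whereas the duality gap or distance to the optimal set would require information that is unavailable during the solve.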
Doubly Optimal No-Regret Learning in Monotone Games
We consider online learning in multi-player smooth monotone games. Existing
algorithms have limitations such as (1) being applicable only to strongly
monotone games; (2) lacking a no-regret guarantee; or (3) having only
asymptotic or slow last-iterate convergence to a Nash equilibrium. While the
slower rate is tight for a large class of algorithms, including the
well-studied extragradient and optimistic gradient algorithms, it is not
optimal for all gradient-based algorithms.
We propose the accelerated optimistic gradient (AOG) algorithm, the first
doubly optimal no-regret learning algorithm for smooth monotone games. Namely,
our algorithm achieves both (i) the optimal regret in the adversarial setting
under smooth convex loss functions and (ii) the optimal last-iterate
convergence rate to a Nash equilibrium in multi-player smooth monotone games.
As a byproduct of the accelerated last-iterate convergence rate, we further
show that each player suffers only a small individual worst-case dynamic
regret, providing an exponential improvement over the previous
state-of-the-art bound. Comment: Published at ICML 2023. V2 incorporates reviewers' feedback.
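To make the setting concrete, here is a sketch of the extragradient algorithm that the abstract contrasts with, run on a toy bilinear game of our choosing, f(x, y) = x * y, whose unique equilibrium is (0, 0):

```python
import numpy as np

# The monotone operator of f(x, y) = x * y is F(x, y) = (df/dx, -df/dy) = (y, -x).
def F(z):
    x, y = z
    return np.array([y, -x])

eta = 0.5
z = np.array([1.0, 1.0])
for _ in range(300):
    z_half = z - eta * F(z)    # exploration half-step
    z = z - eta * F(z_half)    # update using the gradient at the half-step
# z is now close to the equilibrium (0, 0)
```

For contrast, plain simultaneous gradient descent ascent spirals outward on this game; the extrapolation step is what makes last-iterate convergence possible here.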
Semi-Anchored Multi-Step Gradient Descent Ascent Method for Structured Nonconvex-Nonconcave Composite Minimax Problems
Minimax problems, such as generative adversarial networks, adversarial
training, and fair training, are widely solved in practice by the multi-step
gradient descent ascent (MGDA) method. However, its convergence guarantees are
limited. In this paper, inspired by the primal-dual hybrid gradient method, we
propose a new semi-anchoring (SA) technique for the MGDA method. This enables
the MGDA method to find a stationary point of a structured
nonconvex-nonconcave composite minimax problem whose saddle-subdifferential
operator satisfies the weak Minty variational inequality condition. The
resulting method, named SA-MGDA, is built upon the Bregman proximal point
method. We further develop a backtracking line-search version and a
non-Euclidean version for smooth adaptable functions. Numerical experiments,
including fair classification training, are provided.
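For reference, the plain MGDA loop the paper builds on looks as follows, sketched on a toy strongly-convex-strongly-concave problem of our choosing (not the structured nonconvex-nonconcave setting of the paper, and without the semi-anchoring technique):

```python
# Toy problem: f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, saddle point at (0, 0).
x, y = 1.0, 0.0
eta_x, eta_y, inner_steps = 0.2, 0.5, 5
for _ in range(100):
    for _ in range(inner_steps):   # several ascent steps on y per x-step
        y += eta_y * (x - y)       # df/dy = x - y
    x -= eta_x * (x + y)           # one descent step on x; df/dx = x + y
# (x, y) is now close to the saddle point (0, 0)
```

The inner loop approximately solves the maximization over y before each descent step on x; it is this multi-step structure whose convergence guarantees the paper strengthens.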
On Stochastic Subgradient Mirror-Descent Algorithm with Weighted Averaging
This paper considers a stochastic subgradient mirror-descent method for solving
constrained convex minimization problems. In particular, a stochastic
subgradient mirror-descent method with weighted iterate averaging is
investigated and its per-iterate convergence rate is analyzed. The novelty of
the approach lies in the choice of the weights used to construct the averages.
Through the use of these weighted averages, we show that the known optimal
rates can be obtained with simpler algorithms than those currently in the
literature. Specifically, by suitably choosing the stepsize values, one can
obtain the optimal rate for strongly convex functions and the optimal rate for
general (not necessarily differentiable) convex functions. Furthermore, for
the latter case, it is shown that a stochastic subgradient mirror-descent
method with iterate averaging converges (along a subsequence) to an optimal
solution, almost surely, even with such stepsize choices, which was not
previously known. The stepsize choices that achieve the best rates are those
proposed by Paul Tseng for the acceleration of proximal gradient methods.
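A minimal sketch of the scheme, using the Euclidean mirror map (so mirror descent reduces to the subgradient method). The objective, noise model, stepsizes, and weights w_k = k below are illustrative choices, not the exact ones analyzed in the paper:

```python
import numpy as np

# Objective: f(x) = |x - 3|, observed through noisy subgradients.
rng = np.random.default_rng(0)
x, weighted_sum, weight_total = 0.0, 0.0, 0.0
for k in range(1, 5001):
    g = np.sign(x - 3.0) + 0.5 * rng.standard_normal()  # noisy subgradient
    x -= g / np.sqrt(k)                                 # stepsize ~ 1/sqrt(k)
    weighted_sum += k * x                               # averaging weight w_k = k
    weight_total += k
x_avg = weighted_sum / weight_total  # weighted average of the iterates
```

The point of the weighting is that later, more accurate iterates dominate the average, so the averaged point is a better estimate of the minimizer than any single noisy iterate.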
A Unified View of Large-scale Zero-sum Equilibrium Computation
The task of computing approximate Nash equilibria in large zero-sum
extensive-form games has received a tremendous amount of attention, due mainly
to the Annual Computer Poker Competition. Immediately after its inception, two
competing and seemingly different approaches emerged: one an application of
no-regret online learning, the other a sophisticated gradient method applied
to a convex-concave saddle-point formulation. Since then, both approaches have
grown in relative isolation, with advancements on one side not affecting the
other. In this paper, we rectify this by dissecting and, in a sense, unifying
the two views. Comment: AAAI Workshop on Computer Poker and Imperfect Information
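The no-regret view can be illustrated in a few lines: both players run multiplicative weights (a no-regret learner) in self-play, and the time-averaged strategies approximate the Nash equilibrium. The game (rock-paper-scissors, whose equilibrium is uniform play), learning rate, and horizon are our own toy choices:

```python
import numpy as np

A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])   # row player's payoff matrix
eta, T = 0.01, 50000
wp = np.array([1.0, 0.0, 0.0])    # cumulative payoffs (asymmetric start)
wq = np.array([0.0, 1.0, 0.0])
p_sum = np.zeros(3)
q_sum = np.zeros(3)
for _ in range(T):
    p = np.exp(eta * (wp - wp.max())); p /= p.sum()   # row strategy
    q = np.exp(eta * (wq - wq.max())); q /= q.sum()   # column strategy
    p_sum += p
    q_sum += q
    wp += A @ q                   # expected payoff of each row action
    wq += -A.T @ p                # zero-sum: column payoffs are negated
p_avg, q_avg = p_sum / T, q_sum / T   # averages approach uniform play
```

The last iterates (p, q) cycle and never settle; it is the averages that converge, which is exactly the folk theorem connecting no-regret learning to zero-sum equilibria that the paper's unification starts from.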