3 research outputs found
Stability of Gradient Learning Dynamics in Continuous Games: Vector Action Spaces
Towards characterizing the optimization landscape of games, this paper
analyzes the stability of gradient-based dynamics near fixed points of
two-player continuous games. We introduce the quadratic numerical range as a
method to characterize the spectrum of game dynamics and prove the robustness
of equilibria to variations in learning rates. By decomposing the game Jacobian
into symmetric and skew-symmetric components, we assess the contribution of a
vector field's potential and rotational components to the stability of
differential Nash equilibria. Our results show that in zero-sum games, all Nash
equilibria are stable and robust; in potential games, all stable points are
Nash equilibria. For
general-sum games, we provide a sufficient condition for instability. We
conclude with a numerical example in which learning with timescale separation
results in faster convergence.
Comment: extension of arXiv:2011.03650 to vector action spaces. Submitted to
IEEE L-CS
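
The decomposition mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's code: the matrices A, B, C, the helper names `decompose` and `is_stable`, and the learning-rate ratio `tau` are all assumptions chosen for illustration. The example uses a quadratic zero-sum game f(x, y) = x'Ax/2 + x'By - y'Cy/2, for which the game Jacobian is [[A, B], [-B', C]].

```python
import numpy as np

def decompose(J):
    """Split a square matrix into symmetric and skew-symmetric parts."""
    S = 0.5 * (J + J.T)   # symmetric ("potential") component
    K = 0.5 * (J - J.T)   # skew-symmetric ("rotational") component
    return S, K

def is_stable(J, tau=1.0, n1=None):
    """Check stability of z' = -Lambda J z, where Lambda scales player 2's
    rows by tau. Stable iff every eigenvalue of Lambda J has positive real part."""
    n = J.shape[0]
    n1 = n // 2 if n1 is None else n1          # dimension of player 1's action
    Lam = np.diag([1.0] * n1 + [tau] * (n - n1))
    return bool(np.all(np.linalg.eigvals(Lam @ J).real > 0))

# Hypothetical zero-sum game with positive definite A and C, so the origin
# is a differential Nash equilibrium.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
C = np.array([[1.0, 0.0], [0.0, 3.0]])
B = np.array([[1.0, -2.0], [0.5, 4.0]])
J = np.block([[A, B], [-B.T, C]])

S, K = decompose(J)
for tau in (0.1, 1.0, 10.0, 100.0):
    print(tau, is_stable(J, tau))
```

In this example the symmetric part is block-diagonal with blocks A and C, and the interaction term B appears only in the skew-symmetric part, which is why the equilibrium stays stable for every learning-rate ratio tried.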
Gradient Descent-Ascent Provably Converges to Strict Local Minmax Equilibria with a Finite Timescale Separation
We study the role that a finite timescale separation parameter $\tau$ has on
gradient descent-ascent in two-player non-convex, non-concave zero-sum games
where the learning rate of player 1 is denoted by $\gamma_1$ and the learning
rate of player 2 is defined to be $\gamma_2 = \tau\gamma_1$. Existing work
analyzing the role of timescale separation in gradient descent-ascent has
primarily focused on the edge cases of players sharing a learning rate
($\tau = 1$) and the maximizing player approximately converging between each
update of the minimizing player ($\tau \rightarrow \infty$). For the parameter
choice of $\tau = 1$, it is known that the learning dynamics are not guaranteed
to converge to a game-theoretically meaningful equilibrium in general. In
contrast, Jin et al. (2020) showed that the stable critical points of gradient
descent-ascent coincide with the set of strict local minmax equilibria as
$\tau \rightarrow \infty$. In this work, we bridge the gap between past work by
showing there exists a finite timescale separation parameter $\tau^{\ast}$ such
that $x^{\ast}$ is a stable critical point of gradient descent-ascent for all
$\tau \in (\tau^{\ast}, \infty)$ if and only if it is a strict local minmax
equilibrium. Moreover, we provide an explicit construction for computing
$\tau^{\ast}$ along with corresponding convergence rates and results under
deterministic and stochastic gradient feedback. The convergence results we
present are complemented by a non-convergence result: given a critical point
$x^{\ast}$ that is not a strict local minmax equilibrium, there exists a
finite timescale separation $\tau_0$ such that $x^{\ast}$ is unstable for all
$\tau \in (\tau_0, \infty)$. Finally, we empirically demonstrate on the CIFAR-10
and CelebA datasets the significant impact timescale separation has on training
performance.
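
The effect of timescale separation described in the abstract above can be seen on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's construction: the quadratic f(x, y) = -x^2/2 + 2xy - y^2/2, the function names `grad_f`, `tau_gda`, and `jacobian`, and the step size are all hypothetical. Player 1 minimizes over x with learning rate gamma, and player 2 maximizes over y with learning rate tau * gamma. The origin is a strict local minmax point of this f, and a hand calculation of the trace and determinant of the 2x2 Jacobian suggests that the continuous-time dynamics are stable exactly when tau > 1.

```python
import numpy as np

def grad_f(x, y):
    """Partial derivatives of f(x, y) = -0.5*x**2 + 2*x*y - 0.5*y**2."""
    return -x + 2.0 * y, 2.0 * x - y

def tau_gda(tau, gamma=0.01, steps=2000, x0=1.0, y0=1.0):
    """Simultaneous gradient descent-ascent; the maximizer's step is tau * gamma.
    Returns the distance of the final iterate from the critical point (0, 0)."""
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x, y = x - gamma * gx, y + tau * gamma * gy
    return float(np.hypot(x, y))

def jacobian(tau):
    """Jacobian of the tau-scaled vector field (df/dx, -tau * df/dy)."""
    return np.array([[-1.0, 2.0], [-2.0 * tau, tau]])

for tau in (0.5, 4.0):
    real_parts = np.linalg.eigvals(jacobian(tau)).real
    print(f"tau={tau}: eig real parts={real_parts}, final distance={tau_gda(tau)}")
```

With tau = 4 the iterates contract toward the origin, while with tau = 0.5 they spiral outward, even though the critical point and the objective are unchanged.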
An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective
Following the remarkable success of the AlphaGo series, 2019 was a booming
year that witnessed significant advances in multi-agent reinforcement learning
(MARL) techniques. MARL corresponds to the learning problem in a multi-agent
system in which multiple agents learn simultaneously. It is an
interdisciplinary domain with a long history that includes game theory, machine
learning, stochastic control, psychology, and optimisation. Although MARL has
achieved considerable empirical success in solving real-world games, there is a
lack of a self-contained overview in the literature that elaborates the game
theoretical foundations of modern MARL methods and summarises the recent
advances. In fact, the majority of existing surveys are outdated and do not
fully cover the recent developments since 2010. In this work, we provide a
monograph on MARL that covers both the fundamentals and the latest developments
in the research frontier. The goal of our monograph is to provide a
self-contained assessment of the current state-of-the-art MARL techniques from
a game theoretical perspective. We expect this work to serve as a stepping
stone for both new researchers who are about to enter this fast-growing domain
and existing domain experts who want to obtain a panoramic view and identify
new directions based on recent advances.