1,859 research outputs found

The Mechanics of n-Player Differentiable Games

Authors: David Balduzzi, Jakob Foerster, Thore Graepel, James Martens, Sebastien Racaniere, Karl Tuyls
Publication date: 01/01/2018

The cornerstone underpinning deep learning is the guarantee that gradient descent on an objective converges to local minima. Unfortunately, this guarantee fails in settings, such as generative adversarial nets, where there are multiple interacting losses. The behavior of gradient-based methods in games is not well understood – and is becoming increasingly important as adversarial and multiobjective architectures proliferate. In this paper, we develop new techniques to understand and control the dynamics in general games. The key result is to decompose the second-order dynamics into two components. The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems. The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in general games. Basic experiments show SGA is competitive with recently proposed algorithms for finding stable fixed points in GANs – whilst at the same time being applicable to – and having guarantees in – much more general games.

arXiv.org e-Print Archive

University of Liverpool Repository

UCL Discovery

Differentiable Game Mechanics

Authors: David Balduzzi, Jakob Foerster, Thore Graepel, Alistair Letcher, James Martens, Sebastien Racaniere, Karl Tuyls
Publication date: 20/01/2019

Deep learning is built on the foundational guarantee that gradient descent on an objective function converges to local minima. Unfortunately, this guarantee fails in settings, such as generative adversarial nets, that exhibit multiple interacting losses. The behavior of gradient-based methods in games is not well understood -- and is becoming increasingly important as adversarial and multi-objective architectures proliferate. In this paper, we develop new tools to understand and control the dynamics in n-player differentiable games. The key result is to decompose the game Jacobian into two components. The first, symmetric component, is related to potential games, which reduce to gradient descent on an implicit function. The second, antisymmetric component, relates to Hamiltonian games, a new class of games that obey a conservation law akin to conservation laws in classical mechanical systems. The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games. Basic experiments show SGA is competitive with recently proposed algorithms for finding stable fixed points in GANs -- while at the same time being applicable to, and having guarantees in, much more general cases.

Comment: JMLR 2019, journal version of arXiv:1802.0564

arXiv.org e-Print Archive

UCL Discovery
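
As a rough illustration of the Jacobian decomposition described in the Differentiable Game Mechanics abstract above, the sketch below splits the Jacobian of a small two-player game into its symmetric and antisymmetric parts and compares plain simultaneous gradient descent with an adjusted update. The example game (losses x*y and -x*y), the adjustment form xi + lam * A^T xi, the step size, and every function name are assumptions chosen for illustration; the abstract itself does not spell out the update rule.

```python
# Minimal sketch, not the authors' implementation.
# Assumed game: player 1 controls x and minimises x*y,
#               player 2 controls y and minimises -x*y.

def simultaneous_gradient(w):
    """xi(w): each player's gradient of its own loss w.r.t. its own variable."""
    x, y = w
    return [y, -x]

# Jacobian of xi for this bilinear game (constant, so written by hand).
J = [[0.0, 1.0],
     [-1.0, 0.0]]

def decompose(jac):
    """Split a square matrix into symmetric (S) and antisymmetric (A) parts."""
    n = len(jac)
    S = [[0.5 * (jac[i][j] + jac[j][i]) for j in range(n)] for i in range(n)]
    A = [[0.5 * (jac[i][j] - jac[j][i]) for j in range(n)] for i in range(n)]
    return S, A

def transpose_matvec(M, v):
    """Return M^T v."""
    n = len(v)
    return [sum(M[i][j] * v[i] for i in range(n)) for j in range(n)]

def run(w, steps, eta, lam):
    """Iterate w <- w - eta * (xi + lam * A^T xi); lam = 0 is plain descent."""
    _, A = decompose(J)
    for _ in range(steps):
        xi = simultaneous_gradient(w)
        adjustment = transpose_matvec(A, xi)
        w = [wi - eta * (g + lam * a) for wi, g, a in zip(w, xi, adjustment)]
    return w

def norm(w):
    """Euclidean norm of a list of floats."""
    return sum(c * c for c in w) ** 0.5
```

In this toy game the symmetric part is zero and the antisymmetric part equals the full Jacobian. Calling run([1.0, 1.0], 200, 0.1, 0.0) spirals away from the fixed point at the origin, while run([1.0, 1.0], 200, 0.1, 1.0) contracts toward it.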

Deflation for semismooth equations

Authors: Matteo Croci, Patrick E. Farrell, Thomas M. Surowiec
Publication date: 01/01/2019

Variational inequalities can in general support distinct solutions. In this paper we study an algorithm for computing distinct solutions of a variational inequality, without varying the initial guess supplied to the solver. The central idea is the combination of a semismooth Newton method with a deflation operator that eliminates known solutions from consideration. Given one root of a semismooth residual, deflation constructs a new problem for which a semismooth Newton method will not converge to the known root, even from the same initial guess. This enables the discovery of other roots. We prove the effectiveness of the deflation technique under the same assumptions that guarantee locally superlinear convergence of a semismooth Newton method. We demonstrate its utility on various finite- and infinite-dimensional examples drawn from constrained optimization, game theory, economics and solid mechanics.

Comment: 24 pages, 3 figures

arXiv.org e-Print Archive

Oxford University Research Archive
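
The deflation idea in the abstract above can be sketched on a scalar problem. The example below is a simplification under stated assumptions: it uses ordinary Newton iteration on a smooth function (x^2 - 1) rather than a semismooth Newton method on a variational inequality, and the deflation factor 1/|x - r|^power + shift, the starting point 0.25, and all function names are illustrative choices rather than details taken from the paper.

```python
# Minimal sketch of deflation with plain (smooth) Newton iteration.
# Assumption: the semismooth setting of the paper is replaced by a smooth
# scalar function so that the example stays short.

def newton(g, dg, x0, tol=1e-12, max_iter=100):
    """Newton iteration for g(x) = 0 starting from x0."""
    x = x0
    for _ in range(max_iter):
        gx = g(x)
        if abs(gx) < tol:
            return x
        x = x - gx / dg(x)
    raise RuntimeError("Newton iteration did not converge")

def deflate(f, df, roots, power=1, shift=1.0):
    """Return (g, dg) where g(x) = M(x) * f(x) and M blows up at known roots."""

    def factor(x, r):
        return 1.0 / abs(x - r) ** power + shift

    def dfactor(x, r):
        # Derivative of 1/|x - r|^power with respect to x.
        return -power * (x - r) / abs(x - r) ** (power + 2)

    def g(x):
        m = 1.0
        for r in roots:
            m *= factor(x, r)
        return m * f(x)

    def dg(x):
        m = 1.0
        for r in roots:
            m *= factor(x, r)
        # Product rule over the individual deflation factors.
        dm = sum(m / factor(x, r) * dfactor(x, r) for r in roots)
        return dm * f(x) + m * df(x)

    return g, dg

def f(x):
    return x * x - 1.0

def df(x):
    return 2.0 * x

def find_two_roots(x0=0.25):
    """Find one root, deflate it, and search again from the same guess."""
    first = newton(f, df, x0)
    g, dg = deflate(f, df, [first])
    second = newton(g, dg, x0)
    return first, second
```

Here find_two_roots() first converges to the root near 1 and then, from the same initial guess, to the root near -1, because the deflated residual no longer vanishes at the first root.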

Competitive Gradient Descent

Authors: Anima Anandkumar, Florian Schäfer
Publication date: 01/12/2019

We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games. Our method is a natural generalization of gradient descent to the two-player setting where the update is given by the Nash equilibrium of a regularized bilinear local approximation of the underlying game. It avoids oscillatory and divergent behaviors seen in alternating gradient descent. Using numerical experiments and rigorous analysis, we provide a detailed comparison to methods based on optimism and consensus and show that our method avoids making any unnecessary changes to the gradient dynamics while achieving exponential (local) convergence for (locally) convex-concave zero-sum games. Convergence and stability properties of our method are robust to strong interactions between the players, without adapting the stepsize, which is not the case with previous methods. In our numerical experiments on non-convex-concave problems, existing methods are prone to divergence and instability due to their sensitivity to interactions among the players, whereas we never observe divergence of our algorithm. The ability to choose larger stepsizes furthermore allows our algorithm to achieve faster convergence, as measured by the number of model evaluations.

Comment: Appeared in NeurIPS 2019. This version corrects an error in theorem 2.2. Source code used for the numerical experiments can be found under http://github.com/f-t-s/CGD. A high-level overview of this work can be found under http://f-t-s.github.io/projects/cgd

arXiv.org e-Print Archive

Caltech Authors
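
To make the "Nash equilibrium of a regularized bilinear local approximation" in the Competitive Gradient Descent abstract above concrete, here is a sketch for the simplest case: a zero-sum game f(x, y) = alpha * x * y with scalar players, where x minimises f and y maximises it. The closed-form step is derived here from the first-order conditions of the local game for this scalar case; the test game, parameter values, and function names are assumptions for illustration and are not taken from the authors' code.

```python
# Minimal sketch for scalar players on f(x, y) = alpha * x * y.
# Assumption: the local game is
#   x minimises  dx*f_x + dx*f_xy*dy + dx^2 / (2*eta)
#   y maximises  dy*f_y + dy*f_xy*dx - dy^2 / (2*eta)
# whose first-order conditions are
#   dx = -eta * (f_x + f_xy * dy),   dy = eta * (f_y + f_xy * dx).

def cgd_step(x, y, alpha, eta):
    """One competitive-gradient-style step obtained by solving the local game."""
    f_x = alpha * y      # partial derivative of f w.r.t. x
    f_y = alpha * x      # partial derivative of f w.r.t. y
    f_xy = alpha         # mixed second derivative
    denom = 1.0 + (eta * f_xy) ** 2
    dx = -eta * (f_x + eta * f_xy * f_y) / denom
    dy = eta * (f_y - eta * f_xy * f_x) / denom
    return x + dx, y + dy

def gda_step(x, y, alpha, eta):
    """Simultaneous gradient descent-ascent, for comparison."""
    return x - eta * alpha * y, y + eta * alpha * x

def run(step, x, y, alpha, eta, steps):
    """Apply a step function repeatedly and return the final iterate."""
    for _ in range(steps):
        x, y = step(x, y, alpha, eta)
    return x, y

def distance_to_origin(point):
    """Distance from the game's fixed point at (0, 0)."""
    x, y = point
    return (x * x + y * y) ** 0.5
```

With a strong interaction (alpha = 10) and an unchanged step size (eta = 0.2), run(cgd_step, 1.0, 1.0, 10.0, 0.2, 100) contracts toward the origin, whereas run(gda_step, 1.0, 1.0, 10.0, 0.2, 100) moves steadily away from it.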

Open-ended Learning in Symmetric Zero-sum Games

Authors: Yoram Bachrach, David Balduzzi, Wojciech M. Czarnecki, Marta Garnelo, Thore Graepel, Max Jaderberg, Julien Perolat
Publication date: 01/01/2019

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective -- we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.

Comment: ICML 2019, final version

arXiv.org e-Print Archive

UCL Discovery

Dominant Strategies in Two Qubit Quantum Computations

Author: Faisal Shah Khan
Publication venue: Springer Science and Business Media LLC
Publication date: 04/02/2015

Nash equilibrium is a solution concept in non-strictly competitive, non-cooperative game theory that finds applications in various scientific and engineering disciplines. A non-strictly competitive, non-cooperative game model is presented here for two qubit quantum computations that allows for the characterization of Nash equilibrium in these computations via the inner product of their state space. Nash equilibrium outcomes are optimal under given constraints and therefore offer a game-theoretic measure of constrained optimization of two qubit quantum computations.

Comment: The abstract has been re-written and technical details added to section 5 in version

arXiv.org e-Print Archive

CiteSeerX