Modelling Behavioural Diversity for Learning in Open-Ended Games

Author: Mguni David Henry
Nieves Nicolas Perez
Slumbers Oliver
Wang Jun
Wen Ying
Yang Yaodong
Publication date: 10/06/2021

Promoting behavioural diversity is critical for solving games with non-transitive dynamics, where strategic cycles exist and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on determinantal point processes (DPP). By incorporating the diversity metric into best-response dynamics, we develop diverse fictitious play and diverse policy-space response oracle for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric is guaranteed to enlarge the gamescape, i.e. the convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve exploitability at least as low as, and in most games lower than, that of PSRO solvers by finding effective and diverse strategies.

arXiv.org e-Print Archive

UCL Discovery
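
The abstract above names a DPP-based diversity metric but does not state its formula. The following is a minimal sketch of the general idea only, under stated assumptions: each strategy is represented by its row of payoffs, the kernel is the Gram matrix of those rows, and the unnormalised DPP weight of a population is the determinant of that kernel. The names `gram`, `det`, `dpp_weight`, and `PAYOFF_ROWS` are hypothetical, and the paper's actual metric may differ.

```python
def gram(rows):
    """Kernel L = M M^T: inner products between payoff rows (an assumed feature choice)."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    sign, result = 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular kernel: the strategies are linearly dependent
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def dpp_weight(payoff_rows, subset):
    """Unnormalised DPP probability of `subset`: determinant of its kernel submatrix."""
    return det(gram([payoff_rows[i] for i in subset]))


# Rock-Paper-Scissors payoff rows for rock and paper, plus a duplicate of rock.
PAYOFF_ROWS = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [0, -1, 1],   # rock again (adds no diversity)
]

print(dpp_weight(PAYOFF_ROWS, [0, 1]))  # two distinct strategies
print(dpp_weight(PAYOFF_ROWS, [0, 2]))  # a strategy paired with its duplicate
```

In this toy setting the pair of distinct strategies gets weight 3.0 while the duplicated pair gets 0.0, which illustrates why a determinant-based score penalises redundant populations.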

Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems

Author: Ammar Haitham Bou
Graves Daniel
Luo Jun
Slumbers Oliver
Taylor Matthew E.
Wang Jun
Wen Ying
Yang Yaodong
Publication date: 16/02/2021

Multiagent reinforcement learning (MARL) has achieved a remarkable amount of success in solving various types of video games. A cornerstone of this success is the auto-curriculum framework, which shapes the learning process by continually creating new challenging tasks for agents to adapt to, thereby facilitating the acquisition of new skills. In order to extend MARL methods to real-world domains outside of video games, we envision in this blue sky paper that maintaining a diversity-aware auto-curriculum is critical for successful MARL applications. Specifically, we argue that behavioural diversity is a pivotal, yet under-explored, component for real-world multiagent learning systems, and that significant work remains in understanding how to design a diversity-aware auto-curriculum. We list four open challenges for auto-curriculum techniques, which we believe deserve more attention from this community. Towards validating our vision, we recommend modelling realistic interactive behaviours in autonomous driving as an important test bed, and recommend the SMARTS/ULTRA benchmark. Comment: AAMAS 202

arXiv.org e-Print Archive

UCL Discovery

Neural Auto-Curricula in Two-Player Zero-Sum Games

Author: Feng X
Liu B
McAleer S
Slumbers O
Wan Z
Wang J
Wen Y
Yang Y
Publication date: 01/01/2021

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population. Within such a process, the update rules of "who to compete with" (i.e., the opponent mixture) and "how to beat them" (i.e., finding best responses) are underpinned by manually developed game-theoretical principles such as fictitious play and Double Oracle. In this paper, we introduce a novel framework, Neural Auto-Curricula (NAC), that leverages meta-gradient descent to automate the discovery of the learning update rule without explicit human design. Specifically, we parameterise the opponent selection module by neural networks and the best-response module by optimisation subroutines, and update their parameters solely via interaction with the game engine, where both players aim to minimise their exploitability. Surprisingly, even without human design, the discovered MARL algorithms achieve performance competitive with, or even better than, state-of-the-art population-based game solvers (e.g., PSRO) on Games of Skill, differentiable Lotto, non-transitive Mixture Games, Iterated Matching Pennies, and Kuhn Poker. Additionally, we show that NAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO on Leduc Poker. Our work inspires a promising future direction to discover general MARL algorithms solely from data.

UCL Discovery
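
For readers unfamiliar with the hand-designed baselines the abstract above refers to, here is a minimal sketch of textbook fictitious play on a zero-sum matrix game, together with an exploitability measure. It is not NAC, which learns its update rule; the function names, the Matching Pennies payoff matrix, and the tie-breaking choices are assumptions made for illustration.

```python
def exploitability(payoff, row_mix, col_mix):
    """Total gain available to both players from deviating to a best response.

    payoff[i][j] is the row player's payoff; the column player receives its negative.
    The value is zero exactly at a Nash equilibrium of the zero-sum game.
    """
    row_values = [sum(p * q for p, q in zip(row, col_mix)) for row in payoff]
    col_values = [
        sum(row_mix[i] * payoff[i][j] for i in range(len(payoff)))
        for j in range(len(payoff[0]))
    ]
    return max(row_values) - min(col_values)


def fictitious_play(payoff, iterations):
    """Each player repeatedly best-responds to the opponent's empirical play."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = col_counts[0] = 1  # arbitrary initial pure strategies
    for _ in range(iterations):
        # Row player maximises, column player minimises, against observed counts.
        row_br = max(
            range(n_rows),
            key=lambda i: sum(payoff[i][j] * col_counts[j] for j in range(n_cols)),
        )
        col_br = min(
            range(n_cols),
            key=lambda j: sum(payoff[i][j] * row_counts[i] for i in range(n_rows)),
        )
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    row_mix = [c / row_total for c in row_counts]
    col_mix = [c / col_total for c in col_counts]
    return row_mix, col_mix


MATCHING_PENNIES = [[1, -1], [-1, 1]]

row_mix, col_mix = fictitious_play(MATCHING_PENNIES, 2000)
print(row_mix, col_mix)
print(exploitability(MATCHING_PENNIES, row_mix, col_mix))
```

The empirical mixtures drift towards the uniform equilibrium, and the printed exploitability shrinks as the iteration count grows, though convergence of fictitious play is slow.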

Paired comparisons for games of chance

Author: Cowan Alex
Publication date: 26/03/2023

We present a Bayesian rating system based on the method of paired comparisons. Our system is a flexible generalization of the well-known Glicko, and in particular can better accommodate games with significant elements of luck. Our system is currently in use in the online game Duelyst II, and in that setting outperforms Glicko2.

arXiv.org e-Print Archive
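
As background for the Glicko baseline mentioned in the abstract above, the following sketch computes the expected score in the original Glicko system. It is not the authors' generalised system, whose details the abstract does not give; the function names are chosen here for illustration.

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant


def g(rd):
    """Attenuation factor: a larger opponent rating deviation (RD) gives a smaller g."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected_score(rating, opp_rating, opp_rd):
    """Expected score (win probability) of a player against one opponent under Glicko."""
    exponent = -g(opp_rd) * (rating - opp_rating) / 400
    return 1 / (1 + 10**exponent)


# The same 200-point rating gap, against a well-established and an uncertain opponent.
print(expected_score(1700, 1500, 30))
print(expected_score(1700, 1500, 350))
```

The more uncertain the opponent's rating, the closer the expected score is pulled towards 0.5, which is the mechanism by which Glicko discounts games against poorly known opponents.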

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

Author: Anthony Thomas
Bachrach Yoram
Bhoopchand Avishkar
Bullard Kalesha
Connor Jerome
Dasagi Vibhavari
De Vylder Bart
Duenez-Guzman Edgar
Elie Romuald
Everett Richard
Gemp Ian
Hennes Daniel
Hughes Edward
Khan Mina
Lanctot Marc
Larson Kate
Lever Guy
Liu Siqi
Marris Luke
McKee Kevin R.
Muller Paul
Perolat Julien
Strub Florian
Tacchetti Andrea
Tarassov Eugene
Tuyls Karl
Wang Zhe
Publication date: 22/09/2022

The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning, ranging from computing approximations to fundamental concepts in game theory, to simulating social dilemmas in rich spatial environments and training 3D humanoids in difficult team coordination tasks. A signature aim of our group is to use the resources and expertise made available to us at DeepMind in deep reinforcement learning to explore multi-agent systems in complex environments and use these benchmarks to advance our understanding. Here, we summarise the recent work of our team and present a taxonomy that we feel highlights many important open challenges in multi-agent research. Comment: Published in AI Communications 202

arXiv.org e-Print Archive

A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

Author: Guo Tiande
Han Congying
Slumbers Oliver
Wang Chenguang
Wang Jun
Yang Yaodong
Zhang Haifeng
Publication date: 04/05/2022

In this paper, we introduce a two-player zero-sum framework between a trainable Solver and a Data Generator to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP). Grounded in Policy Space Response Oracle (PSRO) methods, our two-player framework outputs a population of best-responding Solvers, over which we can mix and output a combined model that achieves the least exploitability against the Generator, and thereby the most generalizable performance on different TSP tasks. We conduct experiments on a variety of TSP instances with different types and sizes. Results suggest that our Solvers achieve state-of-the-art performance even on tasks the Solver has never met, whilst the performance of other deep learning-based Solvers drops sharply due to over-fitting. To demonstrate the principle of our framework, we study the learning outcome of the proposed two-player game and demonstrate that the exploitability of the Solver population decreases during training, and it eventually approximates the Nash equilibrium along with the Generator. Comment: ICLR2022 Gamification and Multiagent Solutions Workshop Spotlight Presentation

arXiv.org e-Print Archive