Search CORE

7,324 research outputs found

A Generalised Method for Empirical Game Theoretic Analysis

Author: Graepel Thore
Lanctot Marc
Leibo Joel Z
Perolat Julien
Tuyls Karl
Publication venue
Publication date: 16/03/2018
Field of study

This paper provides theoretical bounds for empirical game theoretical analysis of complex multi-agent interactions. We provide insights in the empirical meta game showing that a Nash equilibrium of the meta-game is an approximate Nash equilibrium of the true underlying game. We investigate and show how many data samples are required to obtain a close enough approximation of the underlying game. Additionally, we extend the meta-game analysis methodology to asymmetric games. The state-of-the-art has only considered empirical games in which agents have access to the same strategy sets and the payoff structure is symmetric, implying that agents are interchangeable. Finally, we carry out an empirical illustration of the generalised method in several domains, illustrating the theory and evolutionary dynamics of several versions of the AlphaGo algorithm (symmetric), the dynamics of the Colonel Blotto game played by human players on Facebook (symmetric), and an example of a meta-game in Leduc Poker (asymmetric), generated by the PSRO multi-agent learning algorithm.Comment: will appear at AAMAS'1

arXiv.org e-Print Archive

UCL Discovery

On Similarities between Inference in Game Theory and Machine Learning

Author: Dash Rajdeep
Jennings Nick
Leslie D.
Reece S
Rezek I
Roberts S
Rogers Alex
Publication venue
Publication date: 01/01/2008
Field of study

In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play) we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms. We first show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory. By analogy with fictitious play, we then suggest an improved update rule, and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm, that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution)

CiteSeerX

Southampton (e-Prints Soton)

Oxford University Research Archive

Spiral - Imperial College Digital Repository

Lancaster E-Prints

Explore Bristol Research

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Author: Heinrich Johannes
Silver David
Publication venue
Publication date: 03/03/2016
Field of study

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.Comment: updated version, incorporating conference feedbac

arXiv.org e-Print Archive

UCL Discovery

Voting Power and Voting Blocs

Author: Leech Dennis
Leech Robert
Publication venue
Publication date
Field of study

We investigate the applicability of voting power indices, in particular the Penrose index (aka absolute Banzhaf index), in the analysis of voting blocs by means of a hypothetical voting body. We use the power of individual bloc members to study the implications of the formation of blocs and how voting power varies as bloc size varies. This technique of analysis has many real world applications to legislatures and international bodies. It can be generalised in many ways : the analysis is a priori (assuming formal voting and ignoring actual voting behaviour) but can be made empirical with voting data ; it examines the consequences of two blocs but can easily be extended to more.

Research Papers in Economics

Information Causality, the Tsirelson Bound, and the 'Being-Thus' of Things

Author: Cuffaro Michael E.
Publication venue: 'Elsevier BV'
Publication date: 29/08/2017
Field of study

The principle of `information causality' can be used to derive an upper bound---known as the `Tsirelson bound'---on the strength of quantum mechanical correlations, and has been conjectured to be a foundational principle of nature. To date, however, it has not been sufficiently motivated to play such a foundational role. The motivations that have so far been given are, as I argue, either unsatisfactorily vague or appeal to little if anything more than intuition. Thus in this paper I consider whether some way might be found to successfully motivate the principle. And I propose that a compelling way of so doing is to understand it as a generalisation of Einstein's principle of the mutually independent existence---the `being-thus'---of spatially distant things. In particular I first describe an argument, due to Demopoulos, to the effect that the so-called `no-signalling' condition can be viewed as a generalisation of Einstein's principle that is appropriate for an irreducibly statistical theory such as quantum mechanics. I then argue that a compelling way to motivate information causality is to in turn consider it as a further generalisation of the Einsteinian principle that is appropriate for a theory of communication. I describe, however, some important conceptual obstacles that must yet be overcome if the project of establishing information causality as a foundational principle of nature is to succeed.Comment: '*' footnote added to page 1; 24 pages, 1 figure; Forthcoming in Studies in History and Philosophy of Modern Physic

arXiv.org e-Print Archive

PhilSci Archive

Open-ended Learning in Symmetric Zero-sum Games

Author: Bachrach Yoram
Balduzzi David
Czarnecki Wojciech M.
Garnelo Marta
Graepel Thore
Jaderberg Max
Perolat Julien
Publication venue
Publication date: 01/01/2019
Field of study

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them `winner' and `loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective -- we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.Comment: ICML 2019, final versio

arXiv.org e-Print Archive

UCL Discovery