    A Generalised Method for Empirical Game Theoretic Analysis

    This paper provides theoretical bounds for empirical game-theoretic analysis of complex multi-agent interactions. We provide insights into the empirical meta-game, showing that a Nash equilibrium of the meta-game is an approximate Nash equilibrium of the true underlying game. We investigate and show how many data samples are required to obtain a sufficiently close approximation of the underlying game. Additionally, we extend the meta-game analysis methodology to asymmetric games. The state of the art has only considered empirical games in which agents have access to the same strategy sets and the payoff structure is symmetric, implying that agents are interchangeable. Finally, we carry out an empirical illustration of the generalised method in several domains, illustrating the theory and evolutionary dynamics of several versions of the AlphaGo algorithm (symmetric), the dynamics of the Colonel Blotto game played by human players on Facebook (symmetric), and an example of a meta-game in Leduc Poker (asymmetric), generated by the PSRO multi-agent learning algorithm. Comment: will appear at AAMAS'1

    On Similarities between Inference in Game Theory and Machine Learning

    In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play), we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore, we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms. We first show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory. By analogy with fictitious play, we then suggest an improved update rule, and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).
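
The abstract above contrasts its moderated variant with standard fictitious play, in which each player best-responds to the empirical average of the opponent's past actions. The following is a minimal sketch of that standard baseline only, not of the paper's moderated algorithm. The stag-hunt payoffs, the `prior` pseudo-counts, and the function names are all assumptions chosen for illustration.

```python
def best_response(payoff, belief):
    """Index of the action with the highest expected payoff against `belief`."""
    expected = [sum(p * q for p, q in zip(row, belief)) for row in payoff]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play(payoff, steps=200, prior=(1.0, 1.0)):
    """Standard fictitious play in a symmetric two-player game.

    payoff[i][j] is a player's payoff for playing i against an opponent playing j.
    `prior` holds hypothetical pseudo-counts of the opponent's past actions.
    Returns each player's final empirical belief about the other.
    """
    counts = [list(prior), list(prior)]  # counts[k]: player k's tally of opponent actions
    for _ in range(steps):
        actions = []
        for k in range(2):
            total = sum(counts[k])
            belief = [c / total for c in counts[k]]  # simple empirical average
            actions.append(best_response(payoff, belief))
        counts[0][actions[1]] += 1  # player 0 observes player 1's action
        counts[1][actions[0]] += 1  # player 1 observes player 0's action
    return [[c / sum(cs) for c in cs] for cs in counts]


# Assumed stag-hunt payoffs: action 0 (stag) is payoff-dominant,
# action 1 (hare) is risk-dominant.
STAG_HUNT = [[4, 0], [3, 3]]

uniform_start = fictitious_play(STAG_HUNT)
optimistic_start = fictitious_play(STAG_HUNT, prior=(4.0, 1.0))
```

With a uniform prior the players settle on the risk-dominant action in this toy game, while a sufficiently optimistic prior leads them to the payoff-dominant one, which is the kind of sensitivity the abstract's coordination-game comparison is concerned with.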

    Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

    Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise. Comment: updated version, incorporating conference feedback

    Voting Power and Voting Blocs

    We investigate the applicability of voting power indices, in particular the Penrose index (aka absolute Banzhaf index), in the analysis of voting blocs by means of a hypothetical voting body. We use the power of individual bloc members to study the implications of the formation of blocs and how voting power varies as bloc size varies. This technique of analysis has many real-world applications to legislatures and international bodies. It can be generalised in many ways: the analysis is a priori (assuming formal voting and ignoring actual voting behaviour) but can be made empirical with voting data; it examines the consequences of two blocs but can easily be extended to more.
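
As a rough illustration of the index named above, the Penrose (absolute Banzhaf) index of a voter is the fraction of the other voters' yes/no combinations in which that voter's vote decides the outcome. The sketch below computes it by brute-force enumeration for a weighted voting body. The weights and quota are hypothetical and are not taken from the paper's voting body.

```python
from itertools import product


def penrose_index(weights, quota):
    """Absolute Banzhaf (Penrose) index of each voter in a weighted voting game.

    A motion passes when the total weight of 'yes' votes reaches `quota`.
    Voter i is decisive for a combination of the other voters' votes if the
    motion fails without i and passes with i.
    """
    n = len(weights)
    swings = [0] * n
    for votes in product((0, 1), repeat=n):
        total = sum(w for w, v in zip(weights, votes) if v)
        for i in range(n):
            # Count only combinations where i votes 'no', then test i's swing.
            if votes[i] == 0 and total < quota <= total + weights[i]:
                swings[i] += 1
    return [s / 2 ** (n - 1) for s in swings]


# Hypothetical body: five equal voters under simple majority.
independent = penrose_index([1, 1, 1, 1, 1], quota=3)

# The same body after two voters merge into a bloc that casts a weight of 2.
with_bloc = penrose_index([2, 1, 1, 1], quota=3)
```

In this toy body each independent voter has index 0.375. Once the bloc forms, it has index 0.75 and each remaining voter drops to 0.25. Enumeration is exponential in the number of voters, so it only suits small examples.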

    Information Causality, the Tsirelson Bound, and the 'Being-Thus' of Things

    The principle of 'information causality' can be used to derive an upper bound, known as the 'Tsirelson bound', on the strength of quantum mechanical correlations, and has been conjectured to be a foundational principle of nature. To date, however, it has not been sufficiently motivated to play such a foundational role. The motivations that have so far been given are, as I argue, either unsatisfactorily vague or appeal to little if anything more than intuition. Thus in this paper I consider whether some way might be found to successfully motivate the principle. And I propose that a compelling way of so doing is to understand it as a generalisation of Einstein's principle of the mutually independent existence (the 'being-thus') of spatially distant things. In particular I first describe an argument, due to Demopoulos, to the effect that the so-called 'no-signalling' condition can be viewed as a generalisation of Einstein's principle that is appropriate for an irreducibly statistical theory such as quantum mechanics. I then argue that a compelling way to motivate information causality is to consider it in turn as a further generalisation of the Einsteinian principle that is appropriate for a theory of communication. I describe, however, some important conceptual obstacles that must yet be overcome if the project of establishing information causality as a foundational principle of nature is to succeed. Comment: '*' footnote added to page 1; 24 pages, 1 figure; Forthcoming in Studies in History and Philosophy of Modern Physics
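
For orientation, the Tsirelson bound mentioned above is usually stated for the CHSH combination of correlations. The notation below (measurement settings $a, a', b, b'$ and correlation function $E$) is the standard textbook one and is an assumption here, since the abstract fixes no notation.

```latex
% CHSH combination of correlations for settings a, a' (one party) and b, b' (the other)
S = E(a, b) + E(a, b') + E(a', b) - E(a', b')

% Bounds on |S| under increasingly permissive assumptions
|S| \le 2          % local hidden-variable theories (CHSH inequality)
|S| \le 2\sqrt{2}  % quantum mechanics (Tsirelson bound)
|S| \le 4          % no-signalling alone (algebraic maximum)
```

The gap between $2\sqrt{2}$ and $4$ is why no-signalling alone does not single out quantum correlations, which is the motivation for looking at a stronger principle such as information causality.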

    Open-ended Learning in Symmetric Zero-sum Games

    Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective: we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives. Comment: ICML 2019, final version
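
To make the transitive versus nontransitive distinction above concrete, a symmetric zero-sum game over a finite set of agents can be written as an antisymmetric matrix whose entry (i, j) is positive when agent i beats agent j. The sketch below is illustrative only and does not implement PSRO_rN. The helper names and the rating-difference toy game are assumptions.

```python
def undefeated_agents(phi):
    """Indices of agents that lose to nobody, i.e. phi[i][j] >= 0 for every j."""
    return [i for i, row in enumerate(phi) if all(x >= 0 for x in row)]


def from_ratings(ratings):
    """A transitive toy game: agent i beats agent j by ratings[i] - ratings[j]."""
    return [[ri - rj for rj in ratings] for ri in ratings]


# Rows and columns: rock, paper, scissors. Entry (i, j) is +1 if i beats j.
ROCK_PAPER_SCISSORS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

cyclic_best = undefeated_agents(ROCK_PAPER_SCISSORS)
transitive_best = undefeated_agents(from_ratings([1.0, 3.0, 2.0]))
```

In the rating-based game a single agent is undefeated, so "get stronger" has an obvious meaning. In rock-paper-scissors the list is empty because every agent loses to some other agent, which is the strategic cycle the abstract refers to.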