Payoff Performance of Fictitious Play
We investigate how well continuous-time fictitious play in two-player games
performs in terms of average payoff, particularly compared to Nash equilibrium
payoff. We show that in many games, fictitious play outperforms Nash
equilibrium on average or even at all times, and moreover that any game is
linearly equivalent to one in which this is the case. Conversely, we provide
conditions under which Nash equilibrium payoff dominates fictitious play
payoff. A key step in our analysis is to show that fictitious play dynamics
asymptotically converges to the set of coarse correlated equilibria (a fact which
is implicit in the literature). Comment: 16 pages, 4 figures
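The payoff comparison above can be made concrete with a simple simulation. The sketch below is an illustrative discrete-time analogue (the paper studies continuous-time fictitious play), run on Matching Pennies, a hypothetical example chosen only because its Nash value for the row player is 0: each player best-responds to the opponent's empirical frequencies, and we track the row player's realized average payoff alongside the empirical frequencies, which converge to Nash in zero-sum games by Robinson's theorem.

```python
import numpy as np

# Matching Pennies (hypothetical example): row player's payoffs A, zero-sum so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

def fictitious_play(A, B, T=20000):
    n, m = A.shape
    counts_row = np.ones(n)   # unit prior counts give uniform initial beliefs
    counts_col = np.ones(m)
    total_payoff = 0.0
    for t in range(T):
        # each player best-responds to the opponent's empirical frequencies
        i = int(np.argmax(A @ (counts_col / counts_col.sum())))
        j = int(np.argmax((counts_row / counts_row.sum()) @ B))
        total_payoff += A[i, j]   # row player's realized payoff this round
        counts_row[i] += 1
        counts_col[j] += 1
    freqs_row = counts_row / counts_row.sum()
    freqs_col = counts_col / counts_col.sum()
    return freqs_row, freqs_col, total_payoff / T

fr, fc, avg = fictitious_play(A, B)
# Empirical frequencies approach the mixed Nash equilibrium (1/2, 1/2);
# `avg` can then be compared against the Nash value (0 for the row player).
```

Comparing `avg` to the Nash value in different games is the kind of payoff comparison the abstract describes, though the paper's continuous-time results do not follow from any single simulation.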
Topics arising from fictitious play dynamics
In this thesis, we present a few different topics arising in the study of the learning dynamics
called fictitious play. We investigate the combinatorial properties of this dynamical system
describing the strategy sequences of the players, and in particular deduce a combinatorial
classification of zero-sum games with three strategies per player. We further obtain results
about the limit sets and asymptotic payoff performance of fictitious play as a learning
algorithm.
In order to study coexistence of regular (periodic and quasi-periodic) and chaotic
behaviour in fictitious play and a related continuous, piecewise affine flow on the three-sphere,
we look at its planar first return maps and investigate several model problems for
such maps. We prove a non-recurrence result for non-self maps of regions in the plane,
similar to Brouwer’s classical result for planar homeomorphisms. Finally, we consider a
family of piecewise affine maps of the square, which is very similar to the first return maps
of fictitious play, but simple enough for explicit calculations, and prove several results about
its dynamics, particularly its invariant circles and regions.
On the Convergence of Fictitious Play: A Decomposition Approach
Fictitious play (FP) is one of the most fundamental game-theoretic learning frameworks for computing Nash equilibrium in n-player games, and it builds the foundation for modern multi-agent learning algorithms. Although FP has provable convergence guarantees on zero-sum games and potential games, many real-world problems are often a mixture of both, and the convergence properties of FP in such mixtures have not been fully studied yet. In this paper, we extend the convergence results of FP to combinations of such games and beyond. Specifically, we derive new conditions for FP to converge by leveraging game decomposition techniques. We further develop a linear relationship unifying cooperation and competition, in the sense that these two classes of games are mutually transferable. Finally, we analyze a non-convergent example of FP, the Shapley game, and develop sufficient conditions for FP to converge.
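One elementary instance of the decomposition idea mentioned above: any bimatrix game (A, B) splits into an identical-interest part and a zero-sum part. The sketch below shows this simple split on a hypothetical 2x3 game; the paper's own decomposition techniques may be more refined, so this is only a minimal illustration of the concept.

```python
import numpy as np

# Hypothetical 2x3 bimatrix game (A: row player's payoffs, B: column player's).
A = np.array([[3.0, 0.0, 2.0],
              [1.0, 4.0, 0.0]])
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 5.0]])

# Split into an identical-interest part S and a zero-sum part D:
#   A = S + D,  B = S - D
S = (A + B) / 2.0   # (S, S) is a common-payoff (potential) game
D = (A - B) / 2.0   # (D, -D) is a zero-sum game

assert np.allclose(A, S + D) and np.allclose(B, S - D)
```

A game where D dominates behaves like a competitive (zero-sum) game, while one where S dominates behaves like a cooperative (potential) game; real problems typically mix both parts, which is the regime the paper addresses.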
Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics
Achieving convergence of multiple learning agents in general n-player games
is imperative for the development of safe and reliable machine learning (ML)
algorithms and their application to autonomous systems. Yet it is known that,
outside the bounds of simple two-player games, convergence cannot be taken for
granted.
To make progress in resolving this problem, we study the dynamics of smooth
Q-Learning, a popular reinforcement learning algorithm which quantifies the
tendency for learning agents to explore their state space or exploit their
payoffs. We show a sufficient condition on the rate of exploration such that
the Q-Learning dynamics is guaranteed to converge to a unique equilibrium in
any game. We connect this result to games for which Q-Learning is known to
converge with arbitrary exploration rates, including weighted potential games
and weighted zero-sum polymatrix games.
Finally, we examine the performance of the Q-Learning dynamic as measured by
the Time-Averaged Social Welfare, and compare this with the Social Welfare
achieved by the equilibrium. We provide a sufficient condition whereby the
Q-Learning dynamic will outperform the equilibrium even if the dynamics do not
converge. Comment: Accepted in AAMAS 202
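The smooth Q-learning dynamics described above can be sketched with a forward-Euler integration. Below is a minimal illustration, assuming the standard Boltzmann Q-learning dynamics (replicator term plus an entropy-regularization term scaled by the exploration rate T) on Matching Pennies, a hypothetical zero-sum example for which the dynamics settle at the unique near-uniform rest point; the game, initial conditions, and parameter values are not taken from the paper.

```python
import numpy as np

# Matching Pennies (hypothetical example): row payoffs A, zero-sum so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

def q_learning_dynamics(A, B, T=2.0, dt=0.01, steps=20000):
    x = np.array([0.9, 0.1])   # row player's mixed strategy
    y = np.array([0.2, 0.8])   # column player's mixed strategy
    for _ in range(steps):
        u_x = A @ y                           # row player's payoff vector
        u_y = x @ B                           # column player's payoff vector
        # entropy term: ln x_i minus the strategy's average log-probability
        ent_x = np.log(x) - x @ np.log(x)
        ent_y = np.log(y) - y @ np.log(y)
        # replicator term minus T times the exploration (entropy) term
        x = x + dt * x * (u_x - x @ u_x - T * ent_x)
        y = y + dt * y * (u_y - y @ u_y - T * ent_y)
        x = np.clip(x, 1e-12, None); x /= x.sum()   # numerical safeguard
        y = np.clip(y, 1e-12, None); y /= y.sum()
    return x, y

x, y = q_learning_dynamics(A, B)
```

The exploration rate T trades off exploitation (the replicator term) against exploration (the entropy term); with a sufficiently high T the interior rest point attracts the dynamics, which is the flavour of sufficient condition the abstract refers to.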
Robustness Properties in Fictitious-Play-Type Algorithms
Fictitious play (FP) is a canonical game-theoretic learning algorithm which has been deployed extensively in decentralized control scenarios. However, standard treatments of FP, and of many other game-theoretic models, assume rather idealistic conditions which rarely hold in realistic control scenarios. This paper considers a broad class of best-response learning algorithms that we refer to as FP-type algorithms. In such an algorithm, given some (possibly limited) information about the history of actions, each individual forecasts the future play and chooses a (myopic) best action given their forecast. We provide a unified analysis of the behavior of FP-type algorithms under an important class of perturbations, thus demonstrating robustness to deviations from the idealistic operating conditions that have been previously assumed. This robustness result is then used to derive convergence results for two control-relevant relaxations of standard game-theoretic applications: distributed (network-based) implementation without full observability and asynchronous deployment (including in continuous time). In each case the results follow as a direct consequence of the main robustness result.
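The forecast-then-best-respond step described above can be sketched directly. The fragment below is a hypothetical illustration, not the paper's construction: each player best-responds to a *perturbed* forecast of the opponent's empirical play, with the perturbation decaying over time in the spirit of the vanishing deviations a robustness result would tolerate. On a zero-sum example (Matching Pennies), the empirical frequencies still approach equilibrium despite the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matching Pennies (hypothetical example): row payoffs A, zero-sum so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

counts_row, counts_col = np.ones(2), np.ones(2)
for t in range(1, 20001):
    # noisy forecasts of the opponents' empirical frequencies;
    # the noise scale 1/t decays, mimicking vanishing perturbations
    f_col = counts_col / counts_col.sum() + rng.normal(0.0, 1.0 / t, 2)
    f_row = counts_row / counts_row.sum() + rng.normal(0.0, 1.0 / t, 2)
    i = int(np.argmax(A @ f_col))   # myopic best response to the forecast
    j = int(np.argmax(f_row @ B))
    counts_row[i] += 1
    counts_col[j] += 1

freq_row = counts_row / counts_row.sum()
freq_col = counts_col / counts_col.sum()
# Despite the perturbed forecasts, frequencies approach (1/2, 1/2).
```

Replacing the noise model, or feeding each player only partial history, gives the distributed and asynchronous variants the abstract mentions; the point of a robustness result is that such variations leave the convergence conclusion intact.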