Payoff Performance of Fictitious Play
We investigate how well continuous-time fictitious play in two-player games
performs in terms of average payoff, particularly compared to Nash equilibrium
payoff. We show that in many games, fictitious play outperforms Nash
equilibrium on average or even at all times, and moreover that any game is
linearly equivalent to one in which this is the case. Conversely, we provide
conditions under which Nash equilibrium payoff dominates fictitious play
payoff. A key step in our analysis is to show that fictitious play dynamics
asymptotically converges to the set of coarse correlated equilibria (a fact which
is implicit in the literature). Comment: 16 pages, 4 figures
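The payoff comparison above can be made concrete with a simple simulation. The sketch below is an illustrative discrete-time analogue (the paper studies continuous-time fictitious play), run on Matching Pennies, a hypothetical example chosen only because its Nash value for the row player is 0: each player best-responds to the opponent's empirical frequencies, and we track the row player's realized average payoff alongside the empirical frequencies, which converge to Nash in zero-sum games by Robinson's theorem.

```python
import numpy as np

# Matching Pennies (hypothetical example): row player's payoffs A, zero-sum so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

def fictitious_play(A, B, T=20000):
    n, m = A.shape
    counts_row = np.ones(n)   # unit prior counts give uniform initial beliefs
    counts_col = np.ones(m)
    total_payoff = 0.0
    for t in range(T):
        # each player best-responds to the opponent's empirical frequencies
        i = int(np.argmax(A @ (counts_col / counts_col.sum())))
        j = int(np.argmax((counts_row / counts_row.sum()) @ B))
        total_payoff += A[i, j]   # row player's realized payoff this round
        counts_row[i] += 1
        counts_col[j] += 1
    freqs_row = counts_row / counts_row.sum()
    freqs_col = counts_col / counts_col.sum()
    return freqs_row, freqs_col, total_payoff / T

fr, fc, avg = fictitious_play(A, B)
# Empirical frequencies approach the mixed Nash equilibrium (1/2, 1/2);
# `avg` can then be compared against the Nash value (0 for the row player).
```

Comparing `avg` to the Nash value in different games is the kind of payoff comparison the abstract describes, though the paper's continuous-time results do not follow from any single simulation.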
Topics arising from fictitious play dynamics
In this thesis, we present a few different topics arising in the study of the learning dynamics
called fictitious play. We investigate the combinatorial properties of this dynamical system
describing the strategy sequences of the players, and in particular deduce a combinatorial
classification of zero-sum games with three strategies per player. We further obtain results
about the limit sets and asymptotic payoff performance of fictitious play as a learning
algorithm.
In order to study coexistence of regular (periodic and quasi-periodic) and chaotic
behaviour in fictitious play and a related continuous, piecewise affine flow on the three-sphere,
we look at its planar first return maps and investigate several model problems for
such maps. We prove a non-recurrence result for non-self maps of regions in the plane,
similar to Brouwer’s classical result for planar homeomorphisms. Finally, we consider a
family of piecewise affine maps of the square, which is very similar to the first return maps
of fictitious play, but simple enough for explicit calculations, and prove several results about
its dynamics, particularly its invariant circles and regions.
On the Convergence of Fictitious Play: A Decomposition Approach
Fictitious play (FP) is one of the most fundamental game-theoretic learning frameworks for computing Nash equilibrium in n-player games, and it builds the foundation for modern multi-agent learning algorithms. Although FP has provable convergence guarantees on zero-sum games and potential games, many real-world problems are often a mixture of both, and the convergence properties of FP in such mixtures have not been fully studied yet. In this paper, we extend the convergence results of FP to combinations of such games and beyond. Specifically, we derive new conditions for FP to converge by leveraging game decomposition techniques. We further develop a linear relationship unifying cooperation and competition, in the sense that these two classes of games are mutually transferable. Finally, we analyze a non-convergent example of FP, the Shapley game, and develop sufficient conditions for FP to converge.
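One elementary instance of the decomposition idea mentioned above: any bimatrix game (A, B) splits into an identical-interest part and a zero-sum part. The sketch below shows this simple split on a hypothetical 2x3 game; the paper's own decomposition techniques may be more refined, so this is only a minimal illustration of the concept.

```python
import numpy as np

# Hypothetical 2x3 bimatrix game (A: row player's payoffs, B: column player's).
A = np.array([[3.0, 0.0, 2.0],
              [1.0, 4.0, 0.0]])
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 5.0]])

# Split into an identical-interest part S and a zero-sum part D:
#   A = S + D,  B = S - D
S = (A + B) / 2.0   # (S, S) is a common-payoff (potential) game
D = (A - B) / 2.0   # (D, -D) is a zero-sum game

assert np.allclose(A, S + D) and np.allclose(B, S - D)
```

A game where D dominates behaves like a competitive (zero-sum) game, while one where S dominates behaves like a cooperative (potential) game; real problems typically mix both parts, which is the regime the paper addresses.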
Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics
Achieving convergence of multiple learning agents in general n-player games
is imperative for the development of safe and reliable machine learning (ML)
algorithms and their application to autonomous systems. Yet it is known that,
outside the bounds of simple two-player games, convergence cannot be taken for
granted.
To make progress in resolving this problem, we study the dynamics of smooth
Q-Learning, a popular reinforcement learning algorithm which quantifies the
tendency for learning agents to explore their state space or exploit their
payoffs. We show a sufficient condition on the rate of exploration such that
the Q-Learning dynamics is guaranteed to converge to a unique equilibrium in
any game. We connect this result to games for which Q-Learning is known to
converge with arbitrary exploration rates, including weighted potential games
and weighted zero-sum polymatrix games.
Finally, we examine the performance of the Q-Learning dynamic as measured by
the Time-Averaged Social Welfare, and compare this with the Social Welfare
achieved by the equilibrium. We provide a sufficient condition whereby the
Q-Learning dynamic will outperform the equilibrium even if the dynamics do not
converge. Comment: Accepted in AAMAS 202
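The smooth Q-learning dynamics described above can be sketched with a forward-Euler integration. Below is a minimal illustration, assuming the standard Boltzmann Q-learning dynamics (replicator term plus an entropy-regularization term scaled by the exploration rate T) on Matching Pennies, a hypothetical zero-sum example for which the dynamics settle at the unique near-uniform rest point; the game, initial conditions, and parameter values are not taken from the paper.

```python
import numpy as np

# Matching Pennies (hypothetical example): row payoffs A, zero-sum so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

def q_learning_dynamics(A, B, T=2.0, dt=0.01, steps=20000):
    x = np.array([0.9, 0.1])   # row player's mixed strategy
    y = np.array([0.2, 0.8])   # column player's mixed strategy
    for _ in range(steps):
        u_x = A @ y                           # row player's payoff vector
        u_y = x @ B                           # column player's payoff vector
        # entropy term: ln x_i minus the strategy's average log-probability
        ent_x = np.log(x) - x @ np.log(x)
        ent_y = np.log(y) - y @ np.log(y)
        # replicator term minus T times the exploration (entropy) term
        x = x + dt * x * (u_x - x @ u_x - T * ent_x)
        y = y + dt * y * (u_y - y @ u_y - T * ent_y)
        x = np.clip(x, 1e-12, None); x /= x.sum()   # numerical safeguard
        y = np.clip(y, 1e-12, None); y /= y.sum()
    return x, y

x, y = q_learning_dynamics(A, B)
```

The exploration rate T trades off exploitation (the replicator term) against exploration (the entropy term); with a sufficiently high T the interior rest point attracts the dynamics, which is the flavour of sufficient condition the abstract refers to.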
Robustness Properties in Fictitious-Play-Type Algorithms
Fictitious play (FP) is a canonical game-theoretic learning algorithm which has been deployed extensively in decentralized control scenarios. However, standard treatments of FP, and of many other game-theoretic models, assume rather idealistic conditions which rarely hold in realistic control scenarios. This paper considers a broad class of best-response learning algorithms that we refer to as FP-type algorithms. In such an algorithm, given some (possibly limited) information about the history of actions, each individual forecasts the future play and chooses a (myopic) best action given their forecast. We provide a unified analysis of the behavior of FP-type algorithms under an important class of perturbations, thus demonstrating robustness to deviations from the idealistic operating conditions that have been previously assumed. This robustness result is then used to derive convergence results for two control-relevant relaxations of standard game-theoretic applications: distributed (network-based) implementation without full observability and asynchronous deployment (including in continuous time). In each case the results follow as a direct consequence of the main robustness result.
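The forecast-then-best-respond step described above can be sketched directly. The fragment below is a hypothetical illustration, not the paper's construction: each player best-responds to a *perturbed* forecast of the opponent's empirical play, with the perturbation decaying over time in the spirit of the vanishing deviations a robustness result would tolerate. On a zero-sum example (Matching Pennies), the empirical frequencies still approach equilibrium despite the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matching Pennies (hypothetical example): row payoffs A, zero-sum so B = -A.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A

counts_row, counts_col = np.ones(2), np.ones(2)
for t in range(1, 20001):
    # noisy forecasts of the opponents' empirical frequencies;
    # the noise scale 1/t decays, mimicking vanishing perturbations
    f_col = counts_col / counts_col.sum() + rng.normal(0.0, 1.0 / t, 2)
    f_row = counts_row / counts_row.sum() + rng.normal(0.0, 1.0 / t, 2)
    i = int(np.argmax(A @ f_col))   # myopic best response to the forecast
    j = int(np.argmax(f_row @ B))
    counts_row[i] += 1
    counts_col[j] += 1

freq_row = counts_row / counts_row.sum()
freq_col = counts_col / counts_col.sum()
# Despite the perturbed forecasts, frequencies approach (1/2, 1/2).
```

Replacing the noise model, or feeding each player only partial history, gives the distributed and asynchronous variants the abstract mentions; the point of a robustness result is that such variations leave the convergence conclusion intact.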