Collective states in social systems with interacting learning agents
We consider a social system of interacting heterogeneous agents with learning
abilities, a model close to Random Field Ising Models, where the random field
corresponds to the idiosyncratic willingness to pay. Given a fixed price,
agents decide repeatedly whether to buy or not a unit of a good, so as to
maximize their expected utilities. We show that the equilibrium reached by the
system depends on the nature of the information agents use to estimate their
expected utilities. Comment: 18 pages, 26 figures
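The buy/no-buy dynamics described in this abstract can be sketched as a simple threshold model. This is a minimal illustrative simulation, not the authors' implementation: the uniform distribution of willingness to pay, the coupling strength, and all parameter values are assumptions for demonstration only.

```python
import random

def simulate(n_agents=1000, price=0.5, coupling=0.3, steps=50, seed=0):
    """Toy sketch of a Random-Field-Ising-like buying model: each agent i
    has an idiosyncratic willingness to pay f_i (the 'random field') and
    buys (s_i = 1) when f_i plus a social term proportional to the current
    fraction of buyers exceeds the fixed price. Returns the final fraction
    of buyers. All distributions and parameters here are illustrative."""
    rng = random.Random(seed)
    fields = [rng.random() for _ in range(n_agents)]   # assumed uniform
    state = [0] * n_agents
    for _ in range(steps):
        frac_buyers = sum(state) / n_agents
        state = [1 if f + coupling * frac_buyers > price else 0
                 for f in fields]
    return sum(state) / n_agents
```

Iterating the threshold rule to a fixed point mimics repeated decision-making; which equilibrium is reached depends on the initial state and parameters, echoing the abstract's point about information dependence.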
Dynamical selection of Nash equilibria using Experience Weighted Attraction Learning: emergence of heterogeneous mixed equilibria
We study the distribution of strategies in a large game that models how
agents choose among different double auction markets. We classify the possible
mean field Nash equilibria, which include potentially segregated states where
an agent population can split into subpopulations adopting different
strategies. As the game is aggregative, the actual equilibrium strategy
distributions remain undetermined, however. We therefore compare with the
results of Experience-Weighted Attraction (EWA) learning, which at long times
leads to Nash equilibria in the appropriate limits of large intensity of
choice, low noise (long agent memory) and perfect imputation of missing scores
(fictitious play). The learning dynamics breaks the indeterminacy of the Nash
equilibria. Non-trivially, depending on how the relevant limits are taken, more
than one type of equilibrium can be selected. These include the standard
homogeneous mixed and heterogeneous pure states, but also \emph{heterogeneous
mixed} states where different agents play different strategies that are not all
pure. The analysis of the EWA learning involves Fokker-Planck modeling combined
with large deviation methods. The theoretical results are confirmed by
multi-agent simulations. Comment: 35 pages, 16 figures
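The EWA learning rule referenced in this abstract can be sketched as follows. This is a generic rendering of the standard Camerer–Ho update, not the paper's specific model; the parameter values and function names are illustrative.

```python
import math

def ewa_step(attractions, experience, payoffs, chosen,
             phi=0.9, delta=0.5, rho=0.9):
    """One generic Experience-Weighted Attraction (EWA) update.
    attractions: per-strategy attraction values A_j; experience: scalar N;
    payoffs: the payoff each strategy j would have earned this round;
    chosen: index of the strategy actually played. delta weights forgone
    payoffs (the 'imputation of missing scores' the abstract mentions;
    delta = 1 gives fictitious play); phi and rho are decay factors.
    Parameter values are illustrative."""
    new_N = rho * experience + 1.0
    new_A = []
    for j, (A, pi) in enumerate(zip(attractions, payoffs)):
        weight = 1.0 if j == chosen else delta
        new_A.append((phi * experience * A + weight * pi) / new_N)
    return new_A, new_N

def logit_choice_probs(attractions, beta=5.0):
    """Logit (softmax) choice rule; beta is the intensity of choice.
    The large-beta, low-noise limit recovers best-response behaviour,
    which is where the learning dynamics approaches Nash equilibria."""
    m = max(attractions)  # subtract max for numerical stability
    exps = [math.exp(beta * (a - m)) for a in attractions]
    z = sum(exps)
    return [e / z for e in exps]
```

The limits named in the abstract map onto these parameters: large beta (intensity of choice), memory-related decay (phi, rho), and delta controlling how fully missing scores are imputed.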
Open-ended Learning in Symmetric Zero-sum Games
Zero-sum games such as chess and poker are, abstractly, functions that
evaluate pairs of agents, for example labeling them `winner' and `loser'. If
the game is approximately transitive, then self-play generates sequences of
agents of increasing strength. However, nontransitive games, such as
rock-paper-scissors, can exhibit strategic cycles, and there is no longer a
clear objective -- we want agents to increase in strength, but against whom is
unclear. In this paper, we introduce a geometric framework for formulating
agent objectives in zero-sum games, in order to construct adaptive sequences of
objectives that yield open-ended learning. The framework allows us to reason
about population performance in nontransitive games, and enables the
development of a new algorithm (rectified Nash response, PSRO_rN) that uses
game-theoretic niching to construct diverse populations of effective agents,
producing a stronger set of agents than existing algorithms. We apply PSRO_rN
to two highly nontransitive resource allocation games and find that PSRO_rN
consistently outperforms the existing alternatives. Comment: ICML 2019, final version
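The strategic cycle the abstract uses as its motivating example can be made concrete with the rock-paper-scissors evaluation function. This small sketch (names and representation are illustrative, not from the paper) shows the two properties that break self-play in nontransitive games: the evaluation is antisymmetric, and every strategy loses to some other strategy, so "increase in strength" has no single well-defined opponent.

```python
import itertools

# phi(v, w) > 0 means agent v beats agent w; the zero-sum structure
# makes phi antisymmetric. Table entries cover the winning pairs only.
WINS = {
    ("rock", "scissors"): 1,
    ("scissors", "paper"): 1,
    ("paper", "rock"): 1,
}

def phi(v, w):
    """Evaluation function for rock-paper-scissors."""
    if v == w:
        return 0
    return WINS.get((v, w), -WINS.get((w, v), 0))

strategies = ["rock", "paper", "scissors"]

# Antisymmetry: phi(v, w) == -phi(w, v) for every pair.
assert all(phi(v, w) == -phi(w, v)
           for v, w in itertools.product(strategies, repeat=2))

# Nontransitivity: every strategy is beaten by some other strategy,
# so no sequence of "stronger" agents can terminate at a clear winner.
assert all(any(phi(v, w) < 0 for w in strategies) for v in strategies)
```

PSRO_rN's niching idea responds to exactly this structure by maintaining a diverse population rather than a single "strongest" agent.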
Evolutionary Tournament-Based Comparison of Learning and Non-Learning Algorithms for Iterated Games
Evolutionary tournaments have been used effectively as a tool for comparing game-playing algorithms. For instance, in the late 1970s, Axelrod organized tournaments to compare algorithms for playing the iterated prisoner's dilemma (PD) game. These tournaments capture the dynamics in a population of agents that periodically adopt relatively successful algorithms from the environment. While these tournaments have given us a better understanding of the relative merits of algorithms for the iterated PD, our understanding is less clear for algorithms playing iterated versions of arbitrary single-stage games in an environment of heterogeneous agents. While the Nash equilibrium solution concept recommends that rational players use equilibrium strategies in general-sum games, learning algorithms like fictitious play may be preferable against sub-rational players. In this paper, we study the relative performance of learning and non-learning algorithms in an evolutionary tournament where agents periodically adopt relatively successful algorithms in the population. The tournament is played over a testbed composed of all structurally distinct 2x2 conflicted games with ordinal payoffs: a baseline, neutral testbed for comparing algorithms. Before analyzing results from the evolutionary tournament, we discuss the testbed, our choice of representative learning and non-learning algorithms, and the relative rankings of these algorithms in a round-robin competition. The results from the tournament highlight the advantage of learning algorithms over players using static equilibrium strategies in repeated plays of arbitrary single-stage games. These results are likely to be more useful than static analyses of equilibrium strategies when choosing decision procedures for open, adapting agent societies consisting of a variety of competitors.
Repeated Games, Evolution, Simulation
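The round-robin stage described in this abstract can be sketched for the iterated PD. This is a minimal illustration in the spirit of Axelrod's tournaments, not the paper's testbed: the two strategies and the standard PD payoffs (T=5, R=3, P=1, S=0) are chosen for demonstration.

```python
# Payoffs for (my_move, opponent_move); "C" cooperate, "D" defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(my_hist, opp_hist):
    """A static (non-learning) strategy: the one-shot PD equilibrium."""
    return "D"

def tit_for_tat(my_hist, opp_hist):
    """A reactive strategy: cooperate first, then copy the opponent."""
    return opp_hist[-1] if opp_hist else "C"

def play(strat_a, strat_b, rounds=100):
    """Play one iterated match and return both cumulative scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

def round_robin(strategies, rounds=100):
    """Every strategy plays every other once; return total scores."""
    totals = {s.__name__: 0 for s in strategies}
    for i, a in enumerate(strategies):
        for b in strategies[i + 1:]:
            sa, sb = play(a, b, rounds)
            totals[a.__name__] += sa
            totals[b.__name__] += sb
    return totals
```

An evolutionary tournament of the kind the abstract studies would wrap `round_robin` in an outer loop where low-scoring agents switch to the algorithms of high-scoring ones.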