Evolutionary Tournament-Based Comparison of Learning and Non-Learning Algorithms for Iterated Games
Evolutionary tournaments have been used effectively as a tool for comparing game-playing algorithms. For instance, in the late 1970s, Axelrod organized tournaments to compare algorithms for playing the iterated prisoner's dilemma (PD) game. These tournaments capture the dynamics in a population of agents that periodically adopt relatively successful algorithms in the environment. While these tournaments have provided us with a better understanding of the relative merits of algorithms for iterated PD, our understanding is less clear about algorithms for playing iterated versions of arbitrary single-stage games in an environment of heterogeneous agents. While the Nash equilibrium solution concept recommends equilibrium strategies for rational players in general-sum games, learning algorithms like fictitious play may be preferred for playing against sub-rational players. In this paper, we study the relative performance of learning and non-learning algorithms in an evolutionary tournament where agents periodically adopt relatively successful algorithms in the population. The tournament is played over a testbed composed of all possible structurally distinct 2×2 conflicted games with ordinal payoffs: a baseline, neutral testbed for comparing algorithms. Before analyzing results from the evolutionary tournament, we discuss the testbed, our choice of representative learning and non-learning algorithms, and the relative rankings of these algorithms in a round-robin competition. The results from the tournament highlight the advantage of learning algorithms over players using static equilibrium strategies for repeated plays of arbitrary single-stage games. The results are likely to be of more benefit than static analyses of equilibrium strategies when choosing decision procedures for an open, adapting agent society consisting of a variety of competitors.
Keywords: Repeated Games, Evolution, Simulation
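The round-robin-plus-adoption dynamic the abstract describes can be sketched in a few lines. The sketch below is an assumed minimal setup (the iterated PD only, three textbook strategies, and a crude "everyone copies the current best scorer" adoption rule), not the paper's testbed of all 2×2 conflicted ordinal games:

```python
import itertools

# Iterated PD payoffs: (my payoff, opponent's payoff).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

# Each strategy maps the opponent's move history to the next move.
def all_c(opp): return 'C'
def all_d(opp): return 'D'
def tit_for_tat(opp): return opp[-1] if opp else 'C'

def iterated_score(s1, s2, rounds=50):
    h1, h2, sc1, sc2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h2), s2(h1)   # each player sees the opponent's history
        a, b = PAYOFF[(m1, m2)]
        sc1, sc2 = sc1 + a, sc2 + b
        h1.append(m1); h2.append(m2)
    return sc1, sc2

def tournament_step(pop):
    totals = [0] * len(pop)       # round-robin scores
    for i, j in itertools.combinations(range(len(pop)), 2):
        a, b = iterated_score(pop[i], pop[j])
        totals[i] += a; totals[j] += b
    best = pop[max(range(len(pop)), key=totals.__getitem__)]
    return [best] * len(pop)      # crude adoption rule: all copy the winner

pop = [all_d] * 5 + [tit_for_tat] * 5 + [all_c] * 5
for _ in range(3):
    pop = tournament_step(pop)
print(pop[0].__name__)   # → all_d
```

With only five reciprocators in this mix, the unconditional defectors exploit the five unconditional cooperators heavily enough to win the round robin, so the adoption step converges on defection; richer adoption rules (proportional imitation, mutation) change this picture.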
Evolutionary instability of Zero Determinant strategies demonstrates that winning isn't everything
Zero Determinant (ZD) strategies are a new class of probabilistic and
conditional strategies that are able to unilaterally set the expected payoff of
an opponent in iterated plays of the Prisoner's Dilemma irrespective of the
opponent's strategy, or else to set the ratio between a ZD player's and their
opponent's expected payoff. Here we show that while ZD strategies are weakly
dominant, they are not evolutionarily stable and will instead evolve into less
coercive strategies. We show that ZD strategies with an informational advantage
over other players that allows them to recognize other ZD strategies can be
evolutionarily stable (and able to exploit other players). However, such an
advantage is bound to be short-lived as opposing strategies evolve to
counteract the recognition.
Comment: 14 pages, 4 figures. Change in title (again!) to comply with Nature Communications requirements. To appear in Nature Communications.
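The "unilaterally set the opponent's payoff" property can be checked numerically. The sketch below (standard iterated-PD payoffs (T, R, P, S) = (5, 3, 1, 0); the particular equalizer vector and the opponent strategies are illustrative choices) computes the stationary distribution of the two-player memory-one Markov chain and shows that the opponent's expected payoff is pinned at 2 no matter which memory-one strategy the opponent plays:

```python
import numpy as np

# States ordered (CC, CD, DC, DD) from player X's perspective; entry i of a
# strategy vector is that player's probability of cooperating after outcome i
# (seen from the player's own perspective).

def transition_matrix(p, q):
    swap = [0, 2, 1, 3]            # Y sees state (x, y) as (y, x)
    M = np.zeros((4, 4))
    for s in range(4):
        px, py = p[s], q[swap[s]]  # next-round cooperation probabilities
        M[s] = [px * py, px * (1 - py), (1 - px) * py, (1 - px) * (1 - py)]
    return M

def stationary(M, iters=5000):
    v = np.full(4, 0.25)           # power iteration on an ergodic chain
    for _ in range(iters):
        v = v @ M
    return v

# Equalizer ZD strategy: p_tilde = alpha * S_Y + gamma * 1 with alpha = -1/5,
# gamma = 2/5 pins Y's payoff at -gamma/alpha = 2.
p_zd = np.array([0.8, 0.4, 0.4, 0.2])
S_Y = np.array([3.0, 5.0, 0.0, 1.0])   # Y's payoff in states (CC, CD, DC, DD)

vals = []
for q in ([0.9, 0.5, 0.3, 0.1], [0.7, 0.7, 0.7, 0.7], [0.2, 0.8, 0.4, 0.6]):
    v = stationary(transition_matrix(p_zd, np.array(q)))
    vals.append(float(v @ S_Y))
print([round(x, 6) for x in vals])     # → [2.0, 2.0, 2.0]
```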
Evolutionary games on graphs
Game theory is one of the key paradigms behind many scientific disciplines
from biology to behavioral sciences to economics. In its evolutionary form and
especially when the interacting agents are linked in a specific social network
the underlying solution concepts and methods are very similar to those applied
in non-equilibrium statistical physics. This review gives a tutorial-type
overview of the field for physicists. The first three sections introduce the
necessary background in classical and evolutionary game theory from the basic
definitions to the most important results. The fourth section surveys the
topological complications implied by non-mean-field-type social network
structures in general. The last three sections discuss in detail the dynamic
behavior of three prominent classes of models: the Prisoner's Dilemma, the
Rock-Scissors-Paper game, and Competing Associations. The major theme of the
review is in what sense and how the graph structure of interactions can modify
and enrich the picture of long term behavioral patterns emerging in
evolutionary games.
Comment: Review, final version, 133 pages, 65 figures.
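A tiny example of the review's theme — graph structure modifying evolutionary outcomes — is the lattice Prisoner's Dilemma with unconditional imitation in the style of Nowak and May. The sketch below uses an assumed setup (4-neighbour periodic lattice, weak-PD payoffs R = 1, S = P = 0 and temptation b; grid size, b, and the random seed are arbitrary choices), not a reproduction of any result in the review:

```python
import numpy as np

rng = np.random.default_rng(0)
N, b = 20, 1.6
grid = rng.integers(0, 2, size=(N, N))   # 1 = cooperator, 0 = defector

def step(grid):
    # Payoff of each site against its 4 von Neumann neighbours (periodic).
    pay = np.zeros(grid.shape)
    for a in (0, 1):
        for s in (1, -1):
            n = np.roll(grid, s, axis=a)
            # C earns R=1 per cooperating neighbour; D earns T=b per one.
            pay += np.where(grid == 1, n * 1.0, n * b)
    # Unconditional imitation: copy the best-scoring neighbour (or keep own).
    best_strat, best_pay = grid.copy(), pay.copy()
    for a in (0, 1):
        for s in (1, -1):
            npay = np.roll(pay, s, axis=a)
            nstrat = np.roll(grid, s, axis=a)
            better = npay > best_pay
            best_pay = np.where(better, npay, best_pay)
            best_strat = np.where(better, nstrat, best_strat)
    return best_strat

for _ in range(30):
    grid = step(grid)
print(f"cooperator fraction: {grid.mean():.2f}")
```

Varying b and the neighbourhood shows the central point of the review: on a graph, cooperator clusters can persist at temptation values where mean-field dynamics would drive cooperation extinct.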
Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics
A continuous time model for multiagent systems governed by reinforcement
learning with scale-free memory is developed. The agents are assumed to act
independently of one another in optimizing their choice of possible actions via
trial-and-error search. To estimate the value of an action, the agents
accumulate in memory the rewards obtained from taking that action at each
moment of time. The contribution of past rewards to an agent's current
perception of action value is described by an integral operator with a
power-law kernel. Finally, a fractional differential equation governing
the system dynamics is obtained. The agents are considered to interact with one
another implicitly via the reward of one agent depending on the choice of the
other agents. The pairwise interaction model is adopted to describe this
effect. As a specific example of systems with non-transitive interactions,
two-agent and three-agent systems of the rock-paper-scissors type are analyzed
in detail, including stability analysis and numerical simulation.
Scale-free memory is demonstrated to cause complex dynamics of the systems at
hand. In particular, it is shown that there can be simultaneously two modes of
the system instability undergoing subcritical and supercritical bifurcation,
with the latter one exhibiting anomalous oscillations with the amplitude and
period growing with time. Besides, the instability onset via this supercritical
mode may be regarded as "altruism self-organization". For the three agent
system the instability dynamics is found to be rather irregular and can be
composed of alternating fragments of oscillations differing in their properties.
Comment: 17 pages, 7 figures.
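The power-law memory kernel can be illustrated in discrete time: each agent weights the reward earned from an action t − t′ steps ago by (t − t′)^(−d) and chooses actions by softmax over these values. Everything below (exponent d, the softmax temperature, the horizon) is an illustrative assumption, not the paper's continuous-time fractional model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Rock-paper-scissors payoff (row vs column): +1 win, -1 loss, 0 tie.
payoff = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])
d, beta, T, A = 0.5, 2.0, 500, 3   # illustrative memory exponent, temperature

hist = np.zeros((2, T, A))         # per-agent reward history per action

def action_values(rewards, t):
    # Scale-free memory: reward from t - t' steps ago weighted by (t - t')**-d.
    if t == 0:
        return np.zeros(A)
    ages = (t - np.arange(t)).astype(float)
    return (ages[:, None] ** -d * rewards[:t]).sum(axis=0)

for t in range(T):
    probs = []
    for k in range(2):
        q = action_values(hist[k], t)
        e = np.exp(beta * (q - q.max()))   # softmax choice of action
        probs.append(e / e.sum())
    a = [rng.choice(A, p=pr) for pr in probs]
    # Each agent credits only the action it actually took (trial-and-error).
    hist[0, t, a[0]] = payoff[a[0], a[1]]
    hist[1, t, a[1]] = payoff[a[1], a[0]]

print(probs[0].round(3), probs[1].round(3))
```

Because the kernel decays as a power law rather than exponentially, old rewards never become negligible, which is the mechanism behind the anomalous oscillations the abstract describes.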
Stability and Diversity in Collective Adaptation
We derive a class of macroscopic differential equations that describe
collective adaptation, starting from a discrete-time stochastic microscopic
model. The behavior of each agent is a dynamic balance between adaptation that
locally achieves the best action and memory loss that leads to randomized
behavior. We show that, although individual agents interact with their
environment and other agents in a purely self-interested way, macroscopic
behavior can be interpreted as game dynamics. Application to several familiar,
explicit game interactions shows that the adaptation dynamics exhibits a
diversity of collective behaviors. The simplicity of the assumptions underlying
the macroscopic equations suggests that these behaviors should be expected
broadly in collective adaptation. We also analyze the adaptation dynamics from
an information-theoretic viewpoint and discuss self-organization induced by
information flux between agents, giving a novel view of collective adaptation.
Comment: 22 pages, 23 figures; updated references, corrected typos, changed content.
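Dynamics of this kind can be integrated numerically. The sketch below is a minimal, assumed instance: replicator dynamics for two agents playing matching pennies, plus an entropic memory-loss term of strength alpha that pushes each agent back toward randomized behavior. Without the memory-loss term the mixed equilibrium is neutrally stable and the players cycle; alpha > 0 damps the cycles:

```python
import numpy as np

# Replicator-style adaptation with a memory-loss term (Euler integration).
# The game, alpha, dt, and the initial mixes are illustrative choices.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # matching pennies, agent X's payoffs
B = -A                                      # zero-sum: agent Y's payoffs

def step(x, y, alpha=0.1, dt=0.01):
    fx, fy = A @ y, B.T @ x                # expected payoff of each action
    # Memory loss: relative-entropy term pulling toward the uniform mix.
    mx = -np.log(x) + x @ np.log(x)
    my = -np.log(y) + y @ np.log(y)
    x = x + dt * x * (fx - x @ fx + alpha * mx)
    y = y + dt * y * (fy - y @ fy + alpha * my)
    return x / x.sum(), y / y.sum()

x, y = np.array([0.9, 0.1]), np.array([0.6, 0.4])
for _ in range(20_000):
    x, y = step(x, y)
print(x.round(3), y.round(3))   # both spiral in toward the mixed point (0.5, 0.5)
```

Setting alpha = 0 recovers plain replicator cycling; in richer games the interplay of the two terms produces the diversity of collective behaviors the abstract reports.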
Multiple-Brain connectivity during third party punishment: an EEG hyperscanning study
Compassion is a particular form of empathic reaction to harm that befalls others and is accompanied by a desire to alleviate their suffering. This altruistic behavior is often manifested through altruistic punishment, wherein individuals penalize a transgressor's actions even when the harm is directed toward strangers. By adopting a dual approach, we provide empirical evidence that compassion is a multifaceted prosocial behavior and can predict altruistic punishment. In particular, in this multiple-brain connectivity study in an EEG hyperscanning setting, compassion was examined during real-time social interactions in a third-party punishment (TPP) experiment. We observed that specific connectivity patterns were linked to behavioral and psychological intra- and interpersonal factors. Thus, our results suggest that an ecological approach based on simultaneous dual-scanning and multiple-brain connectivity is suitable for analyzing complex social phenomena.
Pricing routines and industrial dynamics
We propose an evolutionary model in which boundedly rational firms compete and learn in a dynamic oligopoly with imperfect information and evolving degrees of market power. Firms in the model set prices according to routines, and try to make profits by capturing market share. The model can be extended to deal with heterogeneous costs and technological advance. The demand side of the market is composed of boundedly rational consumers who are capable of adapting to changing market options. Supply-demand interactions can be represented through a population dynamics model from which prices and market structures emerge. We obtain closed-form and simulation results, which we interpret and compare with benchmark results from a standard non-cooperative (Bertrand) game. Compared with the Bertrand benchmark, we find a surprising result: whereas fully rational Bertrand firms either lower prices and erode their extra profits, or try to cooperate in a collusive equilibrium that is detrimental to consumer welfare, in the evolutionary setting firms make substantial profits, compete by adjusting prices, and the dynamics improve consumer welfare. From these results we argue that, instead of treating market power, externalities, and asymmetric information as market failures, we should consider them essential traits of market competition. We argue that neo-Schumpeterian models incorporate all of these features together, thus leading toward a more realistic price theory for market economies.
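A toy version of the supply-demand interaction can be simulated directly. All ingredients below (cost-plus markup routines, logit-style consumer reallocation, imitate-the-most-profitable-firm adjustment, and every numeric constant) are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(2)

cost, n_firms, T = 1.0, 4, 2000
markup = rng.uniform(0.1, 1.0, n_firms)   # each firm's routine: cost-plus markup
share = np.full(n_firms, 1 / n_firms)     # initial market shares
beta = 5.0                                 # consumers' price sensitivity

for _ in range(T):
    price = cost * (1 + markup)
    # Boundedly rational consumers gradually reallocate toward cheaper firms.
    attract = np.exp(-beta * price)
    target = attract / attract.sum()
    share += 0.05 * (target - share)
    profit = share * (price - cost)
    # Routine adjustment: drift toward the most profitable firm's markup, plus noise.
    best = np.argmax(profit)
    markup += 0.01 * (markup[best] - markup) + rng.normal(0, 0.002, n_firms)
    markup = markup.clip(0.01, None)

print(price.round(2), share.round(2))
```

Even in this crude sketch, imitation of profitable routines keeps markups bounded away from zero while price competition keeps them from exploding, echoing the abstract's contrast with the Bertrand benchmark.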