
    Evolutionary Tournament-Based Comparison of Learning and Non-Learning Algorithms for Iterated Games

    Evolutionary tournaments have been used effectively as a tool for comparing game-playing algorithms. For instance, in the late 1970s, Axelrod organized tournaments to compare algorithms for playing the iterated prisoner's dilemma (PD) game. These tournaments capture the dynamics in a population of agents that periodically adopt relatively successful algorithms in the environment. While these tournaments have provided us with a better understanding of the relative merits of algorithms for iterated PD, our understanding is less clear about algorithms for playing iterated versions of arbitrary single-stage games in an environment of heterogeneous agents. While the Nash equilibrium solution concept has been used to recommend equilibrium strategies for rational players playing general-sum games, learning algorithms like fictitious play may be preferred for playing against sub-rational players. In this paper, we study the relative performance of learning and non-learning algorithms in an evolutionary tournament where agents periodically adopt relatively successful algorithms in the population. The tournament is played over a testbed composed of all possible structurally distinct 2×2 conflicted games with ordinal payoffs: a baseline, neutral testbed for comparing algorithms. Before analyzing results from the evolutionary tournament, we discuss the testbed, our choice of representative learning and non-learning algorithms and relative rankings of these algorithms in a round-robin competition. The results from the tournament highlight the advantage of learning algorithms over players using static equilibrium strategies for repeated plays of arbitrary single-stage games. These results are likely to be more useful than static analysis of equilibrium strategies when choosing decision procedures for an open, adapting agent society consisting of a variety of competitors. Keywords: Repeated Games, Evolution, Simulation
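The two-stage procedure this abstract describes (a round-robin competition followed by an evolutionary tournament in which successful algorithms spread) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the game is the iterated PD with payoffs T=5, R=3, P=1, S=0, the three fixed strategies stand in for the paper's learning and non-learning algorithms, and the population update is a simple fitness-proportional rule.

```python
from itertools import product

# Assumed PD payoffs for the first player: (my move, opponent move) -> payoff.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def always_cooperate(my_hist, opp_hist):
    return "C"


def always_defect(my_hist, opp_hist):
    return "D"


def tit_for_tat(my_hist, opp_hist):
    # Cooperate first, then copy the opponent's previous move.
    return opp_hist[-1] if opp_hist else "C"


def play_match(strat_a, strat_b, rounds=50):
    """Play one iterated match and return the average per-round payoffs."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a / rounds, score_b / rounds


def round_robin(strategies, rounds=50):
    """Average payoff of every strategy against every strategy (including itself)."""
    return {
        (i, j): play_match(strategies[i], strategies[j], rounds)[0]
        for i, j in product(strategies, repeat=2)
    }


def evolve(table, shares, generations=100):
    """Fitness-proportional update: a strategy's share grows with its relative payoff."""
    names = list(shares)
    for _ in range(generations):
        fitness = {i: sum(table[(i, j)] * shares[j] for j in names) for i in names}
        mean = sum(fitness[i] * shares[i] for i in names)
        shares = {i: shares[i] * fitness[i] / mean for i in names}
    return shares


STRATEGIES = {"ALLC": always_cooperate, "ALLD": always_defect, "TFT": tit_for_tat}
table = round_robin(STRATEGIES)
final_shares = evolve(table, {name: 1 / 3 for name in STRATEGIES})
```

Printing `final_shares` shows how the population composition shifts over generations. Swapping in a learning rule such as fictitious play would only require another function with the same `(my_hist, opp_hist)` signature.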

    Evolutionary instability of Zero Determinant strategies demonstrates that winning isn't everything

    Zero Determinant (ZD) strategies are a new class of probabilistic and conditional strategies that are able to unilaterally set the expected payoff of an opponent in iterated plays of the Prisoner's Dilemma irrespective of the opponent's strategy, or else to set the ratio between a ZD player's and their opponent's expected payoff. Here we show that while ZD strategies are weakly dominant, they are not evolutionarily stable and will instead evolve into less coercive strategies. We show that ZD strategies with an informational advantage over other players that allows them to recognize other ZD strategies can be evolutionarily stable (and able to exploit other players). However, such an advantage is bound to be short-lived as opposing strategies evolve to counteract the recognition. Comment: 14 pages, 4 figures. Change in title (again!) to comply with Nature Communications requirements. To appear in Nature Communications
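The claim that a ZD player can unilaterally set the opponent's expected payoff can be checked numerically. The sketch below is an illustration under assumptions that do not come from the abstract: payoffs R=3, S=0, T=5, P=1, memory-one strategies written as cooperation probabilities after the outcomes (CC, CD, DC, DD) from the player's own point of view, and the strategy (2/3, 0, 2/3, 1/3), which under these payoffs satisfies the equalizer condition and should pin the opponent's long-run payoff at 2. The helper names are hypothetical.

```python
# Per-state payoffs, with states ordered (CC, CD, DC, DD) from X's point of view.
# Assumed values: R=3, S=0, T=5, P=1.
PAYOFF_X = [3.0, 0.0, 5.0, 1.0]
PAYOFF_Y = [3.0, 5.0, 0.0, 1.0]

# Assumed equalizer ZD strategy for X under the payoffs above.
EQUALIZER = [2 / 3, 0.0, 2 / 3, 1 / 3]


def transition_matrix(p, q):
    """Markov chain over joint outcomes when X plays p and Y plays q."""
    # Y sees each state with the roles swapped: X's CD is Y's DC and vice versa.
    q_view = [q[0], q[2], q[1], q[3]]
    return [
        [px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
        for px, qy in zip(p, q_view)
    ]


def stationary(matrix, steps=5000):
    """Approximate the stationary distribution by lazy power iteration."""
    v = [0.25] * 4
    for _ in range(steps):
        moved = [sum(v[i] * matrix[i][j] for i in range(4)) for j in range(4)]
        # Averaging with the previous vector avoids oscillation on periodic chains.
        v = [0.5 * a + 0.5 * b for a, b in zip(v, moved)]
    return v


def long_run_payoffs(p, q):
    """Expected per-round payoffs (X, Y) in the long run."""
    v = stationary(transition_matrix(p, q))
    payoff_x = sum(a * b for a, b in zip(v, PAYOFF_X))
    payoff_y = sum(a * b for a, b in zip(v, PAYOFF_Y))
    return payoff_x, payoff_y


# Opponents: always cooperate, always defect, coin flip, and an arbitrary mix.
OPPONENTS = [[1, 1, 1, 1], [0, 0, 0, 0], [0.5] * 4, [0.9, 0.1, 0.7, 0.3]]
opponent_payoffs = [long_run_payoffs(EQUALIZER, q)[1] for q in OPPONENTS]
```

Under these assumptions every entry of `opponent_payoffs` should come out at approximately 2, whatever the opponent does, while X's own payoff still varies with the opponent. The sketch says nothing about evolutionary stability, which is the paper's actual subject.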

    Evolutionary games on graphs

    Game theory is one of the key paradigms behind many scientific disciplines from biology to behavioral sciences to economics. In its evolutionary form and especially when the interacting agents are linked in a specific social network the underlying solution concepts and methods are very similar to those applied in non-equilibrium statistical physics. This review gives a tutorial-type overview of the field for physicists. The first three sections introduce the necessary background in classical and evolutionary game theory from the basic definitions to the most important results. The fourth section surveys the topological complications implied by non-mean-field-type social network structures in general. The last three sections discuss in detail the dynamic behavior of three prominent classes of models: the Prisoner's Dilemma, the Rock-Scissors-Paper game, and Competing Associations. The major theme of the review is in what sense and how the graph structure of interactions can modify and enrich the picture of long term behavioral patterns emerging in evolutionary games. Comment: Review, final version, 133 pages, 65 figures

    Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics

    A continuous time model for multiagent systems governed by reinforcement learning with scale-free memory is developed. The agents are assumed to act independently of one another in optimizing their choice of possible actions via trial-and-error search. To gain awareness about the action value the agents accumulate in their memory the rewards obtained from taking a specific action at each moment of time. The contribution of past rewards to the agent's current perception of action value is described by an integral operator with a power-law kernel. Finally a fractional differential equation governing the system dynamics is obtained. The agents are considered to interact with one another implicitly via the reward of one agent depending on the choice of the other agents. The pairwise interaction model is adopted to describe this effect. As specific examples of systems with non-transitive interactions, two-agent and three-agent systems of the rock-paper-scissors type are analyzed in detail, including the stability analysis and numerical simulation. Scale-free memory is demonstrated to cause complex dynamics of the systems at hand. In particular, it is shown that there can be simultaneously two modes of the system instability undergoing subcritical and supercritical bifurcation, with the latter one exhibiting anomalous oscillations with the amplitude and period growing with time. Besides, the instability onset via this supercritical mode may be regarded as "altruism self-organization". For the three-agent system the instability dynamics is found to be rather irregular and can be composed of alternate fragments of oscillations different in their properties. Comment: 17 pages, 7 figures

    Stability and Diversity in Collective Adaptation

    We derive a class of macroscopic differential equations that describe collective adaptation, starting from a discrete-time stochastic microscopic model. The behavior of each agent is a dynamic balance between adaptation that locally achieves the best action and memory loss that leads to randomized behavior. We show that, although individual agents interact with their environment and other agents in a purely self-interested way, macroscopic behavior can be interpreted as game dynamics. Application to several familiar, explicit game interactions shows that the adaptation dynamics exhibits a diversity of collective behaviors. The simplicity of the assumptions underlying the macroscopic equations suggests that these behaviors should be expected broadly in collective adaptation. We also analyze the adaptation dynamics from an information-theoretic viewpoint and discuss self-organization induced by information flux between agents, giving a novel view of collective adaptation. Comment: 22 pages, 23 figures; updated references, corrected typos, changed content

    Multiple-Brain connectivity during third party punishment: an EEG hyperscanning study

    Compassion is a particular form of empathic reaction to harm that befalls others and is accompanied by a desire to alleviate their suffering. This altruistic behavior is often manifested through altruistic punishment, wherein individuals penalize a deprecated human's actions, even if they are directed toward strangers. By adopting a dual approach, we provide empirical evidence that compassion is a multifaceted prosocial behavior and can predict altruistic punishment. In particular, in this multiple-brain connectivity study in an EEG hyperscanning setting, compassion was examined during real-time social interactions in a third-party punishment (TPP) experiment. We observed that specific connectivity patterns were linked to behavioral and psychological intra- and interpersonal factors. Thus, our results suggest that an ecological approach based on simultaneous dual-scanning and multiple-brain connectivity is suitable for analyzing complex social phenomena.

    Pricing routines and industrial dynamics

    We propose an evolutionary model in which boundedly rational firms compete and learn in a dynamic oligopoly with imperfect information and evolving degrees of market power. Firms in the model set prices according to routines, and try to make profits by capturing market share. The model can be extended to deal with heterogeneous costs and technological advance. The demand side of the market is composed of boundedly rational consumers who are capable of adapting to changing market options. Supply-demand interactions can be represented through a population dynamics model from which prices and market structures emerge. We obtain closed-form and simulation results which we interpret and compare with benchmark results from a standard non-cooperative game (Bertrand). The comparison yields a surprising result. Whereas in the fully rational Bertrand setting, firms either lower prices and erode their extra profits, or try to cooperate in a collusive equilibrium that is detrimental for consumer welfare, in the evolutionary setting firms make substantial profits, compete by adjusting prices, and the dynamics improve consumer welfare. From these results we claim that, instead of treating market power, externalities, and asymmetric information as market failures, we should consider them as essential traits of market competition. We argue that neo-Schumpeterian models incorporate all of these features together, thus leading towards a more realistic price theory for market economies.