58 research outputs found

    Evolutionary Tournament-Based Comparison of Learning and Non-Learning Algorithms for Iterated Games

    Evolutionary tournaments have been used effectively as a tool for comparing game-playing algorithms. For instance, in the late 1970s, Axelrod organized tournaments to compare algorithms for playing the iterated prisoner's dilemma (PD) game. These tournaments capture the dynamics of a population of agents that periodically adopt the relatively successful algorithms in their environment. While these tournaments have given us a better understanding of the relative merits of algorithms for iterated PD, our understanding is less clear for algorithms playing iterated versions of arbitrary single-stage games in an environment of heterogeneous agents. While the Nash equilibrium solution concept recommends that rational players use equilibrium strategies in general-sum games, learning algorithms such as fictitious play may be preferable against sub-rational players. In this paper, we study the relative performance of learning and non-learning algorithms in an evolutionary tournament where agents periodically adopt relatively successful algorithms in the population. The tournament is played over a testbed composed of all structurally distinct 2×2 conflicted games with ordinal payoffs: a baseline, neutral testbed for comparing algorithms. Before analyzing results from the evolutionary tournament, we discuss the testbed, our choice of representative learning and non-learning algorithms, and the relative rankings of these algorithms in a round-robin competition. The results from the tournament highlight the advantage of learning algorithms over players using static equilibrium strategies for repeated plays of arbitrary single-stage games. For choosing decision procedures in an open, adapting agent society consisting of a variety of competitors, these results are likely to be of more benefit than static analyses of equilibrium strategies.
    Keywords: Repeated Games, Evolution, Simulation
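
    As a hedged illustration of the tournament mechanics described above, the sketch below pits a static equilibrium player against fictitious play in a single iterated ordinal Prisoner's Dilemma and updates population shares toward the more successful procedure. The payoff values, match length, and replicator-style update are illustrative assumptions, not the paper's testbed, algorithm set, or results.

```python
# A toy evolutionary tournament: two candidate decision procedures play an
# iterated 2x2 game against each other, and population shares then shift
# toward the more successful procedure via a discrete replicator update.
# The single ordinal PD below stands in for the full testbed of 2x2 games.
import numpy as np

# Row player's payoffs for an ordinal Prisoner's Dilemma (illustrative only).
PAYOFF = np.array([[3, 1],   # action 0 = cooperate
                   [4, 2]])  # action 1 = defect

def static_equilibrium(history_self, history_opp):
    """Always play the stage-game equilibrium action (defect in the PD)."""
    return 1

def fictitious_play(history_self, history_opp):
    """Best-respond to the opponent's empirical action frequencies."""
    if not history_opp:
        return 0
    freq = np.bincount(history_opp, minlength=2) / len(history_opp)
    return int(np.argmax(PAYOFF @ freq))

ALGOS = [static_equilibrium, fictitious_play]

def iterated_match(a, b, rounds=50):
    """Average per-round payoffs when algorithms a and b play each other."""
    ha, hb, pa, pb = [], [], 0.0, 0.0
    for _ in range(rounds):
        xa, xb = ALGOS[a](ha, hb), ALGOS[b](hb, ha)
        pa += PAYOFF[xa, xb]
        pb += PAYOFF[xb, xa]
        ha.append(xa)
        hb.append(xb)
    return pa / rounds, pb / rounds

# Round-robin stage: fitness[i, j] is i's average payoff against j.
fitness = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        fitness[i, j], _ = iterated_match(i, j)

# Evolutionary stage: agents periodically adopt relatively successful
# algorithms (modelled here as a discrete replicator update on shares).
shares = np.array([0.5, 0.5])
for generation in range(30):
    expected = fitness @ shares
    shares = shares * expected / (shares @ expected)
print("population shares (static equilibrium, fictitious play):", shares)
```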

    Being impulsive makes individuals less altruistic: an experiment with zebra finches

    Reciprocal altruism, the most plausible mechanism explaining cooperation among unrelated individuals, can be modelled as a Prisoner's Dilemma. This game predicts that cooperation should evolve whenever players expect to interact repeatedly and adopt conditional strategies such as Tit-For-Tat or Pavlov. Although cooperation underlies every human society, it is rarely observed in animals, perhaps because animals are more impulsive than humans. Several studies examining the effect of impulsiveness on cooperation have indeed found a negative impact of temporal discounting on reciprocity. Impulsiveness is not a unitary concept, however, and the role of impulsive action, another facet of impulsiveness, remains unexplored, even though it could also impede cooperation by limiting the capacity of individuals to flexibly adjust their behaviour to their partner's decisions. Impulsive action is defined as the inability to inhibit a behaviour that is no longer appropriate after a change in circumstances and has become counterproductive (Broos et al., 2012; MacLean et al., 2014). To address this hypothesis, we conducted an experiment with zebra finches (Taeniopygia guttata) that were paired assortatively with respect to their level of impulsive action and then played an Alternating Prisoner's Dilemma. As anticipated, we found that mutual cooperation occurred more frequently between self-controlled partners than between impulsive ones, a difference attributable to the strategies used by the two types of individuals. Specifically, self-controlled individuals used a Generous Tit-For-Tat strategy, as predicted by theory, whereas impulsive birds cooperated with a fixed probability that was independent of their partner's previous decision. If the inability of impulsive individuals to use reactive strategies is due to reduced working memory capacity, our findings may help explain interspecific differences in cooperative behaviour.
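
    The two strategy types reported above can be sketched in a toy simulation of an alternating Prisoner's Dilemma. The payoff values, generosity level, and fixed cooperation probability below are assumptions made for illustration, not the experimental parameters.

```python
# Toy alternating Prisoner's Dilemma comparing the two strategy types above:
# Generous Tit-For-Tat versus cooperating with a fixed probability that
# ignores the partner's last move. Payoffs and probabilities are hypothetical.
import random

R, S, T, P = 3, 0, 5, 1  # reward, sucker, temptation, punishment

def generous_tft(partner_last, generosity=0.3):
    """Cooperate after cooperation; forgive a defection with some probability."""
    if partner_last is None or partner_last == "C":
        return "C"
    return "C" if random.random() < generosity else "D"

def fixed_probability(partner_last, p_coop=0.5):
    """Cooperate with a fixed probability, regardless of the partner's move."""
    return "C" if random.random() < p_coop else "D"

def alternating_pd(strategy_a, strategy_b, rounds=200):
    """Players act in alternation, each responding to the other's latest move."""
    payoff = {"CC": R, "CD": S, "DC": T, "DD": P}
    last_b = None
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(last_b)   # A responds to B's most recent move
        move_b = strategy_b(move_a)   # B then responds to A's move
        score_a += payoff[move_a + move_b]
        score_b += payoff[move_b + move_a]
        last_b = move_b
    return score_a / rounds, score_b / rounds

print("self-controlled pair (GTFT vs GTFT):", alternating_pd(generous_tft, generous_tft))
print("impulsive pair (fixed prob vs fixed prob):", alternating_pd(fixed_probability, fixed_probability))
```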

    Effect of prolonged stress on memory capacity and cooperative behaviour in the zebra finch (Taeniopygia guttata)

    Thesis digitized by the Division de la gestion de documents et des archives de l'Université de Montréal.

    The establishment of durable social bonds promotes cooperation in the iterated Prisoner's Dilemma

    Thesis digitized by the Division de la gestion de documents et des archives de l'Université de Montréal.

    A Common Protocol for Agent-Based Social Simulation

    Traditional (i.e. analytical) modelling practices in the social sciences rely on a well-established, although implicit, methodological protocol, both with respect to the way models are presented and to the kinds of analysis that are performed. Unfortunately, computer-simulated models often lack such a reference to an accepted methodological standard. This is one of the main reasons for the scepticism among mainstream social scientists that results in the low acceptance of papers using agent-based methodology in top journals. We identify some methodological pitfalls that, in our view, are common in papers employing agent-based simulations, and propose appropriate solutions. We discuss each issue with reference to a general characterization of dynamic micro models, which encompasses both analytical and simulation models. Along the way, we also clarify some confusing terminology. We then propose a three-stage process that could lead to the establishment of methodological standards in social and economic simulations.
    Keywords: Agent-Based, Simulations, Methodology, Calibration, Validation, Sensitivity Analysis
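
    As one example of the kind of practice such a protocol might standardize, the sketch below runs a simple sensitivity analysis on an invented toy agent model (toy_abm is a placeholder, not a model from the paper), reporting the mean and spread across replications rather than a single run.

```python
# Sensitivity analysis over a parameter of a throwaway agent-based model:
# several replications per parameter value, reported as mean and spread
# rather than a single run. The toy model is invented for illustration.
import random
import statistics

def toy_abm(imitation_rate, agents=100, steps=50, seed=None):
    """Agents copy a randomly chosen agent's binary state with probability
    `imitation_rate`; returns the final share of adopters."""
    rng = random.Random(seed)
    state = [rng.random() < 0.1 for _ in range(agents)]
    for _ in range(steps):
        for i in range(agents):
            if rng.random() < imitation_rate:
                state[i] = state[rng.randrange(agents)]
    return sum(state) / agents

for rate in (0.1, 0.3, 0.5):
    outcomes = [toy_abm(rate, seed=s) for s in range(20)]
    print(f"imitation_rate={rate}: mean={statistics.mean(outcomes):.2f}, "
          f"stdev={statistics.stdev(outcomes):.2f}")
```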

    Meta-Stability of Interacting Adaptive Agents

    The adaptive process can be considered to be driven by two fundamental forces: exploitation and exploration. While the explorative process may be deterministic, its resultant effect may be stochastic, and stochastic effects may also exist in the exploitative process. This thesis considers the effects of the stochastic fluctuations inherent in the adaptive process on the behavioural dynamics of a population of interacting agents. It is hypothesised that in such systems one or more attractors exist in the population space, and that transitions between these attractors can occur, either as a result of internal shocks (sampling fluctuations) or external shocks (environmental changes). It is further postulated that such transitions in the (microscopic) population space may be observable as phase transitions in the behaviour of macroscopic observables. A simple model of a stock market, driven by asexual reproduction (selection plus mutation), is put forward as a testbed. A statistical dynamics analysis of the behaviour of this market is then developed. Fixed points in the space of agent behaviours are located, and market dynamics are compared to the analytic predictions. Additionally, an analysis of the relative importance of internal shocks (sampling fluctuations) and external shocks (the stock dividend sequence) across varying population sizes is presented.
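
    A schematic sketch of the kind of adaptive process analysed here, assuming a generic selection-plus-mutation update over an invented bimodal fitness landscape; it is not the thesis's market model, only an illustration of how sampling fluctuations in a finite population can move it between attractors.

```python
# Selection-plus-mutation ("asexual reproduction") over a made-up bimodal
# fitness landscape: sampling noise in the finite population can occasionally
# shift the population between the two behavioural attractors at x = -1, +1.
import random

def fitness(x):
    """Bimodal landscape with peaks at x = -1 and x = +1 (invented)."""
    return max(1e-6, 1.5 - (x**2 - 1.0)**2)

def generation(pop, mutation_sd=0.05, rng=random):
    """Sample parents in proportion to fitness, then apply Gaussian mutation."""
    weights = [fitness(x) for x in pop]
    parents = rng.choices(pop, weights=weights, k=len(pop))
    return [x + rng.gauss(0.0, mutation_sd) for x in parents]

pop = [random.gauss(0.0, 0.2) for _ in range(50)]
for t in range(2001):
    pop = generation(pop)
    if t % 500 == 0:
        mean_behaviour = sum(pop) / len(pop)   # a macroscopic observable
        print(f"t={t:4d}  mean behaviour = {mean_behaviour:+.2f}")
```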

    Essays on Bounded Rationality and Strategic Behavior in Experimental and Computational Economics.

    Chapter 1 evaluates coordination among agents in environments with congestion effects. This paper discusses how people implicitly learn to coordinate their actions when such coordination is beneficial but difficult. In a series of experiments involving human subjects and simulated agents, subjects repeatedly updated their strategies during play of the El Farol Bar Game. A subject could partially observe her opponents' previous strategies and payoffs before setting her strategy for the next round of play. Play did not converge to the stage game's pure-strategy Nash equilibrium, and subjects routinely imitated the most successful strategies. This flocking behavior led to socially inefficient outcomes. Economic agents often face situations in which they must simultaneously interact in a variety of strategic environments, yet they have only limited cognitive resources to compete in these varied settings. Chapters 2 through 4 consider how boundedly rational agents allocate scarce cognitive resources in strategic environments characterized by multiple simultaneously played games. Chapter 2 builds a framework that encapsulates a complex adaptive system defined by finite-automaton strategies. Chapter 3 considers the evolution of strategies in the presence of cognitive costs in both single-game and multiple-game settings. When facing costs, a player's strategy population quickly converges to a largely homogeneous pool of rather simplified strategies that utilize only 14 percent of their cognitive power. There is evidence of both positive and negative strategic complementarities in the two-game environments. Strategies perform better in each game within the two-game {Prisoner's Dilemma, Stag Hunt} setting than they do when playing each game individually. Conversely, performance is impaired in each game of the two-game {Stag Hunt, Chicken} environment relative to the single-game settings. Chapter 4 uses the evolved strategies to evaluate the impact of experience in multiple-game environments. Experience in Prisoner's Dilemma translates well into other games in two-game environments, while experience in Stag Hunt handicaps performance in other games. In multiple-game settings, since a strategy's actions are applied in different games, the context of actions is important. Chapters 3 and 4 address this issue by comparing the natural-outcome context to the cooperate/defect context.
    Ph.D., Economics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/86359/1/leady_1.pd
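
    The finite-automaton strategies of Chapters 2 through 4 can be illustrated with a small Moore machine whose actions are reused across different 2x2 games. The machine (Tit-for-Tat) and the payoff tables below are standard textbook choices, not the strategies or payoffs evolved in the dissertation.

```python
# A two-state Moore machine (Tit-for-Tat) whose actions are reused across
# two different 2x2 games, illustrating how the same automaton strategy can
# face multiple simultaneous game contexts. Payoff tables are textbook values.
PRISONERS_DILEMMA = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
STAG_HUNT         = {("C", "C"): 4, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 2}

class Automaton:
    """Moore machine: each state emits an action; transitions read the
    opponent's last action."""
    def __init__(self, actions, transitions, start=0):
        self.actions = actions          # state -> action
        self.transitions = transitions  # (state, opponent action) -> next state
        self.state = start

    def step(self, opp_action=None):
        if opp_action is not None:
            self.state = self.transitions[(self.state, opp_action)]
        return self.actions[self.state]

def tit_for_tat():
    return Automaton(actions=["C", "D"],
                     transitions={(0, "C"): 0, (0, "D"): 1,
                                  (1, "C"): 0, (1, "D"): 1})

def play(game, machine_a, machine_b, rounds=100):
    """Average per-round payoffs when the two automata play a given game."""
    a_act, b_act = machine_a.step(), machine_b.step()
    total_a = total_b = 0
    for _ in range(rounds):
        total_a += game[(a_act, b_act)]
        total_b += game[(b_act, a_act)]
        a_act, b_act = machine_a.step(b_act), machine_b.step(a_act)
    return total_a / rounds, total_b / rounds

# The identical pair of machines is evaluated in two game contexts.
print("Prisoner's Dilemma:", play(PRISONERS_DILEMMA, tit_for_tat(), tit_for_tat()))
print("Stag Hunt         :", play(STAG_HUNT, tit_for_tat(), tit_for_tat()))
```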

    CATGame: A Tool for Problem Solving in Complex Dynamic Systems Using Game Theoretic Knowledge Distribution in Cultural Algorithms, and Its Application (CATNeuro) to the Deep Learning of Game Controller

    Cultural Algorithms (CA) are knowledge-intensive, population-based stochastic optimization methods that are modeled after human cultures and are suited to solving problems in complex environments. The CA Belief Space stores knowledge harvested from prior generations and redistributes it to future generations via a knowledge distribution (KD) mechanism; each individual in the population is then guided through the search space by the associated knowledge. Previous CA implementations have used only competitive KD mechanisms, which have performed well for problems embedded in static environments. More recently, CA research has expanded to dynamic problem environments. Given increasing environmental complexity, a natural question arises: can KD mechanisms that also incorporate cooperation perform better in such environments than purely competitive ones? Borrowing from game theory, game-based KD mechanisms are implemented and tested against the default competitive mechanism, Weighted Majority (WTD). Two different notions of complexity are addressed: numerical optimization under dynamic environments and hierarchical, multi-objective optimization for evolving deep learning models. The former is addressed with the CATGame software system and the latter with CATNeuro. CATGame implements three types of games that span both cooperation and competition for knowledge distribution: Iterated Prisoner's Dilemma (IPD), Stag-Hunt, and Stackelberg. The performance of the three game mechanisms is compared with the aid of a dynamic problem generator called Cones World, with Weighted Majority (aka “wisdom of the crowd”), the default competitive CA KD mechanism, used as the benchmark. It is shown that games that support both cooperation and competition do indeed perform better, though not in all cases. The results shed light on which kinds of games are suited to problem solving in complex, dynamic environments. Specifically, games that balance exploration and exploitation using the local signal of ‘social’ rank, namely Stag-Hunt and IPD, perform better; Stag-Hunt, which is also the most cooperative of the games tested, performed the best overall. Dynamic analysis of the ‘social’ aspects of the CA test runs shows that Stag-Hunt allocates compute resources more consistently than the others in response to changes in environmental complexity, whereas Stackelberg, where allocation decisions are centralized, as in a centrally planned economic system, is found to be the least adaptive.
    CATNeuro addresses neural architecture search (NAS) problems. Contemporary ‘deep learning’ neural network models have proven effective, but the network topologies may be complex and not immediately obvious for the problem at hand, which has given rise to the secondary field of neural architecture search. The field is still nascent, with many frameworks and approaches becoming available. This work describes a NAS method based on the graph evolution pioneered by NEAT (Neuroevolution of Augmenting Topologies) but driven by the evolutionary mechanisms of Cultural Algorithms. Here CATNeuro is applied to find optimal network topologies for playing a 2D fighting game called FightingICE (derived from “The Rumble Fish” video game); a policy-based reinforcement learning method is used to create the training data for network optimization. CATNeuro is still evolving. To inform its development, in this first foray into NAS we contrast the performance of CATNeuro under two different knowledge distribution mechanisms: the stalwart Weighted Majority and a new one based on the Stag-Hunt game from evolutionary game theory, which performed the best in CATGame. The research shows that Stag-Hunt has a distinct edge over WTD in terms of game performance, model accuracy, and model size. It is therefore deemed the preferred mechanism for complex, hierarchical optimization tasks such as NAS and is planned to become the default KD mechanism in CATNeuro going forward.
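
    A rough sketch, under assumptions, of the knowledge-distribution step that the game-based mechanisms replace: a Weighted Majority style rule that hands more of the population to knowledge sources whose followers recently scored better. The five source names are the canonical ones from the Cultural Algorithms literature; the scores and the allocation rule are placeholders rather than the CATGame implementation.

```python
# Weighted Majority style knowledge distribution: individuals are allocated to
# knowledge sources in proportion to the fitness their followers achieved in
# the previous generation. Source names are the standard CA ones; scores and
# the update rule are placeholders, not the CATGame implementation.
import random

KNOWLEDGE_SOURCES = ["situational", "normative", "domain", "history", "topographic"]

def weighted_majority_allocation(recent_scores, population_size):
    """Assign each individual to a knowledge source with probability
    proportional to that source's recent total fitness."""
    total = sum(recent_scores.values())
    weights = [recent_scores[ks] / total for ks in KNOWLEDGE_SOURCES]
    return random.choices(KNOWLEDGE_SOURCES, weights=weights, k=population_size)

# Hypothetical scores harvested from the previous generation.
scores = {"situational": 12.0, "normative": 8.0, "domain": 15.0,
          "history": 5.0, "topographic": 10.0}
print(weighted_majority_allocation(scores, population_size=20))

# A game-based mechanism such as the Stag-Hunt variant favoured above would
# instead derive these shares from the payoffs of a game played between
# knowledge sources, mixing cooperative and competitive incentives.
```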
