704 research outputs found

    Approximating n-player behavioural strategy nash equilibria using coevolution

    Coevolutionary algorithms are plagued by a set of problems related to intransitivity that make it questionable what the end product of a coevolutionary run can achieve. The introduction of solution concepts into coevolution alleviated part of the issue; however, efficiently representing and achieving game-theoretic solution concepts is still not a trivial task. In this paper we propose a coevolutionary algorithm that approximates behavioural strategy Nash equilibria in n-player zero-sum games by exploiting the minimax solution concept. To support our case we provide a set of experiments on games with both known and unknown equilibria. In the case of known equilibria, we confirm that our algorithm converges to the known solution, while in the case of unknown equilibria we see steady progress towards Nash. Copyright 2011 ACM

    Discrete stochastic processes, replicator and Fokker-Planck equations of coevolutionary dynamics in finite and infinite populations

    Finite-size fluctuations in coevolutionary dynamics arise in models of biological as well as of social and economic systems. This brief tutorial review surveys a systematic approach starting from a stochastic process discrete both in time and state. The limit $N \to \infty$ of an infinite population can be considered explicitly, generally leading to a replicator-type equation in zero order, and to a Fokker-Planck-type equation in first order in $1/\sqrt{N}$. Consequences and relations to some previous approaches are outlined. Comment: Banach Center publications, in press
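
As an illustration of the zero-order limit mentioned in this abstract, the following is a minimal sketch of the standard deterministic replicator equation, dx_i/dt = x_i (f_i - mean fitness), integrated with explicit Euler steps. It is not the review's derivation and does not cover the Fokker-Planck correction; the payoff matrix `COORDINATION`, the step size, and the function names are all assumptions chosen for the example.

```python
def replicator_step(x, payoff, dt):
    """One explicit Euler step of dx_i/dt = x_i * (f_i - mean fitness)."""
    n = len(x)
    # Fitness of each strategy against the current population mix
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(x[i] * fitness[i] for i in range(n))
    new_x = [x[i] + dt * x[i] * (fitness[i] - mean_fitness) for i in range(n)]
    total = sum(new_x)  # renormalise to guard against floating-point drift
    return [v / total for v in new_x]


def simulate(x0, payoff, dt=0.01, steps=5000):
    """Iterate the Euler step from the initial frequencies x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Hypothetical 2-strategy coordination game (assumed for illustration only)
COORDINATION = [[2.0, 0.0], [0.0, 1.0]]

from_high = simulate([0.6, 0.4], COORDINATION)  # starts above the interior fixed point x_0 = 1/3
from_low = simulate([0.2, 0.8], COORDINATION)   # starts below it
```

In this toy game the two runs settle on different pure strategies, which shows how the deterministic limit depends on the initial frequencies; a finite population would add fluctuations around these trajectories.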

    Theoretical advantages of lenient learners : an evolutionary game theoretic perspective

    This paper presents the dynamics of multiple learning agents from an evolutionary game theoretic perspective. We provide replicator dynamics models for cooperative coevolutionary algorithms and for traditional multiagent Q-learning, and we extend these differential equations to account for lenient learners: agents that forgive possible mismatched teammate actions that resulted in low rewards. We use these extended formal models to study the convergence guarantees for these algorithms, and also to visualize the basins of attraction to optimal and suboptimal solutions in two benchmark coordination problems. The paper demonstrates that lenience provides learners with more accurate information about the benefits of performing their actions, resulting in higher likelihood of convergence to the globally optimal solution. In addition, the analysis indicates that the choice of learning algorithm has an insignificant impact on the overall performance of multiagent learning algorithms; rather, the performance of these algorithms depends primarily on the level of lenience that the agents exhibit to one another. Finally, the research herein supports the strength and generality of evolutionary game theory as a backbone for multiagent learning.

    A visual demonstration of convergence properties of cooperative coevolution

    We introduce a model for cooperative coevolutionary algorithms (CCEAs) using partial mixing, which allows us to compute the expected long-run convergence of such algorithms when individuals' fitness is based on the maximum payoff of some N evaluations with partners chosen at random from the other population. Using this model, we devise novel visualization mechanisms to attempt to qualitatively explain a difficult-to-conceptualize pathology in CCEAs: the tendency for them to converge to suboptimal Nash equilibria. We further demonstrate visually how increasing the size of N, or biasing the fitness to include an ideal-collaboration factor, both improve the likelihood of optimal convergence, and under which initial population configurations they are not much help.
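
The fitness rule described in this abstract (maximum payoff over N evaluations with randomly chosen partners) can be sketched as below. This is only an illustration under stated assumptions: the payoff table `PAYOFF`, the partner population, and the function names are hypothetical and are not taken from the paper, which works with an analytical model rather than sampled runs.

```python
import random

# Hypothetical 2x2 coordination payoffs: PAYOFF[a][b] is the joint reward
# when one population plays action a and the other plays action b.
PAYOFF = [[10, 0], [0, 7]]


def joint_payoff(a, b):
    """Joint reward for a pair of collaborating actions."""
    return PAYOFF[a][b]


def max_of_n_fitness(individual, partners, n, rng):
    """Fitness = best payoff over n collaborations with randomly drawn partners."""
    return max(joint_payoff(individual, rng.choice(partners)) for _ in range(n))


rng = random.Random(0)
partners = [0, 1, 1, 1, 1]  # assumed partner population that mostly plays action 1

fitness_one = max_of_n_fitness(0, partners, 1, rng)     # a single random collaboration
fitness_many = max_of_n_fitness(0, partners, 200, rng)  # many collaborations
```

With a single evaluation, action 0 is usually paired with a mismatched partner and looks poor even though it belongs to the best joint solution; with many evaluations it is very likely to meet its ideal partner at least once, which is the intuition behind increasing N.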

    Open-ended Learning in Symmetric Zero-sum Games

    Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective -- we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives. Comment: ICML 2019, final version
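
The strategic cycle this abstract attributes to rock-paper-scissors can be checked directly on the game's antisymmetric payoff matrix. The sketch below is an illustration only, not an implementation of PSRO_rN; the matrix layout and helper names are assumptions made for the example.

```python
# Row player's payoff; index order is rock, paper, scissors.
RPS = [
    [0, -1, 1],   # rock: loses to paper, beats scissors
    [1, 0, -1],   # paper: beats rock, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper
]


def beats(i, j):
    """True if pure strategy i wins against pure strategy j."""
    return RPS[i][j] > 0


def has_unbeaten_strategy(payoff):
    """True if some pure strategy never loses to any other (a transitive 'best' agent)."""
    n = len(payoff)
    return any(all(payoff[i][j] >= 0 for j in range(n)) for i in range(n))


def expected_payoff(mix, j, payoff):
    """Expected payoff of mixed strategy `mix` against pure strategy j."""
    return sum(mix[i] * payoff[i][j] for i in range(len(mix)))


cycle = beats(1, 0) and beats(2, 1) and beats(0, 2)  # paper > rock, scissors > paper, rock > scissors
uniform = [1 / 3, 1 / 3, 1 / 3]
uniform_payoffs = [expected_payoff(uniform, j, RPS) for j in range(3)]
```

Because every pure strategy is beaten by another, "stronger" has no single meaning here, while the uniform mixture breaks even against every opponent; this is the kind of situation in which population-level objectives become useful.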

    Autonomous virulence adaptation improves coevolutionary optimization

    Fostering cooperation through dynamic coalition formation and partner switching

    In this article we tackle the problem of maximizing cooperation among self-interested agents in a resource exchange environment. Our main concern is the design of mechanisms that maximize cooperation among self-interested agents so that their profits increase by exchanging or trading resources. Although dynamic coalition formation and partner switching (rewiring) have been shown to promote the emergence and maintenance of cooperation among self-interested agents, no prior work in the literature has investigated whether merging both mechanisms exhibits positive synergies that increase cooperation even further. Therefore, we introduce and analyze a novel dynamic coalition formation mechanism that uses partner switching to help self-interested agents increase their profits in a resource exchange environment. Our experiments show the effectiveness of our mechanism at increasing the agents' profits, as well as the emergence of trading as the preferred behavior over different types of complex networks. © 2014 ACM. The first author thanks the grant Formación de Profesorado Universitario (FPU), reference AP2010-1742. J.Ll.A. and J.A.R-A are partially funded by projects EVE (TIN2009-14702-C02-01), AT (CSD2007-0022), COR (TIN2012-38876-C02-01), MECER (201250E053), and the Generalitat of Catalunya grant 2009-SGR-1434. Peer Reviewed