
    Learning with bounded memory.

    Get PDF
    The paper studies infinite repetition of finite strategic-form games. Players use a learning behavior and face bounds on their cognitive capacities. We show that for any given belief-probability over the set of possible outcomes where players have no experience, games can be payoff-classified and there always exists a stationary state in the space of action profiles. In particular, if the belief-probability assumes all possible outcomes without experience to be equally likely, in one class of Prisoners' Dilemmas, where the average defecting payoff is higher than the cooperative payoff and the average cooperative payoff is lower than the defecting payoff, play converges in the long run to the static Nash equilibrium, while in the other class of Prisoners' Dilemmas, where the reverse holds, play converges to cooperation. Results are applied to a large class of 2×2 games.
    Keywords: Cognitive complexity; Bounded logistic quantal response learning; Long run outcomes
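The two Prisoners' Dilemma classes described above can be sketched as a simple payoff comparison under the uniform prior. The function name, return strings, and structure below are illustrative only, not the paper's notation; payoffs follow the standard T > R > P > S ordering (T = temptation, R = reward, P = punishment, S = sucker's payoff).

```python
def classify_pd(T, R, P, S):
    """Classify a Prisoner's Dilemma by the criterion sketched in the
    abstract: compare the average defecting payoff to the cooperative
    payoff R, and the average cooperative payoff to the defecting payoff P.
    (Illustrative reading of the abstract, not the paper's formal model.)"""
    assert T > R > P > S, "not a Prisoner's Dilemma ordering"
    avg_defect = (T + P) / 2.0  # average payoff when defecting, uniform prior
    avg_coop = (R + S) / 2.0    # average payoff when cooperating, uniform prior
    if avg_defect > R and avg_coop < P:
        return "converges to static Nash (defection)"
    if avg_defect < R and avg_coop > P:
        return "converges to cooperation"
    return "mixed case"
```

For example, a game with a very large temptation payoff falls into the defection class, while one where defection gains little falls into the cooperation class.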


    Inferring to C or not to C: Evolutionary games with Bayesian inferential strategies

    Full text link
    Strategies for sustaining cooperation and preventing exploitation by selfish agents in repeated games have mostly been restricted to Markovian strategies, where the response of an agent depends on the actions in the previous round. Such strategies are characterized by a lack of learning. However, learning from accumulated evidence over time and using the evidence to dynamically update our response is a key feature of living organisms. Bayesian inference provides a framework for such evidence-based learning mechanisms. It is therefore imperative to understand how strategies based on Bayesian learning fare in repeated games with Markovian strategies. Here, we consider a scenario where the Bayesian player uses the accumulated evidence of the opponent's actions over several rounds to continuously update her belief about the reactive opponent's strategy. The Bayesian player can then act on her inferred belief in different ways. By studying repeated Prisoner's Dilemma games with such Bayesian inferential strategies, both in infinite and finite populations, we identify the conditions under which such strategies can be evolutionarily stable. We find that a Bayesian strategy that is less altruistic than the inferred belief about the opponent's strategy can outperform a larger set of reactive strategies, whereas one that is more generous than the inferred belief is more successful when the benefit-to-cost ratio of mutual cooperation is high. Our analysis reveals how learning the opponent's strategy through Bayesian inference, as opposed to utility maximization, can be beneficial in the long run, in preventing exploitation and eventual invasion by reactive strategies.
    Comment: 13 pages, 9 figures
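A minimal way to sketch the evidence-based belief updating this abstract describes is a Beta posterior over the opponent's cooperation probability. This is an illustrative simplification, not the paper's exact model: a true reactive strategy conditions on the previous move, while for brevity a single posterior is kept here, and the `shift` parameter standing in for "less altruistic than the inferred belief" is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass
class BayesianPlayer:
    """Track a Beta(a, b) posterior over the opponent's probability of
    cooperating, updated from observed actions (illustrative sketch)."""
    a: float = 1.0  # pseudo-count of observed cooperations
    b: float = 1.0  # pseudo-count of observed defections

    def observe(self, opponent_cooperated: bool) -> None:
        # Conjugate Bernoulli/Beta update: increment the matching count.
        if opponent_cooperated:
            self.a += 1.0
        else:
            self.b += 1.0

    def inferred_coop_prob(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.a / (self.a + self.b)

    def act(self, shift: float = -0.1) -> float:
        # Cooperate with a probability slightly below the inferred belief
        # (a "less altruistic" response, as the abstract suggests can pay off).
        return max(0.0, min(1.0, self.inferred_coop_prob() + shift))
```

After observing three cooperations and one defection, the posterior mean is 4/6, and the player cooperates with a slightly lower probability.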

    Faithfulness-boost effect: Loyal teammate selection correlates with skill acquisition improvement in online games

    Get PDF
    The problem of skill acquisition is ubiquitous and fundamental to life. Most tasks in modern society involve cooperation with other subjects. Notwithstanding its fundamental importance, teammate selection is commonly overlooked when studying learning. We exploit the virtually infinite repository of human behavior available on the Internet to study a relevant topic in anthropological science: how grouping strategies may affect learning. We analyze the impact of team-play strategies on skill acquisition using a turn-based game where players can participate individually or in teams. We unveil a subtle but strong effect on skill acquisition based on the way teams are formed and maintained over time. The “faithfulness-boost effect” provides a skill boost during the first games that would otherwise only be acquired after thousands of games. The tendency to play games in teams is associated with a long-run skill improvement, while playing loyally with the same teammate significantly accelerates short-run skill acquisition.

    Learning partner selection rules that sustain cooperation in social dilemmas with the option of opting out

    Get PDF
    We study populations of self-interested agents playing a 2-person repeated Prisoner’s Dilemma game, with each player having the option of opting out of the interaction and choosing to be randomly assigned to another partner instead. The partner-selection component makes these games akin to random matching, where defection is known to take over the entire population. Results in the literature have shown that, when forcing agents to obey a fixed partner-selection rule known as Out-for-Tat, in which ties with defectors are systematically broken, cooperation can be sustained in the long run. In this paper, we remove this assumption and study agents that learn both action- and partner-selection strategies. Through multiagent reinforcement learning, we show that cooperation can be sustained without forcing agents to play predetermined strategies. Our simulations show that agents are capable of learning in-game strategies by themselves, such as Tit-for-Tat. What is more, they are also able to simultaneously discover cooperation-sustaining partner-selection rules, notably Out-for-Tat, as well as other new rules that make cooperation prevail.
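The two rules named in the abstract, Tit-for-Tat for in-game actions and Out-for-Tat for partner selection, can be written down in a few lines. The function names and the `"stay"`/`"opt_out"` encoding below are illustrative choices, not the paper's API; `history` is the list of the partner's past actions, most recent last.

```python
def tit_for_tat(history):
    """Action rule: cooperate in the first round, then mirror the
    partner's most recent action ("C" or "D")."""
    return "C" if not history else history[-1]

def out_for_tat(history):
    """Partner-selection rule: keep the current tie unless the partner
    defected last round, in which case opt out and seek a random rematch."""
    return "stay" if (not history or history[-1] == "C") else "opt_out"
```

In the paper's setting these rules are not hard-coded; the point is that reinforcement learners rediscover behavior of this shape on their own.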

    Cooperation Enforcement and Collusion Resistance in Repeated Public Goods Games

    Full text link
    Enforcing cooperation among self-interested agents is one of the main objectives for multi-agent systems. However, due to the existence of inherent social dilemmas in many scenarios, the free-rider problem may arise during agents' long-run interactions, and matters become even more severe when self-interested agents work in collusion with each other to get extra benefits. It is commonly accepted that in such social dilemmas there exists no simple strategy for an agent whereby she can simultaneously manipulate the utility of each of her opponents and further promote mutual cooperation among all agents. Here, we show that such strategies do exist. Under the conventional repeated public goods game, we identify them and find that, when confronted with such strategies, a single opponent can maximize his utility only via global cooperation, and any colluding alliance cannot get the upper hand. Since full cooperation is individually optimal for any single opponent, a stable cooperation among all players can be achieved. Moreover, we experimentally show that these strategies can still promote cooperation even when the opponents are both self-learning and collusive.
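The payoff structure of the conventional linear public goods game the abstract refers to can be sketched as follows. This is the standard textbook form, not the paper's specific parameterization; the multiplier `r` and the contribution vector are illustrative.

```python
def public_goods_payoffs(contributions, r):
    """Linear public goods game: each player i contributes c_i; the pot
    is multiplied by r (1 < r < n) and shared equally among the n players.
    Payoff of player i is the equal share minus her own contribution, so
    contributing 0 (free-riding) dominates individually while full
    contribution maximizes the group's total payoff."""
    n = len(contributions)
    share = r * sum(contributions) / n  # equal share of the multiplied pot
    return [share - c for c in contributions]
```

With four players, contributions [1, 1, 1, 0] and r = 3, the free rider earns 2.25 while each contributor earns 1.25, which is the free-rider problem the enforcing strategies above are designed to defeat.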

    Reinforcement Learning Dynamics in Social Dilemmas

    Get PDF
    In this paper we replicate and advance Macy and Flache's (2002; Proc. Natl. Acad. Sci. USA, 99, 7229–7236) work on the dynamics of reinforcement learning in 2×2 (2-player, 2-strategy) social dilemmas. In particular, we provide further insight into the solution concepts that they describe, illustrate some recent analytical results on the dynamics of their model, and discuss the robustness of such results to occasional mistakes made by players in choosing their actions (i.e. trembling hands). It is shown here that the dynamics of their model are strongly dependent on the speed at which players learn. With high learning rates the system quickly reaches its asymptotic behaviour; on the other hand, when learning rates are low, two distinctively different transient regimes can be clearly observed. It is shown that the inclusion of small quantities of randomness in players' decisions can change the dynamics of the model dramatically.
    Keywords: Reinforcement Learning; Replication; Game Theory; Social Dilemmas; Agent-Based; Slow Learning
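The reinforcement learning dynamic in Macy and Flache-style models is the Bush-Mosteller update: the propensity to cooperate moves toward the action just taken when its payoff exceeds the player's aspiration, and away from it otherwise. The sketch below assumes the caller scales payoffs so the stimulus lies in [-1, 1] (keeping the propensity in [0, 1]); the function name and argument names are illustrative.

```python
def bush_mosteller_update(p, action, payoff, aspiration, learning_rate):
    """One Bush-Mosteller step on the cooperation propensity p.
    A positive stimulus reinforces the action just taken ("C" or "D");
    a negative stimulus inhibits it. Assumes the scaled stimulus is
    within [-1, 1], so the returned propensity stays in [0, 1]."""
    stimulus = learning_rate * (payoff - aspiration)
    if action == "C":
        # Satisfying outcome pushes p toward 1; disappointing toward 0.
        return p + (1 - p) * stimulus if stimulus >= 0 else p + p * stimulus
    else:
        # Satisfying defection pushes p toward 0; disappointing toward 1.
        return p - p * stimulus if stimulus >= 0 else p - (1 - p) * stimulus
```

The "slow learning" regimes discussed in the abstract correspond to small values of `learning_rate`, where the propensity drifts gradually and distinct transients become visible.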