Learning with bounded memory.
The paper studies the infinite repetition of finite strategic-form games. Players use a learning behavior and face bounds on their cognitive capacities. We show that, for any given belief-probability over the set of possible outcomes where players have no experience, games can be payoff-classified and there always exists a stationary state in the space of action profiles. In particular, if the belief-probability assumes all possible outcomes without experience to be equally likely, then in one class of Prisoners' Dilemmas, where the average defecting payoff is higher than the cooperative payoff and the average cooperative payoff is lower than the defecting payoff, play converges in the long run to the static Nash equilibrium, while in the other class of Prisoners' Dilemmas, where the reverse holds, play converges to cooperation. Results are applied to a large class of 2 x 2 games.
Keywords: Cognitive complexity; Bounded logistic quantal response learning; Long run outcomes
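The payoff classification in this abstract can be made concrete with a small sketch. Assuming the standard Prisoner's Dilemma labels T > R > P > S (the paper's exact formalization may differ), "average defecting payoff" is read here as (T+P)/2 and "average cooperative payoff" as (R+S)/2:

```python
# Hedged sketch of the payoff classification described in the abstract,
# under the assumption of standard PD payoff labels T > R > P > S.

def classify_pd(T, R, P, S):
    """Return the predicted long-run outcome under uniform initial beliefs."""
    assert T > R > P > S, "not a Prisoner's Dilemma"
    avg_defect = (T + P) / 2   # average payoff of a defector over opponent actions
    avg_coop = (R + S) / 2     # average payoff of a cooperator
    if avg_defect > R and avg_coop < P:
        return "converges to static Nash (defection)"
    if avg_defect < R and avg_coop > P:
        return "converges to cooperation"
    return "mixed/boundary case"

print(classify_pd(T=5, R=3, P=2, S=0))  # converges to static Nash (defection)
print(classify_pd(T=5, R=4, P=1, S=0))  # converges to cooperation
```

The two printed cases illustrate the two classes the abstract distinguishes; payoff values are illustrative, not taken from the paper.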
Inferring to C or not to C: Evolutionary games with Bayesian inferential strategies
Strategies for sustaining cooperation and preventing exploitation by selfish agents in repeated games have mostly been restricted to Markovian strategies, where the response of an agent depends on the actions in the previous round. Such strategies are characterized by a lack of learning. However, learning from evidence accumulated over time and using that evidence to dynamically update our response is a key feature of living organisms. Bayesian inference provides a framework for such evidence-based learning mechanisms. It is therefore imperative to understand how strategies based on Bayesian learning fare in repeated games against Markovian strategies. Here, we consider a scenario where the Bayesian player uses the accumulated evidence of the opponent's actions over several rounds to continuously update her belief about the reactive opponent's strategy. The Bayesian player can then act on her inferred belief in different ways. By studying repeated Prisoner's Dilemma games with such Bayesian inferential strategies, in both infinite and finite populations, we identify the conditions under which such strategies can be evolutionarily stable. We find that a Bayesian strategy that is less altruistic than the inferred belief about the opponent's strategy can outperform a larger set of reactive strategies, whereas one that is more generous than the inferred belief is more successful when the benefit-to-cost ratio of mutual cooperation is high. Our analysis reveals how learning the opponent's strategy through Bayesian inference, as opposed to utility maximization, can be beneficial in the long run in preventing exploitation and eventual invasion by reactive strategies.
Comment: 13 pages, 9 figures
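The belief-updating mechanism described here can be sketched with a conjugate Beta-Bernoulli model. This is a minimal illustration, not the paper's exact model: the player tracks a Beta posterior over the opponent's probability of cooperating, and a "generosity" offset g (an assumed parameter) shifts her own action relative to the inferred belief, with g < 0 corresponding to a less altruistic strategy and g > 0 to a more generous one:

```python
import random

# Hedged sketch (not the paper's exact model): a Bayesian player maintains
# a Beta(a, b) posterior over the opponent's cooperation probability and
# acts on the posterior mean shifted by a generosity offset g.

class BayesianPlayer:
    def __init__(self, generosity=0.0):
        self.a, self.b = 1.0, 1.0   # uniform Beta(1, 1) prior
        self.g = generosity

    def observe(self, opponent_cooperated):
        """Conjugate update after seeing one opponent action."""
        if opponent_cooperated:
            self.a += 1
        else:
            self.b += 1

    def inferred_coop_prob(self):
        return self.a / (self.a + self.b)  # posterior mean

    def act(self):
        """Cooperate with the inferred probability shifted by g, clipped to [0, 1]."""
        p = min(1.0, max(0.0, self.inferred_coop_prob() + self.g))
        return random.random() < p  # True = cooperate

player = BayesianPlayer(generosity=-0.1)  # less altruistic than the belief
for move in [True, True, False, True]:    # observed opponent actions
    player.observe(move)
print(round(player.inferred_coop_prob(), 3))  # Beta(4, 2) mean = 0.667
```

After three observed cooperations and one defection, the posterior mean is 4/6, and the less-altruistic player cooperates with probability about 0.567.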
Faithfulness-boost effect: Loyal teammate selection correlates with skill acquisition improvement in online games
The problem of skill acquisition is ubiquitous and fundamental to life. Most tasks in modern society involve cooperation with other subjects. Notwithstanding its fundamental importance, teammate selection is commonly overlooked when studying learning. We exploit the virtually infinite repository of human behavior available on the Internet to study a relevant topic in anthropological science: how grouping strategies may affect learning. We analyze the impact of team-play strategies on skill acquisition using a turn-based game where players can participate individually or in teams. We unveil a subtle but strong effect on skill acquisition based on the way teams are formed and maintained over time. The "faithfulness-boost effect" provides a skill boost during the first games that would otherwise only be acquired after thousands of games. The tendency to play games in teams is associated with a long-run skill improvement, while playing loyally with the same teammate significantly accelerates short-run skill acquisition.
Fil: Landfried, Gustavo Andrés; Fernández Slezak, Diego; Mocskos, Esteban Eduardo. Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Departamento de Computación; Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET); Argentina.
Learning partner selection rules that sustain cooperation in social dilemmas with the option of opting out
We study populations of self-interested agents playing a 2-person repeated Prisoner's Dilemma game, with each player having the option of opting out of the interaction and choosing to be randomly assigned to another partner instead. The partner-selection component makes these games akin to random matching, where defection is known to take over the entire population. Results in the literature have shown that, when agents are forced to obey a set partner-selection rule known as Out-for-Tat, under which ties with defectors are systematically broken, cooperation can be sustained in the long run. In this paper, we remove this assumption and study agents that learn both action- and partner-selection strategies. Through multiagent reinforcement learning, we show that cooperation can be sustained without forcing agents to play predetermined strategies. Our simulations show that agents are capable of learning in-game strategies by themselves, such as Tit-for-Tat. What is more, they are also able to simultaneously discover cooperation-sustaining partner-selection rules, notably Out-for-Tat, as well as other new rules that make cooperation prevail.
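The Out-for-Tat rule referenced above is simple enough to sketch directly. This is a hand-coded baseline of the rule, not the learned policies from the paper: an agent stays with its current partner after cooperation and opts out after defection, re-entering a random-matching pool:

```python
import random

# Hedged sketch of the Out-for-Tat partner-selection rule: stay with a
# cooperator, break ties with a defector and rejoin the matching pool.
# The rematch helper is an illustrative assumption about the pool mechanics.

def out_for_tat(partner_last_action):
    """Return 'stay' if the partner cooperated ('C'), 'opt_out' otherwise."""
    return "stay" if partner_last_action == "C" else "opt_out"

def rematch(pool, rng=random):
    """Randomly pair up agents who opted out (pool size assumed even)."""
    rng.shuffle(pool)
    return [(pool[i], pool[i + 1]) for i in range(0, len(pool), 2)]

print(out_for_tat("C"))  # stay
print(out_for_tat("D"))  # opt_out
print(rematch([1, 2, 3, 4]))
```

In the paper's setting this rule emerges from learning rather than being imposed; the sketch only fixes the behavior the learners rediscover.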
Cooperation Enforcement and Collusion Resistance in Repeated Public Goods Games
Enforcing cooperation among self-interested agents is one of the main objectives for multi-agent systems. However, due to the existence of inherent social dilemmas in many scenarios, the free-rider problem may arise during agents' long-run interactions, and things become even more severe when self-interested agents work in collusion with each other to get extra benefits. It is commonly accepted that in such social dilemmas there exists no simple strategy whereby an agent can simultaneously manipulate the utility of each of her opponents and further promote mutual cooperation among all agents. Here, we show that such strategies do exist. In the conventional repeated public goods game, we identify them and find that, when confronted with such strategies, a single opponent can maximize his utility only via global cooperation and any colluding alliance cannot get the upper hand. Since full cooperation is individually optimal for any single opponent, a stable cooperation among all players can be achieved. Moreover, we experimentally show that these strategies can still promote cooperation even when the opponents are both self-learning and collusive.
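The stage game underlying this abstract is the standard linear public goods game. A minimal sketch of its payoffs, with an illustrative multiplier r and endowment (the abstract gives no parameter values), shows why the free-rider problem arises:

```python
# Hedged sketch of a linear public goods game (parameters r and the
# endowment are illustrative assumptions, not taken from the paper).
# Each of n players contributes c_i from an endowment; contributions are
# multiplied by r and the pot is shared equally among all players.

def public_goods_payoffs(contributions, endowment=1.0, r=1.6):
    n = len(contributions)
    pot = r * sum(contributions)
    share = pot / n
    return [endowment - c + share for c in contributions]

# With r/n < 1, each unit contributed returns less than it costs the
# contributor, so free-riding pays at the individual level even though
# full contribution maximizes the group total.
print(public_goods_payoffs([1.0, 1.0, 0.0, 0.0]))
```

Here the two free-riders each earn a full unit more than the two contributors, which is the dilemma the paper's enforcing strategies are designed to overcome.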
Reinforcement Learning Dynamics in Social Dilemmas
In this paper we replicate and advance Macy and Flache's (2002; Proc. Natl. Acad. Sci. USA, 99, 7229–7236) work on the dynamics of reinforcement learning in 2×2 (2-player, 2-strategy) social dilemmas. In particular, we provide further insight into the solution concepts that they describe, illustrate some recent analytical results on the dynamics of their model, and discuss the robustness of such results to occasional mistakes made by players in choosing their actions (i.e. trembling hands). It is shown here that the dynamics of their model are strongly dependent on the speed at which players learn. With high learning rates the system quickly reaches its asymptotic behaviour; on the other hand, when learning rates are low, two distinctly different transient regimes can be clearly observed. It is shown that the inclusion of small quantities of randomness in players' decisions can change the dynamics of the model dramatically.
Keywords: Reinforcement Learning; Replication; Game Theory; Social Dilemmas; Agent-Based; Slow Learning
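The aspiration-based reinforcement learning that Macy and Flache study can be sketched in Bush-Mosteller form. The aspiration level, learning rate, and normalization constant below are illustrative assumptions, and the trembling-hands mechanism is the small randomness the abstract refers to:

```python
import random

# Hedged sketch of Bush-Mosteller reinforcement learning of the kind
# Macy and Flache study (aspiration A, learning rate l, and the payoff
# normalization max_dev are illustrative, not the paper's exact values).

def bm_update(p, action_was_C, payoff, aspiration=2.0, l=0.5, max_dev=3.0):
    """Return the updated probability of cooperating.

    A stimulus s = l * (payoff - A) / max|payoff - A| reinforces the action
    just taken when positive and inhibits it when negative.
    """
    s = l * (payoff - aspiration) / max_dev   # stimulus in [-l, l]
    if action_was_C:
        return p + (1 - p) * s if s >= 0 else p + p * s
    else:
        return p - p * s if s >= 0 else p - (1 - p) * s

def choose(p, eps=0.05, rng=random):
    """Cooperate with prob p, but 'tremble' to a uniform action with prob eps."""
    if rng.random() < eps:
        return rng.random() < 0.5
    return rng.random() < p

p = 0.5
p = bm_update(p, action_was_C=True, payoff=3.0)   # reward above aspiration
print(round(p, 3))  # 0.583: probability of cooperating rises
```

Lowering `l` slows the approach to the asymptotic regime, which is the learning-rate dependence the abstract highlights, and `eps > 0` implements the trembling hands.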