Learning in rent-seeking contests with payoff risk and foregone payoff information
We test whether deviations from Nash equilibrium in rent-seeking contests can be explained by the slow convergence of payoff-based learning. We identify and eliminate two noise sources that slow down learning: first, opponents change their actions across rounds; second, payoffs are probabilistic, which reduces the correlation between expected and realized payoffs. We find that average choices are not significantly different from the risk-neutral Nash equilibrium predictions only when both noise sources are eliminated, by supplying foregone payoff information and removing payoff risk. Payoff-based learning explains these results better than alternative theories. We propose a hybrid learning model that combines reinforcement and belief learning with risk, social, and other preferences, and show that it fits the data well, mostly because of its reinforcement learning component.
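As a point of reference for the risk-neutral benchmark mentioned above, a minimal sketch: in a symmetric n-player Tullock contest with prize V and linear effort cost, the standard risk-neutral Nash effort is x* = (n − 1)V/n², and the all-or-nothing realization of the prize illustrates the payoff risk the experiment removes. The function names and the 4-player parameterization are illustrative assumptions, not the paper's design.

```python
import random

def tullock_nash_effort(n: int, prize: float) -> float:
    """Symmetric risk-neutral Nash effort in an n-player Tullock
    contest with linear cost: x* = (n - 1) * V / n**2."""
    return (n - 1) * prize / n**2

def realized_payoff(efforts: list[float], prize: float, i: int) -> float:
    """One probabilistic realization: player i wins the whole prize
    with probability x_i / sum(x), and always pays her effort cost."""
    total = sum(efforts)
    p_win = efforts[i] / total if total > 0 else 1.0 / len(efforts)
    win = random.random() < p_win
    return (prize if win else 0.0) - efforts[i]

# Example: 4 players, prize 100 -> Nash effort 18.75 each, expected
# payoff 6.25, but any single realization is either 81.25 or -18.75,
# which is the noise that slows payoff-based learning.
n, V = 4, 100.0
x_star = tullock_nash_effort(n, V)
print(x_star)                                  # 18.75
print(realized_payoff([x_star] * n, V, 0))     # 81.25 or -18.75
```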
Experience-weighted Attraction Learning in Normal Form Games
In ‘experience-weighted attraction’ (EWA) learning, strategies have attractions that reflect initial predispositions, are updated based on payoff experience, and determine choice probabilities according to some rule (e.g., logit). A key feature is a parameter δ that weights the strength of hypothetical reinforcement of strategies that were not chosen, according to the payoff they would have yielded, relative to reinforcement of chosen strategies according to received payoffs. The other key features are two discount rates, φ and ρ, which separately discount previous attractions and an experience weight. EWA includes reinforcement learning and weighted fictitious play (belief learning) as special cases, and hybridizes their key elements. When δ = 0 and ρ = 0, cumulative choice reinforcement results. When δ = 1 and ρ = φ, levels of reinforcement of strategies are exactly the same as expected payoffs given weighted fictitious play beliefs. Using three sets of experimental data, parameter estimates of the model were calibrated on part of the data and used to predict a holdout sample. Estimates of δ are generally around 0.50, φ around 0.8 to 1, and ρ varies from 0 to φ. Reinforcement and belief-learning special cases are generally rejected in favor of EWA, though belief models do better in some constant-sum games. EWA combines the best features of previous approaches, allowing attractions to begin and grow flexibly as choice reinforcement does, but reinforcing unchosen strategies substantially, as belief-based models implicitly do.
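A minimal sketch of the attraction update and logit choice rule described above, following the standard Camerer-Ho formulation A_j(t) = [φ·N(t−1)·A_j(t−1) + (δ + (1−δ)·1{j chosen})·π_j(t)] / N(t) with experience weight N(t) = ρ·N(t−1) + 1; the variable names and the two-strategy example are illustrative assumptions.

```python
import math

def ewa_update(attractions, n_exp, chosen, payoffs, phi, rho, delta):
    """One EWA step. `payoffs[j]` is the payoff strategy j would have
    earned against the opponents' realized actions; `chosen` is the
    index actually played. Unchosen strategies are reinforced by
    delta times their foregone payoff, the chosen one by its
    received payoff."""
    n_new = rho * n_exp + 1.0
    new_attractions = [
        (phi * n_exp * a + (1.0 if j == chosen else delta) * payoffs[j]) / n_new
        for j, a in enumerate(attractions)
    ]
    return new_attractions, n_new

def logit_probs(attractions, lam):
    """Logit choice rule: P(j) proportional to exp(lam * A_j)."""
    exps = [math.exp(lam * a) for a in attractions]
    total = sum(exps)
    return [e / total for e in exps]

# delta=0, rho=0 reduces to cumulative choice reinforcement;
# delta=1, rho=phi reproduces weighted fictitious play expected payoffs.
A, N = [0.0, 0.0], 1.0
A, N = ewa_update(A, N, chosen=0, payoffs=[3.0, 5.0],
                  phi=0.9, rho=0.9, delta=0.5)
print(A, logit_probs(A, lam=1.0))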
Self-tuning experience weighted attraction learning in games
Self-tuning experience weighted attraction (EWA) is a one-parameter theory of learning in games. It addresses the criticism that an earlier model (EWA) has too many parameters by fixing some parameters at plausible values and replacing others with functions of experience, so that they no longer need to be estimated. Consequently, it is econometrically simpler than the popular weighted fictitious play and reinforcement learning models. The functions of experience that replace the free parameters "self-tune" over time, adjusting in a way that selects a sensible learning rule to capture subjects' choice dynamics. For instance, the self-tuning EWA model can turn from weighted fictitious play into averaging reinforcement learning as subjects equilibrate and learn to ignore inferior foregone payoffs. The theory was tested on seven different games and compared to the earlier parametric EWA model and a one-parameter stochastic equilibrium theory (QRE). Self-tuning EWA predicts behavior in new games as well as EWA does, even though it has fewer parameters, and fits reliably better than the QRE equilibrium benchmark.
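A hedged sketch of the kind of self-tuning functions the abstract describes: an attention function that plays the role of δ by reinforcing only strategies whose foregone payoff would have matched or beaten the realized payoff, and a change detector that plays the role of φ by discounting history when the opponent's recent play diverges from its cumulative frequency. The forms below follow the spirit of the Ho-Camerer-Chong specification, but the exact windows and normalizations are simplifying assumptions.

```python
def self_tuned_delta(foregone, realized):
    """Attention function replacing the free parameter delta:
    a strategy is reinforced only if its foregone payoff matches or
    beats the realized payoff. The chosen strategy's foregone payoff
    equals the realized payoff, so it always gets weight 1.
    (Assumed simplification of the self-tuning delta.)"""
    return [1.0 if f >= realized else 0.0 for f in foregone]

def self_tuned_phi(history_freq, recent_freq):
    """Change detector replacing the free parameter phi:
    phi = 1 - 0.5 * squared distance between the opponent's cumulative
    and recent action frequencies. Stable play gives phi near 1 (long
    memory); an apparent regime shift shrinks phi toward 0."""
    surprise = sum((h - r) ** 2 for h, r in zip(history_freq, recent_freq))
    return 1.0 - 0.5 * surprise

# Opponent has mostly played action 0 but just switched to action 1:
print(self_tuned_phi([0.9, 0.1], [0.0, 1.0]))            # 0.19: discount the past
print(self_tuned_delta([4.0, 2.0], realized=3.0))        # [1.0, 0.0]
```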
Cooperation Enforcement and Collusion Resistance in Repeated Public Goods Games
Enforcing cooperation among self-interested agents is one of the main objectives of multi-agent systems. However, because many scenarios contain inherent social dilemmas, the free-rider problem may arise during agents' long-run interactions, and it becomes even more severe when self-interested agents collude with each other to obtain extra benefits. It is commonly accepted that in such social dilemmas there exists no simple strategy whereby an agent can simultaneously manipulate the utility of each of her opponents and promote mutual cooperation among all agents. Here, we show that such strategies do exist. In the conventional repeated public goods game, we identify them and find that, when confronted with such strategies, a single opponent can maximize his utility only via global cooperation, and no colluding alliance can get the upper hand. Since full cooperation is individually optimal for any single opponent, stable cooperation among all players can be achieved. Moreover, we experimentally show that these strategies still promote cooperation even when the opponents are both self-learning and collusive.
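To make the underlying social dilemma concrete, a minimal sketch of the stage payoffs in a linear public goods game, π_i = (r/n)·Σ_j c_j − c_i with enhancement factor 1 < r < n; the player count and parameter values below are illustrative assumptions, not the paper's setup.

```python
def public_goods_payoffs(contributions, r):
    """Linear public goods game: each agent's payoff is her equal
    share of the multiplied common pool minus her own contribution,
    pi_i = (r / n) * sum(c) - c_i.  With 1 < r < n, every unit
    contributed returns only r/n < 1 to the contributor, so
    free-riding dominates even though all-contribute beats
    all-defect for everyone."""
    n = len(contributions)
    pool_share = r * sum(contributions) / n
    return [pool_share - c for c in contributions]

# 4 players, r = 2: full cooperation pays 1.0 each ...
print(public_goods_payoffs([1, 1, 1, 1], r=2))  # [1.0, 1.0, 1.0, 1.0]
# ... but a lone free-rider does strictly better than the cooperators.
print(public_goods_payoffs([0, 1, 1, 1], r=2))  # [1.5, 0.5, 0.5, 0.5]
```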