Learning, Experimentation, and Long-Run Behavior in Games

Abstract

This paper investigates a class of population-learning dynamics. In every period, agents either adopt a best reply to the current distribution of actual play, adopt a best reply to a sample, taken with replacement, from the distribution of intended play (the strategies adopted at the end of the previous period), or remain inactive. If sampling with replacement and being inactive both have strictly positive probability, these dynamics converge globally to minimal curb sets in the absence of mistakes. For two-player i × j games with i, j ≤ 3, the same result holds even if only best responding to actual play and being inactive have positive probability. If players make mistakes in the implementation of their strategies, these dynamics select among minimal curb sets.