This paper extends the classic two-armed bandit problem to a many-agent setting in which N players each face the same experimentation problem. Unlike in the single-agent problem, agents can now learn from the experiments of others. Experimentation therefore produces a public good, and a free-rider problem in experimentation naturally arises. More interestingly, future experimentation by others encourages current individual experimentation. The paper analyzes the set of Markov equilibria in terms of the free-rider effect and the encouragement effect.