
Strategic Experimentation: The Case of the Poisson Bandits

Abstract

This paper studies a game of strategic experimentation in which the players learn from the experiments of others as well as their own. We first establish the efficient benchmark where the players co-ordinate in order to maximise joint expected payoffs, and then show that, because of free-riding, the strategic problem leads to inefficiently low levels of experimentation in any equilibrium when the players use stationary Markovian strategies. Efficiency can be approximately retrieved provided that the players adopt strategies which slow down the rate at which information is acquired; this is achieved by their taking periodic breaks from experimenting, which get progressively longer. In the public information case (actions and experimental outcomes are both observable), we exhibit a class of non-stationary equilibria in which the ε-efficient amount of experimentation is performed, but only in infinite time. In the private information case (only actions are observable, not outcomes), the breaks have two additional effects: not only do they enable the players to finesse the inference problem, but also they serve to signal their experimental outcome to the other player. We describe an equilibrium with similar non-stationary strategies in which the ε-efficient amount of experimentation is again performed in infinite time, but with a faster rate of information acquisition. The equilibrium rate of information acquisition is slower in the former case because the short-run temptation to free-ride on information acquisition is greater when information is public.
