We investigate the problem of learning to play the game of rock–paper–scissors. Each player attempts to improve her/his average score by adjusting the frequency of the three possible responses, using reinforcement learning. For the zero sum game the learning process displays Hamiltonian chaos. Thus, the learning trajectory can be simple or complex, depending on initial conditions. We also investigate the non-zero sum case and show that it can give rise to chaotic transients. This is, to our knowledge, the first demonstration of Hamiltonian chaos in learning a basic two-person game, extending earlier findings of chaotic attractors in dissipative systems. As we argue here, chaos provides an important self-consistency condition for determining when players will learn to behave as though they were fully rational. That chaos can occur in learning a simple game indicates one should use caution in assuming real people will learn to play a game according to a Nash equilibrium strategy
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.