Adapting to regularities of the environment is critical for biological
organisms to anticipate events and plan. A prominent example is the circadian
rhythm, through which organisms internalize the 24-hour period of the
Earth's rotation. In this work, we study the emergence of
circadian-like rhythms in deep reinforcement learning agents. In particular, we
deploy agents that solve a foraging task in an environment with a reliable
periodic variation. We systematically characterize the agent's behavior
during learning and demonstrate the emergence of a rhythm that is endogenous
and entrainable. Interestingly, the internal rhythm adapts to shifts in the
phase of the environmental signal without any re-training. Furthermore, we show
via bifurcation and phase response curve analyses how artificial neurons
develop dynamics to support the internalization of the environmental rhythm.
From a dynamical systems view, we demonstrate that the adaptation proceeds by
the emergence of a stable periodic orbit in the neuron dynamics with a phase
response that allows optimal phase synchronization between the agent's
dynamics and the environmental rhythm.

Comment: ICML 202