There has been intense interest in hierarchical reinforcement learning as a way to make Markov decision process planning more tractable, but there has been relatively little work on autonomously learning the hierarchy, especially in continuous domains. In this paper we present a method for learning a hierarchy of actions in a continuous environment. Our approach is to learn a qualitative representation of the continuous environment and then to define actions to reach qualitative states. Our method learns one or more options to perform each action. Each option is learned by first learning a dynamic Bayesian network (DBN). We approach this problem from a developmental robotics perspective. The agent receives no extrinsic reward and has no external direction for what to learn. We evaluate our work using a simulation with realistic physics that consists of a robot playing with blocks at a table.
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.