53,523 research outputs found
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
In open-ended environments, autonomous learning agents must set their own
goals and build their own curriculum through an intrinsically motivated
exploration. They may consider a large diversity of goals, aiming to discover
what is controllable in their environments, and what is not. Because some goals
might prove easy and some impossible, agents must actively select which goal to
practice at any moment, to maximize their overall mastery on the set of
learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a
modular Universal Value Function Approximator with hindsight learning to
achieve a diversity of goals of different kinds within a unique policy and 2)
an automated curriculum learning mechanism that biases the attention of the
agent towards goals maximizing the absolute learning progress. Agents focus
sequentially on goals of increasing complexity, and focus back on goals that
are being forgotten. Experiments conducted in a new modular-goal robotic
environment show the resulting developmental self-organization of a learning
curriculum, and demonstrate properties of robustness to distracting goals,
forgetting and changes in body properties.Comment: Accepted at ICML 201
Neural Task Programming: Learning to Generalize Across Hierarchical Tasks
In this work, we propose a novel robot learning framework called Neural Task
Programming (NTP), which bridges the idea of few-shot learning from
demonstration and neural program induction. NTP takes as input a task
specification (e.g., video demonstration of a task) and recursively decomposes
it into finer sub-task specifications. These specifications are fed to a
hierarchical neural program, where bottom-level programs are callable
subroutines that interact with the environment. We validate our method in three
robot manipulation tasks. NTP achieves strong generalization across sequential
tasks that exhibit hierarchal and compositional structures. The experimental
results show that NTP learns to generalize well to- wards unseen tasks with
increasing lengths, variable topologies, and changing objectives.Comment: ICRA 201
- …