Goal-Conditioned Reinforcement Learning with Imagined Subgoals
Goal-conditioned reinforcement learning endows an agent with a large variety
of skills, but it often struggles to solve tasks that require more temporally
extended reasoning. In this work, we propose to incorporate imagined subgoals
into policy learning to facilitate learning of complex tasks. Imagined subgoals
are predicted by a separate high-level policy, which is trained simultaneously
with the policy and its critic. This high-level policy predicts intermediate
states halfway to the goal using the value function as a reachability metric.
We don't require the policy to reach these subgoals explicitly. Instead, we use
them to define a prior policy, and incorporate this prior into a KL-constrained
policy iteration scheme to speed up and regularize learning. Imagined subgoals
are used during policy learning, but not during test time, where we only apply
the learned policy. We evaluate our approach on complex robotic navigation and
manipulation tasks and show that it outperforms existing methods by a large
margin.
Comment: ICML 2021. See the project webpage at
https://www.di.ens.fr/willow/research/ris
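As a rough illustration of the KL-constrained update described above (a minimal sketch, not the authors' released code: the policy and prior callables, the subgoal generator, and the weight alpha are illustrative assumptions), the actor can be trained to maximize the goal-conditioned Q-value while staying close to the same policy conditioned on an imagined subgoal:

```python
# Minimal sketch: a KL-regularized actor loss where an imagined subgoal defines
# the prior policy. Both policies are assumed to be diagonal Gaussians over actions.
import numpy as np

def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """KL( N(mu_p, std_p) || N(mu_q, std_q) ) for diagonal Gaussians."""
    var_p, var_q = std_p ** 2, std_q ** 2
    return np.sum(np.log(std_q / std_p) + (var_p + (mu_p - mu_q) ** 2) / (2 * var_q) - 0.5)

def actor_loss(policy, prior_from_subgoal, q_value, state, goal, subgoal, alpha=0.1):
    """KL-constrained policy improvement: maximize Q(s, a, g) while penalizing
    divergence from the policy conditioned on the imagined subgoal."""
    mu, std = policy(state, goal)                              # pi(a | s, g)
    mu_prior, std_prior = prior_from_subgoal(state, subgoal)   # pi(a | s, sg)
    action = mu                                                # update on the policy mean
    return -q_value(state, action, goal) + alpha * gaussian_kl(mu, std, mu_prior, std_prior)

# Toy usage with fixed, hypothetical policies and critic.
policy = lambda s, g: (np.zeros(2), np.ones(2))
prior = lambda s, sg: (0.1 * np.ones(2), np.ones(2))
q = lambda s, a, g: -np.linalg.norm(np.concatenate([s, a]) - np.concatenate([g, np.zeros(2)]))
s, g, sg = np.zeros(2), np.ones(2), 0.5 * np.ones(2)
print(actor_loss(policy, prior, q, s, g, sg))
```

In the method itself the prior comes from evaluating the policy at subgoals sampled from the trained high-level policy; the fixed placeholder prior above only stands in for that step.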
Towards Continual Reinforcement Learning: A Review and Perspectives
In this article, we aim to provide a literature review of different
formulations and approaches to continual reinforcement learning (RL), also
known as lifelong or non-stationary RL. We begin by discussing our perspective
on why RL is a natural fit for studying continual learning. We then provide a
taxonomy of different continual RL formulations and mathematically characterize
the non-stationary dynamics of each setting. We go on to discuss evaluation of
continual RL agents, providing an overview of benchmarks used in the literature
and important metrics for understanding agent performance. Finally, we
highlight open problems and challenges in bridging the gap between the current
state of continual RL and findings in neuroscience. While still in its early
days, the study of continual RL has the promise to develop better incremental
reinforcement learners that can function in increasingly realistic applications
where non-stationarity plays a vital role. These include applications such as
those in the fields of healthcare, education, logistics, and robotics.
Comment: Preprint, 52 pages, 8 figures
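To make the non-stationary dynamics mentioned above concrete (a hedged sketch under assumed names and an assumed drift schedule, not an example taken from the survey), consider a bandit whose reward function switches between tasks, which a learner with a constant step size can keep tracking:

```python
# Minimal sketch of non-stationarity in continual RL: a two-armed bandit whose
# reward means switch periodically, so a learner that stops adapting goes stale.
import numpy as np

rng = np.random.default_rng(0)

def reward_means(t, period=500):
    """Piecewise-stationary reward means: the better arm switches each period."""
    phase = (t // period) % 2
    return np.array([0.8, 0.2]) if phase == 0 else np.array([0.2, 0.8])

# Epsilon-greedy learner with a constant step size, which slowly forgets old tasks.
q = np.zeros(2)
alpha, eps = 0.1, 0.1
total = 0.0
for t in range(2000):
    arm = rng.integers(2) if rng.random() < eps else int(np.argmax(q))
    r = rng.normal(reward_means(t)[arm], 0.1)
    q[arm] += alpha * (r - q[arm])
    total += r
print(f"average reward under drifting dynamics: {total / 2000:.3f}")
```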
Stabilizing Contrastive RL: Techniques for Offline Goal Reaching
In the same way that the computer vision (CV) and natural language processing
(NLP) communities have developed self-supervised methods, reinforcement
learning (RL) can be cast as a self-supervised problem: learning to reach any
goal, without requiring human-specified rewards or labels. However, actually
building a self-supervised foundation for RL faces some important challenges.
Building on prior contrastive approaches to this RL problem, we conduct careful
ablation experiments and discover that a shallow and wide architecture,
combined with careful weight initialization and data augmentation, can
significantly boost the performance of these contrastive RL approaches on
challenging simulated benchmarks. Additionally, we demonstrate that, with these
design decisions, contrastive approaches can solve real-world robotic
manipulation tasks, with tasks being specified by a single goal image provided
after training.
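As a rough sketch of the contrastive construction this line of work builds on (an InfoNCE-style critic over state-action and goal embeddings; the shapes, batch construction, and the shallow, wide encoder below are illustrative assumptions, not the paper's implementation):

```python
# Minimal sketch of a contrastive goal-reaching critic: state-action and goal
# embeddings are scored by an inner product and trained with an InfoNCE-style
# loss, where each (s, a) pair's positive is a goal reached later in its trajectory.
import numpy as np

rng = np.random.default_rng(0)

def embed(x, w):
    """One wide, shallow layer, in the spirit of the ablations in the abstract."""
    return np.tanh(x @ w)

def info_nce_loss(sa, goals, w_sa, w_g):
    """Cross-entropy over the batch: logits[i, j] = <phi(sa_i), psi(g_j)>,
    with the diagonal (the goal actually reached) as the positive pair."""
    phi, psi = embed(sa, w_sa), embed(goals, w_g)
    logits = phi @ psi.T
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 8 state-action pairs and the goals reached later in their trajectories.
batch, d_sa, d_g, d_emb = 8, 6, 4, 64
w_sa, w_g = rng.normal(size=(d_sa, d_emb)), rng.normal(size=(d_g, d_emb))
sa, goals = rng.normal(size=(batch, d_sa)), rng.normal(size=(batch, d_g))
print(f"InfoNCE loss on a random batch: {info_nce_loss(sa, goals, w_sa, w_g):.3f}")
```

At test time such a critic scores actions against a single goal image, which matches the goal-image task specification described in the abstract.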
- …