Pre-trained Word Embeddings for Goal-conditional Transfer Learning in Reinforcement Learning
Reinforcement learning (RL) algorithms typically start tabula rasa, without
any prior knowledge of the environment and without any prior skills. This,
however, often leads to low sample efficiency, requiring a large amount of
interaction with the environment. This is especially true in a lifelong
learning setting, in which the agent needs to continually extend its
capabilities. In this paper, we examine how a pre-trained task-independent
language model can make a goal-conditional RL agent more sample efficient. We
do this by facilitating transfer learning between different related tasks. We
experimentally demonstrate our approach on a set of object navigation tasks.
Comment: Paper accepted to the ICML 2020 Language in Reinforcement Learning
(LaReL) Workshop.
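The abstract above does not spell out the agent's architecture, so the following is only a minimal sketch of why word embeddings could help a goal-conditional agent transfer between related tasks. It assumes the goal is a single word whose embedding is concatenated to the observation. The table `TOY_EMBEDDINGS` and the helpers `policy_input` and `nearest_trained_goal` are hypothetical names, and the vectors are hand-made stand-ins for real pre-trained embeddings.

```python
import math

# Hypothetical 3-d vectors standing in for pre-trained word embeddings.
# Real embeddings would come from a pre-trained language model.
TOY_EMBEDDINGS = {
    "apple": [0.9, 0.1, 0.0],
    "pear": [0.8, 0.2, 0.1],
    "chair": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def policy_input(observation, goal_word):
    """Build a goal-conditional input: observation features + goal embedding."""
    return list(observation) + TOY_EMBEDDINGS[goal_word]


def nearest_trained_goal(new_goal, trained_goals):
    """Return the already-trained goal whose embedding is closest to new_goal."""
    new_vec = TOY_EMBEDDINGS[new_goal]
    return max(trained_goals, key=lambda g: cosine(new_vec, TOY_EMBEDDINGS[g]))


if __name__ == "__main__":
    # A policy trained on "apple" and "chair" sees "pear" as a near neighbour
    # of "apple", which is the intuition behind transfer between related goals.
    print(nearest_trained_goal("pear", ["apple", "chair"]))
    print(policy_input([0.0, 1.0], "pear"))
```

Because related goal words sit close together in embedding space, a network that consumes `policy_input` receives similar inputs for similar goals, which is one plausible route to the sample-efficiency gain the abstract describes.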
Diverse Exploration via InfoMax Options
In this paper, we study the problem of autonomously discovering temporally
abstracted actions, or options, for exploration in reinforcement learning. For
learning diverse options suitable for exploration, we introduce the infomax
termination objective defined as the mutual information between options and
their corresponding state transitions. We derive a scalable optimization scheme
for maximizing this objective via the termination condition of options,
yielding the InfoMax Option Critic (IMOC) algorithm. Through illustrative
experiments, we empirically show that IMOC learns diverse options and utilizes
them for exploration. Moreover, we show that IMOC scales well to continuous
control tasks.
Comment: Preprint. Under review.
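To make the quantity in the abstract above concrete, the sketch below computes a plug-in estimate of the mutual information between options and state transitions from tabular samples. This is not the scalable optimization scheme or the IMOC algorithm; it only illustrates that the objective is high when different options produce different transitions. The function name `mutual_information` and the toy samples are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(samples):
    """Plug-in estimate of I(X; Y) in nats from a list of (x, y) pairs."""
    n = len(samples)
    joint = Counter(samples)
    count_x = Counter(x for x, _ in samples)
    count_y = Counter(y for _, y in samples)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (count_x * count_y).
        mi += p_xy * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Each sample pairs an option id with a (state, next_state) transition.
# Diverse options: each option leads to its own transition.
diverse = [(0, ("s0", "s1")), (1, ("s0", "s2"))] * 50
# Redundant options: both options produce the same transition.
redundant = [(0, ("s0", "s1")), (1, ("s0", "s1"))] * 50

if __name__ == "__main__":
    print(mutual_information(diverse))    # equals log(2) for this toy data
    print(mutual_information(redundant))  # equals 0 for this toy data
```

In the toy data, two options that always cause distinct transitions carry one bit (log 2 nats) of information about the transition, while two indistinguishable options carry none.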