Grounding Language for Transfer in Deep Reinforcement Learning
In this paper, we explore the utilization of natural language to drive
transfer for reinforcement learning (RL). Despite the widespread application
of deep RL techniques, learning generalized policy representations that work
across domains remains a challenging problem. We demonstrate that textual
descriptions of environments provide a compact intermediate channel to
facilitate effective policy transfer. Specifically, by learning to ground the
meaning of text to the dynamics of the environment such as transitions and
rewards, an autonomous agent can effectively bootstrap policy learning on a new
domain given its description. We employ a model-based RL approach consisting of
a differentiable planning module, a model-free component and a factorized state
representation to effectively use entity descriptions. Our model outperforms
prior work on both transfer and multi-task scenarios in a variety of different
environments. For instance, we achieve up to 14% and 11.5% absolute improvement
over previously existing models in terms of average and initial rewards,
respectively.
Comment: JAIR 201
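The factorized, text-grounded state representation the abstract describes can be sketched roughly as follows: each entity in the environment is represented not by an opaque id but by an embedding of its textual description, so an agent dropped into a new domain can reuse what it learned about similarly described entities. The vocabulary, entity names, and mean-pooling choice below are illustrative assumptions, not the paper's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary and entity descriptions (for illustration only).
vocab = {"enemy": 0, "that": 1, "chases": 2, "you": 3, "static": 4, "wall": 5}
descriptions = {
    "E1": ["enemy", "that", "chases", "you"],
    "E2": ["static", "wall"],
}

emb_dim = 8
word_emb = rng.normal(size=(len(vocab), emb_dim))

def entity_embedding(tokens):
    """Mean-pool word embeddings of an entity's textual description."""
    idx = [vocab[t] for t in tokens]
    return word_emb[idx].mean(axis=0)

def factorized_state(grid, entity_ids):
    """Replace each entity id in a grid with its description embedding,
    yielding an (H, W, emb_dim) tensor a policy network can consume."""
    H, W = grid.shape
    state = np.zeros((H, W, emb_dim))
    for i in range(H):
        for j in range(W):
            eid = grid[i, j]
            if eid in entity_ids:
                state[i, j] = entity_embedding(descriptions[eid])
    return state

# A 2x3 toy grid: "." marks empty cells; E1 appears twice.
grid = np.array([["E1", ".", "E2"], [".", ".", "E1"]], dtype=object)
state = factorized_state(grid, set(descriptions))  # shape (2, 3, 8)
```

Because the state is built from description embeddings, a new domain whose entities come with text ("enemy that chases you") maps onto the same representation space, which is what lets the planner and model-free component transfer.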
Pretraining the Vision Transformer using self-supervised methods for vision based Deep Reinforcement Learning
The Vision Transformer architecture has been shown to be competitive in the
computer vision (CV) space, where it has dethroned convolution-based networks in
several benchmarks. Nevertheless, Convolutional Neural Networks (CNNs) remain
the preferred architecture for the representation module in Reinforcement
Learning. In this work, we study pretraining a Vision Transformer using several
state-of-the-art self-supervised methods and assess data-efficiency gains from
this training framework. We propose a new self-supervised learning method
called TOV-VICReg that extends VICReg to better capture temporal relations
between observations by adding a temporal order verification task. Furthermore,
we evaluate the resultant encoders with Atari games in a sample-efficiency
regime. Our results show that the vision transformer, when pretrained with
TOV-VICReg, outperforms the other self-supervised methods but still struggles
to match a CNN overall. Nevertheless, it outperforms the CNN in two of the
ten games in our 100k-step evaluation. Ultimately, we believe that
such approaches in Deep Reinforcement Learning (DRL) might be the key to
achieving new levels of performance as seen in natural language processing and
computer vision. Source code will be available at:
https://github.com/mgoulao/TOV-VICRe
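The temporal order verification task that TOV-VICReg adds can be sketched abstractly: a binary head scores whether two encoded observations are presented in their true temporal order, and the resulting classification loss pushes the encoder to capture temporal structure. The linear encoder, scoring head, and loss details below are stand-in assumptions, not the paper's Vision Transformer implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(obs, W):
    """Toy linear encoder standing in for the pretrained Vision Transformer."""
    return np.tanh(obs @ W)

def order_logit(z_a, z_b, V):
    """Binary verification head: score whether z_a temporally precedes z_b."""
    return np.concatenate([z_a, z_b]) @ V

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

obs_dim, rep_dim = 16, 8
W = rng.normal(size=(obs_dim, rep_dim))   # encoder weights
V = rng.normal(size=(2 * rep_dim,))       # verification head weights

# Two consecutive observations from a trajectory (random stand-ins here).
o_t, o_t1 = rng.normal(size=obs_dim), rng.normal(size=obs_dim)
z_t, z_t1 = encoder(o_t, W), encoder(o_t1, W)

# Present the pair in correct and shuffled order; binary cross-entropy over
# these two predictions is the temporal order verification loss, which would
# be added to the VICReg objective during pretraining.
p_ordered = sigmoid(order_logit(z_t, z_t1, V))    # target label 1
p_shuffled = sigmoid(order_logit(z_t1, z_t, V))   # target label 0
loss = -(np.log(p_ordered) + np.log(1.0 - p_shuffled)) / 2
```

Solving this task requires the representations to distinguish "before" from "after", which is the temporal relation the abstract says plain VICReg under-captures.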