The Effectiveness of World Models for Continual Reinforcement Learning

Bortkiewicz, Michał; Kessler, Samuel; Miłoś, Piotr; Ostaszewski, Mateusz; Parker-Holder, Jack; Roberts, Stephen J.; Wołczyk, Maciej; Żarski, Mateusz

Publication date: 12 July 2023

Abstract

World models power some of the most efficient reinforcement learning algorithms. In this work, we show that they can be harnessed for continual learning, a setting in which the agent faces changing environments. World models typically employ a replay buffer for training, which can be naturally extended to continual learning. We systematically study how different selective experience replay methods affect performance, forgetting, and transfer. We also provide recommendations regarding various modeling options for using world models. We call the best set of choices Continual-Dreamer; it is task-agnostic and utilizes the world model for continual exploration. Continual-Dreamer is sample efficient and outperforms state-of-the-art task-agnostic continual reinforcement learning methods on Minigrid and Minihack benchmarks.

Comment: Accepted at CoLLAs 2023, 21 pages, 15 figures

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2211.15944

Last time updated on 30/12/2022