Diverse Exploration via InfoMax Options

Authors: Kanagawa Yuji, Kaneko Tomoyuki
Publication date: 06/10/2020

In this paper, we study the problem of autonomously discovering temporally abstracted actions, or options, for exploration in reinforcement learning. For learning diverse options suitable for exploration, we introduce the infomax termination objective defined as the mutual information between options and their corresponding state transitions. We derive a scalable optimization scheme for maximizing this objective via the termination condition of options, yielding the InfoMax Option Critic (IMOC) algorithm. Through illustrative experiments, we empirically show that IMOC learns diverse options and utilizes them for exploration. Moreover, we show that IMOC scales well to continuous control tasks.

Comment: Preprint. Under revie
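
To make the quantity behind the infomax termination objective concrete, the following is a minimal sketch of a plug-in mutual information estimate between options and state transitions in a small tabular setting. It is not the IMOC algorithm or the paper's optimization scheme. The function name `mutual_information`, the encoding of a transition as a `(start_state, end_state)` tuple, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(samples):
    """Plug-in estimate of I(O; T) in nats.

    `samples` is a list of (option, transition) pairs, where a transition
    is any hashable value, here a hypothetical (start_state, end_state) tuple.
    """
    n = len(samples)
    joint = Counter(samples)                      # counts of (option, transition)
    opt_counts = Counter(o for o, _ in samples)   # marginal counts of options
    tr_counts = Counter(t for _, t in samples)    # marginal counts of transitions
    mi = 0.0
    for (o, t), c in joint.items():
        # p(o,t) * log(p(o,t) / (p(o) * p(t))), written with raw counts
        mi += (c / n) * math.log(c * n / (opt_counts[o] * tr_counts[t]))
    return mi


# Hypothetical toy data: two options that start in the same state.
# "Diverse" options end in different states; "redundant" options end in the same one.
diverse = [(0, ("s0", "left")), (1, ("s0", "right"))] * 50
redundant = [(0, ("s0", "left")), (1, ("s0", "left"))] * 50

mi_diverse = mutual_information(diverse)      # equals log(2): option fully determines the transition
mi_redundant = mutual_information(redundant)  # equals 0: the transition reveals nothing about the option
```

In this toy case the estimate is highest when each option produces its own distinct transition and falls to zero when the options are indistinguishable by their transitions, which is the intuition behind using such a quantity to encourage diverse options.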

arXiv.org e-Print Archive