Improved Sample Complexity for Incremental Autonomous Exploration in MDPs
We investigate the exploration of an unknown environment when no reward
function is provided. Building on the incremental exploration setting
introduced by Lim and Auer [1], we define the objective of learning the set of
ε-optimal goal-conditioned policies attaining all states that are
incrementally reachable within L steps (in expectation) from a reference
state s_0. In this paper, we introduce a novel model-based approach that
interleaves discovering new states from s_0 and improving the accuracy of a
model estimate that is used to compute goal-conditioned policies to reach newly
discovered states. The resulting algorithm, DisCo, achieves a sample complexity
scaling as Õ(L^5 S_{L+ε} Γ_{L+ε} A ε^{-2}),
where A is the number of actions, S_{L+ε} is the number of states
that are incrementally reachable from s_0 in L + ε steps, and Γ_{L+ε}
is the branching factor of the dynamics over such states.
This improves over the algorithm proposed in [1] in both ε and L at
the cost of an extra Γ_{L+ε} factor, which is small in most
environments of interest. Furthermore, DisCo is the first algorithm that can
return an ε/c_min-optimal policy for any cost-sensitive
shortest-path problem defined on the L-reachable states with minimum cost
c_min. Finally, we report preliminary empirical results confirming our
theoretical findings.
Comment: NeurIPS 2020
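The interleaving the abstract describes (discover new states from the reference state, refine the model, recompute goal-conditioned policies) can be sketched in a toy deterministic setting. Everything below is an illustrative assumption, not the paper's actual DisCo algorithm: the `Chain` environment, all function names, and the self-loop treatment of unsampled transitions are invented for this sketch, and the real algorithm handles stochastic dynamics with confidence intervals and the ε slack in the reachability definition.

```python
import numpy as np

# Hypothetical toy environment: a deterministic chain of n states where
# action 1 moves right and action 0 moves left.
class Chain:
    def __init__(self, n):
        self.n = n

    def step(self, s, a):
        return min(s + 1, self.n - 1) if a == 1 else max(s - 1, 0)


def fill_unknown(P, n):
    # Treat transitions we have never sampled as self-loops: a pessimistic
    # stand-in for "this state is not yet controllable".
    return [[P[s][a] if P[s][a] is not None else s for a in range(2)]
            for s in range(n)]


def steps_to_goal(P, goal, horizon):
    # Finite-horizon value iteration minimizing steps to `goal` under the
    # (deterministic, estimated) transition table P.
    n = len(P)
    V = np.full(n, np.inf)
    V[goal] = 0.0
    for _ in range(horizon):
        for s in range(n):
            if s != goal:
                V[s] = min(1.0 + V[P[s][a]] for a in range(2))
    return V


def disco_sketch(env, L, s0=0):
    """Minimal sketch of the interleaving in the abstract: alternate
    (i) refining a transition model around known states and (ii) computing
    goal-conditioned policies to certify candidate states reachable within
    L steps from s0."""
    known = {s0}
    P = [[None, None] for _ in range(env.n)]
    changed = True
    while changed:
        changed = False
        # (i) model refinement: sample every action from each known state
        # (one sample suffices only because this toy env is deterministic).
        for s in list(known):
            for a in range(2):
                P[s][a] = env.step(s, a)
        # (ii) discovery: try to certify each frontier state as L-reachable.
        frontier = {P[s][a] for s in known for a in range(2)} - known
        for g in frontier:
            for a in range(2):
                P[g][a] = env.step(g, a)  # sample the candidate's dynamics
            V = steps_to_goal(fill_unknown(P, env.n), g, horizon=L)
            if V[s0] <= L:
                known.add(g)
                changed = True
    return known
```

On a 6-state chain with L = 3, the loop certifies exactly the states within 3 steps of s_0 = 0, i.e. {0, 1, 2, 3}; the sample cost of step (ii), repeated under stochastic dynamics, is what the Õ(·) bound above controls.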