DeepSynth: Automata synthesis for automatic task segmentation in deep reinforcement learning

Abate, A; Hasanbeig, M; Kroening, D; Melham, TF; Yogananda Jeppu, N

Authors: A Abate, M Hasanbeig, D Kroening, TF Melham, N Yogananda Jeppu
Publication date: 1 January 2021
Publisher: Association for the Advancement of Artificial Intelligence (AAAI)

Abstract

This paper proposes DeepSynth, a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian, but at the same time progress towards the reward requires achieving an unknown sequence of high-level objectives. Our method employs a novel algorithm for synthesis of compact automata to uncover this sequential structure automatically. We synthesise a human-interpretable automaton from trace data collected by exploring the environment. The state space of the environment is then enriched with the synthesised automaton so that the generation of a control policy by deep RL is guided by the discovered structure encoded in the automaton. The proposed approach is able to cope with both high-dimensional, low-level features and unknown sparse non-Markovian rewards. We have evaluated DeepSynth’s performance in a set of experiments that includes the Atari game Montezuma’s Revenge. Compared to existing approaches, we obtain a reduction of two orders of magnitude in the number of iterations required for policy synthesis, and also a significant improvement in scalability.
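
The idea of enriching the environment state with an automaton state can be illustrated with a minimal Python sketch. This is not the DeepSynth synthesis algorithm or the authors' code: the class `LabelAutomaton`, the function `product_states`, the labels `key`, `door` and `empty`, and the toy transition table are all hypothetical names chosen for illustration. The sketch assumes an automaton has already been obtained and only shows how pairing each low-level state with the current automaton state separates situations that look identical at the low level.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LabelAutomaton:
    """Hypothetical deterministic automaton over high-level labels."""
    initial: int
    transitions: Dict[Tuple[int, str], int] = field(default_factory=dict)

    def step(self, state: int, label: str) -> int:
        # Unknown (state, label) pairs leave the automaton state unchanged.
        return self.transitions.get((state, label), state)


def product_states(automaton: LabelAutomaton,
                   trace: List[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Pair each low-level state in a trace with the automaton state reached so far."""
    q = automaton.initial
    enriched = []
    for env_state, label in trace:
        q = automaton.step(q, label)
        enriched.append((env_state, q))
    return enriched


# Toy automaton (an assumption): the key must be observed before the door
# counts as progress.
toy = LabelAutomaton(initial=0, transitions={(0, "key"): 1, (1, "door"): 2})

# Each trace entry is (low-level state, high-level label observed there).
trace = [("s0", "empty"), ("s1", "door"), ("s2", "key"), ("s1", "door")]
enriched = product_states(toy, trace)
print(enriched)
```

In this toy trace the low-level state `s1` is visited twice, once before and once after the key is observed. The enriched states differ in their automaton component, so a policy defined over the enriched states can tell the two visits apart even though the low-level observation is the same.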

Oxford University Research Archive

oai:ora.ox.ac.uk:uuid:1a0b6c36...

Last time updated on 14/03/2021