1 research outputs found
Automaton Distillation: Neuro-Symbolic Transfer Learning for Deep Reinforcement Learning
Reinforcement learning (RL) is a powerful tool for finding optimal policies
in sequential decision processes. However, deep RL methods suffer from two
weaknesses: collecting the amount of agent experience required for practical RL
problems is prohibitively expensive, and the learned policies exhibit poor
generalization on tasks outside of the training distribution. To mitigate these
issues, we introduce automaton distillation, a form of neuro-symbolic transfer
learning in which Q-value estimates from a teacher are distilled into a
low-dimensional representation in the form of an automaton. We then propose two
methods for generating Q-value estimates: static transfer, which reasons over
an abstract Markov Decision Process constructed based on prior knowledge, and
dynamic transfer, where symbolic information is extracted from a teacher Deep
Q-Network (DQN). The resulting Q-value estimates from either method are used to
bootstrap learning in the target environment via a modified DQN loss function.
We list several failure modes of existing automaton-based transfer methods and
demonstrate that both static and dynamic automaton distillation decrease the
time required to find optimal policies for various decision tasks