Towards Better Interpretability in Deep Q-Networks
Deep reinforcement learning techniques have demonstrated superior performance
in a wide variety of environments. As improvements in training algorithms
continue at a brisk pace, theoretical and empirical studies of what these
networks actually learn lag far behind. In this paper we propose an
interpretable neural network architecture for Q-learning which provides a
global explanation of the model's behavior using key-value memories, attention
and reconstructible embeddings. With a directed exploration strategy, our model
can reach training rewards comparable to the state-of-the-art deep Q-learning
models. However, results suggest that the features extracted by the neural
network are extremely shallow and subsequent testing using out-of-sample
examples shows that the agent can easily overfit to trajectories seen during
training. Comment: Accepted at AAAI-19 (16 pages, 18 figures)
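To make the key-value memory and attention idea in the abstract above concrete, here is a minimal sketch of a generic attention readout over a key-value memory. It is not the paper's architecture: the function names `softmax` and `attend_q_values`, the dot-product scoring, and the assumption that each memory value stores one Q-value per action are all illustrative choices.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_q_values(embedding, keys, values):
    """Return (attention weights, Q-values) for one state embedding.

    Assumption for illustration: keys[i] is a vector with the same length
    as the embedding, and values[i] holds one Q-value per action.
    """
    # Dot-product similarity between the state embedding and each key.
    scores = [sum(e * k for e, k in zip(embedding, key)) for key in keys]
    weights = softmax(scores)
    n_actions = len(values[0])
    # Q-values are the attention-weighted average of the stored values.
    q_values = [
        sum(w * v[a] for w, v in zip(weights, values))
        for a in range(n_actions)
    ]
    return weights, q_values
```

Because the Q-values are a weighted average of stored entries, the attention weights indicate which memory slots drove a given decision, which is the kind of inspection a memory-based explanation relies on.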
SDRL: Interpretable and Data-efficient Deep Reinforcement Learning Leveraging Symbolic Planning
Deep reinforcement learning (DRL) has gained great success by learning
directly from high-dimensional sensory inputs, yet is notorious for the lack of
interpretability. Interpretability of the subtasks is critical in hierarchical
decision-making as it increases the transparency of black-box-style DRL
approaches and helps RL practitioners to understand the high-level behavior
of the system better. In this paper, we introduce symbolic planning into DRL
and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can
handle both high-dimensional sensory inputs and symbolic planning. The
task-level interpretability is enabled by relating symbolic actions to
options. This framework features a planner -- controller -- meta-controller
architecture, which takes charge of subtask scheduling, data-driven subtask
learning, and subtask evaluation, respectively. The three components
cross-fertilize each other and eventually converge to an optimal symbolic plan
along with the learned subtasks, bringing together the advantages of long-term
planning capability with symbolic knowledge and end-to-end reinforcement
learning directly from a high-dimensional sensory input. Experimental results
validate the interpretability of subtasks, along with improved data efficiency
compared with state-of-the-art approaches.
Neurogenetic Programming Framework for Explainable Reinforcement Learning
Automatic programming, the task of generating computer programs compliant
with a specification without a human developer, is usually tackled either via
genetic programming methods based on mutation and recombination of programs, or
via neural language models. We propose a novel method that combines both
approaches using the concept of a virtual neuro-genetic programmer (using
evolutionary methods as an alternative to gradient descent for neural network
training), or scrum team. We demonstrate its ability to provide performant and
explainable solutions for various OpenAI Gym tasks, as well as inject expert
knowledge into the otherwise data-driven search for solutions. Comment: Source code is available at https://github.com/vadim0x60/cib
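The "mutation and recombination of programs" mentioned in the abstract above can be sketched as follows. This is a generic illustration of the two genetic operators on programs represented as token lists, not the framework's actual code: the `TOKENS` instruction set and the function names `mutate` and `recombine` are hypothetical.

```python
import random

# Hypothetical instruction set, used only for illustration.
TOKENS = ["+", "-", "<", ">", "[", "]", "."]

def mutate(program, rng, rate=0.1):
    """Replace each token with a random one with probability `rate`."""
    return [
        rng.choice(TOKENS) if rng.random() < rate else tok
        for tok in program
    ]

def recombine(parent_a, parent_b, rng):
    """One-point crossover: prefix of parent_a joined to suffix of parent_b."""
    cut = rng.randint(0, min(len(parent_a), len(parent_b)))
    return parent_a[:cut] + parent_b[cut:]
```

A search loop would typically apply these operators to the best-scoring programs of each generation; passing an explicit `random.Random` instance keeps runs reproducible.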
BF++: a language for general-purpose program synthesis
Most state-of-the-art decision systems based on Reinforcement Learning (RL)
are data-driven black-box neural models, where it is often difficult to
incorporate expert knowledge into the models or let experts review and validate
the learned decision mechanisms. Knowledge-insertion and model review are
important requirements in many applications involving human health and safety.
One way to bridge the gap between data-driven and knowledge-driven systems is program
synthesis: replacing a neural network that outputs decisions with a symbolic
program generated by a neural network or by means of genetic programming. We
propose a new programming language, BF++, designed specifically for automatic
programming of agents in a Partially Observable Markov Decision Process (POMDP)
setting and apply neural program synthesis to solve standard OpenAI Gym
benchmarks. Comment: 8+2 pages (paper+references)