Iterative Policy-Space Expansion in Reinforcement Learning
Humans and animals solve a difficult problem much more easily when they are
presented with a sequence of problems that starts simple and slowly increases
in difficulty. We explore this idea in the context of reinforcement learning.
Rather than providing the agent with an externally provided curriculum of
progressively more difficult tasks, the agent solves a single task utilizing a
decreasingly constrained policy space. The algorithm we propose first learns to
categorize features into positive and negative before gradually learning a more
refined policy. Experimental results in Tetris demonstrate the superior learning rate of our approach compared with existing algorithms.
Comment: Workshop on Biological and Artificial Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
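The idea of a decreasingly constrained policy space can be sketched as a two-stage search over linear policies. Everything below is a hypothetical illustration: the feature names stand in for Tetris features, and `evaluate` stands in for estimating policy performance from game rollouts; it is not the paper's algorithm.

```python
# Sketch: a two-stage, decreasingly constrained linear policy space.
# Stage 1 searches only over feature signs (positive vs. negative);
# stage 2 refines real-valued magnitudes starting from those signs.
import itertools
import random

FEATURES = ["holes", "max_height", "lines_cleared"]  # hypothetical features
TARGET = (-2.0, -1.0, 3.0)  # toy "true" weights the evaluation rewards

def evaluate(weights):
    # Stand-in for estimating policy performance via game rollouts.
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

# Stage 1: coarse policy space -- each weight constrained to +1 or -1.
signs = max(itertools.product([-1.0, 1.0], repeat=len(FEATURES)), key=evaluate)

# Stage 2: refined policy space -- local random search around the signs.
random.seed(0)
best, best_val = list(signs), evaluate(signs)
for _ in range(500):
    cand = [w + random.gauss(0.0, 0.2) for w in best]
    val = evaluate(cand)
    if val > best_val:
        best, best_val = cand, val
```

Constraining stage 1 to signs shrinks the search space to 2^n policies; stage 2 then only has to tune magnitudes from a sensible starting point.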
Using Relative Novelty to Identify Useful Temporal Abstractions in Reinforcement Learning
We present a new method for automatically creating useful temporal abstractions in reinforcement learning. We argue that states that allow the agent to transition to a different region of the state space are useful subgoals, and propose a method for identifying them using the concept of relative novelty. When such a state is identified, a temporally extended activity (e.g., an option) is generated that takes the agent efficiently to this state. We illustrate the utility of the method in a number of tasks.
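One plausible formalization of the idea can be sketched as follows, assuming a visit-count novelty measure and a small comparison window; neither is the paper's exact definition.

```python
# Sketch: scoring trajectory states by relative novelty -- how novel the
# states just after time t are relative to those just before it.
import math
from collections import Counter

def relative_novelty(trajectory, t, window=2):
    counts = Counter(trajectory[: t + 1])          # visits observed so far
    novelty = lambda s: 1.0 / math.sqrt(counts[s]) if counts[s] else 1.0
    before = trajectory[max(0, t - window): t]
    after = trajectory[t + 1: t + 1 + window]
    if not before or not after:
        return 0.0
    mean = lambda xs: sum(map(novelty, xs)) / len(xs)
    return mean(after) / mean(before)

# An agent loops inside one room (states 0, 1), crosses a doorway (state 3),
# and enters a new room (states 4, 5). The doorway step scores higher than
# a step taken while looping, flagging it as a candidate subgoal.
walk = [0, 1, 0, 1, 0, 1, 3, 4, 5, 4, 5]
doorway, looping = relative_novelty(walk, 6), relative_novelty(walk, 2)
```

States that lead into unfamiliar regions score above 1, which is the signal used to propose them as option targets.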
Betweenness Centrality as a Basis for Forming Skills
We show that betweenness centrality, a graph-theoretic measure widely used in social network analysis, provides a sound basis for autonomously forming useful high-level behaviors, or skills, from available primitives, the smallest behavioral units available to an autonomous agent.
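A minimal pure-Python sketch on a hypothetical two-room graph: two cliques of states joined by a single doorway state. The doorway lies on every shortest path between the rooms, so it receives the highest betweenness and is the natural skill target.

```python
# Sketch: scoring states of a transition graph by betweenness centrality.
from collections import deque
from itertools import combinations

# Two 3-state "rooms" (cliques {0,1,2} and {4,5,6}) joined by doorway 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4],
       4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}

def shortest_paths(adj, s, t):
    """All shortest simple paths from s to t, by breadth-first search."""
    paths, best = [], None
    queue = deque([[s]])
    while queue:
        path = queue.popleft()
        if best is not None and len(path) > best:
            break                       # longer than the shortest: done
        if path[-1] == t:
            best = len(path)
            paths.append(path)
            continue
        for nxt in adj[path[-1]]:
            if nxt not in path:
                queue.append(path + [nxt])
    return paths

def betweenness(adj):
    """Fraction of pairwise shortest paths passing through each state."""
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        paths = shortest_paths(adj, s, t)
        for p in paths:
            for v in p[1:-1]:           # interior states only
                score[v] += 1.0 / len(paths)
    return score

scores = betweenness(adj)
subgoal = max(scores, key=scores.get)   # the doorway, state 3
```

A skill would then be formed whose policy drives the agent to `subgoal` from anywhere in its room.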
Creating Multi-Level Skill Hierarchies in Reinforcement Learning
What is a useful skill hierarchy for an autonomous agent? We propose an answer based on the graphical structure of an agent's interaction with its environment. Our approach uses hierarchical graph partitioning to expose the structure of the graph at varying timescales, producing a skill hierarchy with multiple levels of abstraction. At each level of the hierarchy, skills move the agent between regions of the state space that are well connected within themselves but weakly connected to each other. We illustrate the utility of the proposed skill hierarchy in a wide variety of domains in the context of reinforcement learning.
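The resulting hierarchy can be sketched as follows, assuming the nested partitions are already given (here hand-made for illustration; the paper obtains them with hierarchical graph partitioning) and defining one skill per ordered pair of adjacent regions at each level.

```python
# Sketch: deriving a multi-level skill set from nested state partitions.
# Partitions and doorway edges below are a hypothetical eight-state example.
levels = [
    [{0, 1, 2, 3}, {4, 5, 6, 7}],           # coarse: two regions
    [{0, 1}, {2, 3}, {4, 5}, {6, 7}],       # fine: four regions
]
edges = {(1, 2), (3, 4), (5, 6)}            # transitions between states

def skills_at_level(regions, edges):
    """One skill per ordered pair of adjacent regions at this level."""
    skills = set()
    for (u, v) in edges:
        src = next(i for i, r in enumerate(regions) if u in r)
        dst = next(i for i, r in enumerate(regions) if v in r)
        if src != dst:                      # edge crosses a region boundary
            skills.add((src, dst))
            skills.add((dst, src))
    return skills

hierarchy = [skills_at_level(regions, edges) for regions in levels]
```

Coarse levels yield a few long-timescale skills; fine levels yield more, shorter ones, matching the multiple timescales described above.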
Explaining Reinforcement Learning with Shapley Values
For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
Comment: 12 pages, 9 figures. Accepted at ICML 2023
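The underlying computation can be sketched with an exact Shapley calculation over feature coalitions. The characteristic function `v` below is a toy stand-in for the performance quantity SVERL explains, not the paper's definition.

```python
# Sketch: exact Shapley values for the contribution of state features
# to a scalar outcome, by enumerating all coalitions.
from itertools import combinations
from math import factorial

def shapley(features, value):
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                # Weight of a size-k coalition in the Shapley formula.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                s = set(coalition)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

# Toy game: feature 'a' contributes 2 alone, 'b' contributes 1, and
# together they add a synergy of 1, split evenly between them.
def v(S):
    score = 0.0
    if 'a' in S: score += 2.0
    if 'b' in S: score += 1.0
    if {'a', 'b'} <= S: score += 1.0
    return score

print(shapley(['a', 'b'], v))  # {'a': 2.5, 'b': 1.5}
```

Exact enumeration is exponential in the number of features; practical uses approximate the sum by sampling coalitions.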