SafeLife 1.0: Exploring Side Effects in Complex Environments
We present SafeLife, a publicly available reinforcement learning environment
that tests the safety of reinforcement learning agents. It contains complex,
dynamic, tunable, procedurally generated levels with many opportunities for
unsafe behavior. Agents are graded both on their ability to maximize their
explicit reward and on their ability to operate safely without unnecessary side
effects. We train agents to maximize rewards using proximal policy optimization
and score them on a suite of benchmark levels. The resulting agents are
performant but not safe -- they tend to cause large side effects in their
environments -- but they form a baseline against which future safety research
can be measured.
Comment: An updated version was presented at the AAAI SafeAI 2020 Workshop, now with updated contact info. Previously presented at the 2019 NeurIPS Safety and Robustness in Decision Making Workshop.
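The benchmark protocol this abstract describes, train with PPO and then score agents on both reward and side effects, reduces to a standard rollout loop over benchmark levels. Below is a minimal sketch assuming a classic Gym-style environment; the evaluate helper and the side_effect_score info key are illustrative assumptions, not the SafeLife package's documented API.

```python
import numpy as np

def evaluate(env, policy, n_episodes=10):
    """Roll `policy` out on `env`; return mean reward and mean side-effect score."""
    rewards, side_effects = [], []
    for _ in range(n_episodes):
        obs = env.reset()
        done, total_reward, info = False, 0.0, {}
        while not done:
            action = policy(obs)                       # sampled or greedy action
            obs, reward, done, info = env.step(action)
            total_reward += reward
        rewards.append(total_reward)
        # Hypothetical info key: how much the agent disrupted the board
        # beyond what the task required.
        side_effects.append(info.get("side_effect_score", np.nan))
    return float(np.mean(rewards)), float(np.nanmean(side_effects))
```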
Safety Aware Reinforcement Learning (SARL)
As reinforcement learning agents become increasingly integrated into complex,
real-world environments, designing for safety becomes a critical consideration.
We focus on scenarios where agents can cause undesired side effects while
executing a policy on a primary task. Since multiple tasks can be defined for a
given environment's dynamics, there are two important challenges. First, we
need to abstract a concept of safety that applies broadly to that environment,
independent of the specific task being executed. Second, we need a mechanism
for the abstracted notion of safety to modulate the actions of agents executing
different policies so as to minimize their side effects. In this work, we
propose Safety Aware Reinforcement Learning (SARL), a
framework where a virtual safe agent modulates the actions of a main
reward-based agent to minimize side effects. The safe agent learns a
task-independent notion of safety for a given environment. The main agent is
then trained with a regularization loss given by the distance between the
native action probabilities of the two agents. Since the safe agent effectively
abstracts a task-independent notion of safety via its action probabilities, it
can be ported to modulate multiple policies solving different tasks within the
given environment without further training. We contrast this with solutions
that rely on task-specific regularization metrics and test our framework on the
SafeLife Suite, based on Conway's Game of Life, comprising a number of complex
tasks in dynamic environments. We show that our solution matches the
performance of solutions that rely on task-specific side-effect penalties on
both the primary and safety objectives, while additionally offering
generalizability and portability.
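The training signal described above, a task loss regularized by the distance between the two agents' action probabilities, can be written compactly. Below is a minimal PyTorch sketch; the abstract names neither the distance metric nor the weighting, so the KL divergence and the beta coefficient here are assumptions.

```python
import torch
import torch.nn.functional as F

def sarl_loss(task_loss, main_logits, safe_logits, beta=0.1):
    """Combine the primary-task loss with a distributional distance penalty.

    task_loss:   scalar loss from the main agent's RL objective (e.g. PPO).
    main_logits: [batch, n_actions] action logits from the main agent.
    safe_logits: [batch, n_actions] action logits from the safe agent.
    """
    main_log_probs = F.log_softmax(main_logits, dim=-1)
    safe_probs = F.softmax(safe_logits.detach(), dim=-1)  # no gradient to safe agent
    # KL(safe || main): push the main agent's action distribution toward the
    # safe agent's task-independent notion of safe behavior.
    reg = F.kl_div(main_log_probs, safe_probs, reduction="batchmean")
    return task_loss + beta * reg
```

Because the safe agent enters only through its action probabilities, the same frozen safe agent can regularize any number of task policies in the environment, which is the portability the abstract claims.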
Avoiding Side Effects in Complex Environments
Reward function specification can be difficult, even in simple environments.
Realistic environments contain millions of states. Rewarding the agent for
making a widget may be easy, but penalizing the multitude of possible negative
side effects is hard. In toy environments, Attainable Utility Preservation
(AUP) avoids side effects by penalizing shifts in the ability to achieve
randomly generated goals. We scale this approach to large, randomly generated
environments based on Conway's Game of Life. By preserving optimal value for a
single randomly generated reward function, AUP incurs modest overhead,
completes the specified task, and avoids side effects.
Comment: 16 pages with appendices.
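The scaled AUP penalty described above compares the attainable auxiliary value after the chosen action against a no-op baseline, for a single randomly generated auxiliary reward function. The sketch below is one plausible reading; the q_aux function, the no-op baseline, and the normalization are assumptions, not verbatim from the paper.

```python
def aup_reward(reward, q_aux, state, action, noop, lam=0.01, eps=1e-8):
    """Penalize the change in auxiliary attainable value caused by `action`.

    reward: task reward R(s, a).
    q_aux:  action-value function for the single auxiliary reward function.
    noop:   the environment's no-op action, used as the baseline.
    """
    penalty = abs(q_aux(state, action) - q_aux(state, noop))
    scale = max(abs(q_aux(state, noop)), eps)  # normalize so lam is unitless
    return reward - lam * penalty / scale
```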