Because reinforcement learning (RL) algorithms learn by trial and error, they
are typically challenging to apply to safety-critical real-world applications
such as autonomous driving, human-robot interaction, and robot manipulation,
where such errors are not tolerable. Safe RL (i.e., constrained RL), in which
agents explore the environment while satisfying constraints, has recently
emerged rapidly in the literature. Due to the diversity of algorithms and tasks, it remains difficult
to compare existing safe RL algorithms. To fill that gap, we introduce GUARD, a
Generalized Unified SAfe Reinforcement Learning Development Benchmark. GUARD
has several advantages compared to existing benchmarks. First, GUARD is a
generalized benchmark with a wide variety of RL agents, tasks, and safety
constraint specifications. Second, GUARD comprehensively covers
state-of-the-art safe RL algorithms with self-contained implementations. Third,
GUARD is highly customizable in tasks and algorithms. We present a comparison
of state-of-the-art safe RL algorithms in various task settings using GUARD and
establish baselines that future work can build on.