Sensing and communication technologies have enhanced learning-based decision
making methodologies for multi-agent systems such as connected autonomous
vehicles (CAVs). However, most existing safe reinforcement learning-based
methods assume accurate state information. Satisfying safety requirements
under state uncertainty remains challenging for CAVs, given noisy sensor
measurements and vulnerable communication channels. In this
work, we propose Robust Multi-Agent Proximal Policy Optimization with a robust
Safety Shield (SR-MAPPO) for CAVs in various driving scenarios. Our approach
combines a robust multi-agent reinforcement learning (MARL) algorithm with a
control barrier function (CBF)-based safety shield to cope with perturbed or
uncertain state inputs. The robust policy is trained with a worst-case
Q-function regularization module that pursues a higher lower bound on reward,
while the robust CBF safety shield enforces the CAVs' collision-free
constraints in complicated driving scenarios, even when vehicle state
information is perturbed. We validate the advantages of SR-MAPPO in robustness and safety
and compare it with baselines under different driving and state perturbation
scenarios in the CARLA simulator. The SR-MAPPO policy is shown to maintain
higher safety rates and efficiency (reward) when threatened by both state
perturbations and unconnected vehicles' dangerous behaviors.

Comment: 6 pages, 5 figures
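
To make the idea of a robust CBF safety shield concrete, the following is a minimal sketch for a one-dimensional car-following case. It is an illustration under stated assumptions, not the shield used in SR-MAPPO: the barrier h = gap - d_min - tau * v_ego, the bounded-error model (eps_d, eps_v), and all names and default values (ShieldParams, robust_accel_bound, shield) are hypothetical. The shield evaluates the CBF condition dh/dt + alpha * h >= 0 at the worst-case state consistent with the measurement and caps the acceleration proposed by the learned policy accordingly.

```python
from dataclasses import dataclass


@dataclass
class ShieldParams:
    # All values are illustrative assumptions, not taken from the paper.
    d_min: float = 5.0     # minimum standstill gap [m]
    tau: float = 1.0       # desired time headway [s]
    alpha: float = 1.0     # CBF class-K gain [1/s]
    eps_d: float = 0.5     # bound on gap measurement error [m]
    eps_v: float = 0.3     # bound on speed measurement error [m/s]
    a_brake: float = -6.0  # strongest available braking [m/s^2]
    a_max: float = 3.0     # strongest available acceleration [m/s^2]


def robust_accel_bound(gap, v_ego, v_lead, p):
    """Largest acceleration satisfying the CBF condition for every state
    within the assumed error bounds of the measured state.

    With h = gap - d_min - tau * v_ego and dynamics gap' = v_lead - v_ego,
    v_ego' = a, the condition h' + alpha * h >= 0 reads
        a <= (v_lead - v_ego + alpha * h) / tau.
    The right-hand side increases with gap and v_lead and decreases with
    v_ego, so the worst case is a smaller gap, slower lead, faster ego.
    """
    gap_wc = gap - p.eps_d
    v_ego_wc = v_ego + p.eps_v
    v_lead_wc = v_lead - p.eps_v
    h_wc = gap_wc - p.d_min - p.tau * v_ego_wc
    return (v_lead_wc - v_ego_wc + p.alpha * h_wc) / p.tau


def shield(a_rl, gap, v_ego, v_lead, p=ShieldParams()):
    """Cap the policy's acceleration a_rl by the robust CBF bound, then
    clip to actuator limits. If the bound is below a_brake the constraint
    cannot be met and the shield falls back to full braking."""
    a_safe = min(a_rl, robust_accel_bound(gap, v_ego, v_lead, p))
    return max(p.a_brake, min(p.a_max, a_safe))
```

With a large measured gap the policy's action passes through unchanged; as the worst-case barrier value approaches zero the shield overrides it with braking. For this simple scalar case the bound has a closed form; shields with several constraints or control inputs are commonly posed as a small quadratic program instead.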