Targeted Adversarial Attacks on Deep Reinforcement Learning Policies via Model Checking
Deep Reinforcement Learning (RL) agents are susceptible to adversarial noise
in their observations that can mislead their policies and decrease their
performance. However, an adversary may be interested not only in decreasing the
reward, but also in altering whether the policy satisfies specific temporal
logic properties.
This paper presents a metric that measures the exact impact of adversarial
attacks against such properties. We use this metric to craft optimal
adversarial attacks. Furthermore, we introduce a model checking method that
allows us to verify the robustness of RL policies against adversarial attacks.
Our empirical analysis confirms (1) that our metric is effective for crafting
adversarial attacks against temporal logic properties, and (2) that we can
concisely assess a system's robustness against such attacks.

Comment: ICAART 2023 Paper (Technical Report)
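The abstract does not detail how the attacks are computed; the paper's actual contribution relies on model checking. Purely to illustrate the underlying attack model it assumes (bounded perturbations of an agent's observations that change the policy's decisions), here is a minimal, self-contained Python sketch. Everything in it is a hypothetical stand-in: the linear policy, the random-search attack routine, and the budget eps are illustrative choices, not the method from the paper.

import numpy as np

# Hypothetical toy setup: a deterministic policy that maps a
# 4-dimensional observation to one of 2 actions via fixed weights.
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4))

def policy(obs):
    """Greedy action of the toy linear policy."""
    return int(np.argmax(W @ obs))

def attack(obs, eps, trials=1000):
    """Random search for an L-infinity bounded perturbation that
    flips the policy's action at this observation, if one exists."""
    clean_action = policy(obs)
    for _ in range(trials):
        delta = rng.uniform(-eps, eps, size=obs.shape)
        if policy(obs + delta) != clean_action:
            return delta
    return None  # no action-flipping perturbation found within budget

obs = rng.standard_normal(4)
delta = attack(obs, eps=0.5)
if delta is not None:
    print("action flipped:", policy(obs), "->", policy(obs + delta))
else:
    print("no flip found within eps=0.5 at this observation")

In the setting the abstract describes, a flipped action matters insofar as it changes whether a temporal logic property (e.g., a reach-avoid specification) holds along the induced trajectories; quantifying that exact impact via model checking, rather than by the random search sketched here, is what the paper's metric provides.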