    Robust multi-agent Q-learning in cooperative games with adversaries

    We present RoM-Q, a new Q-learning-like algorithm for finding policies robust to attacks in multi-agent systems (MAS). We consider a novel type of attack, in which a team of adversaries, aware of the optimal multi-agent Q-value function, performs a worst-case selection of both the agents to attack and the actions those agents perform. Our motivation lies in real-world MAS, where vulnerabilities of particular agents emerge from their characteristics and robust policies must be learned without requiring the simulation of attacks during training. In our simulations, where we train policies using RoM-Q, Q-learning and minimax-Q and derive corresponding adversarial attacks, we observe that policies learned using RoM-Q are the most robust, as they accrue the highest rewards under all considered adversarial attacks.
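
    Based only on the description above, the following is a minimal, hypothetical Python sketch of what a RoM-Q-style tabular update could look like: the bootstrap target is made robust by letting a team of adversaries, with full knowledge of Q, pick a worst-case subset of at most k agents and overwrite those agents' actions before the next-state value is evaluated. All names, signatures, and the exact form of the robust operator here are assumptions, not the paper's definitions.

    from itertools import combinations, product

    def rom_q_update(Q, s, joint_a, r, s_next, actions, n_agents, k,
                     alpha=0.1, gamma=0.99):
        # Hypothetical sketch, not the paper's formulation. Q is a dict
        # mapping (state, joint_action_tuple) -> value; missing entries
        # default to 0. `actions` is the per-agent action set, assumed
        # identical for every agent in this sketch.

        def robust_value(state):
            joint_actions = list(product(actions, repeat=n_agents))
            # The cooperative team first picks its greedy joint action ...
            best = max(joint_actions, key=lambda a: Q.get((state, a), 0.0))
            # ... then the adversaries, knowing Q, choose up to k agents to
            # attack and overwrite their actions to minimise the team's value.
            worst = Q.get((state, best), 0.0)
            for m in range(1, k + 1):
                for victims in combinations(range(n_agents), m):
                    for attack in product(actions, repeat=m):
                        perturbed = list(best)
                        for idx, agent in enumerate(victims):
                            perturbed[agent] = attack[idx]
                        worst = min(worst, Q.get((state, tuple(perturbed)), 0.0))
            return worst

        # Worst-case bootstrap target, followed by a standard Q-learning step.
        target = r + gamma * robust_value(s_next)
        key = (s, tuple(joint_a))
        Q[key] = Q.get(key, 0.0) + alpha * (target - Q.get(key, 0.0))

    For example, with two agents, binary actions and at most one attacked agent:

    Q = {}
    rom_q_update(Q, s=0, joint_a=(0, 1), r=1.0, s_next=1,
                 actions=[0, 1], n_agents=2, k=1)

    With k = 0 the inner minimisation disappears and the update reduces to ordinary joint-action Q-learning, which matches the abstract's framing of RoM-Q as a Q-learning-like algorithm hardened against worst-case attacks.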