Lower Bounds on Implementing Robust and Resilient Mediators

Author: Abraham Ittai
Dolev Danny
Halpern Joseph Y.
Publication date: 01/01/2007

We consider games that have (k,t)-robust equilibria when played with a mediator, where an equilibrium is (k,t)-robust if it tolerates deviations by coalitions of size up to k and deviations by up to t players with unknown utilities. We prove lower bounds that match upper bounds on the ability to implement such mediators using cheap talk (that is, just allowing communication among the players). The bounds depend on (a) the relationship between k, t, and n, the total number of players in the system; (b) whether players know the exact utilities of other players; (c) whether there are broadcast channels or just point-to-point channels; (d) whether cryptography is available; and (e) whether the game has a (k+t)-punishment strategy; that is, a strategy that, if used by all but at most k+t players, guarantees that every player gets a worse outcome than they do with the equilibrium strategy.

arXiv.org e-Print Archive

CiteSeerX
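
To make the punishment-strategy condition in the abstract above concrete, here is a minimal sketch that checks it by brute force on a tiny game. It follows only the wording of the abstract, is restricted to pure strategies, and is not taken from the paper. The function name `is_punishment_strategy`, its parameters, and the three-player toy game are all hypothetical illustrations.

```python
from itertools import combinations, product


def is_punishment_strategy(payoff, actions, punish, eq_payoffs, m):
    """Check the abstract's condition for an m-punishment profile (pure strategies).

    payoff(profile) -> tuple with one payoff per player
    actions[i]      -> list of actions available to player i
    punish[i]       -> player i's action in the candidate punishment profile
    eq_payoffs[i]   -> player i's payoff under the equilibrium strategy
    m               -> maximum number of players who may not follow `punish`
                       (m = k + t in the abstract's notation)
    """
    n = len(actions)
    for size in range(m + 1):
        for deviators in combinations(range(n), size):
            # Enumerate every joint choice of the players who do not punish.
            for choice in product(*(actions[i] for i in deviators)):
                profile = list(punish)
                for i, a in zip(deviators, choice):
                    profile[i] = a
                outcome = payoff(tuple(profile))
                # Every player must do strictly worse than in equilibrium.
                if any(outcome[i] >= eq_payoffs[i] for i in range(n)):
                    return False
    return True


# Hypothetical toy game: 3 players, each picks "C" or "D"; everyone's payoff
# is the number of players who chose "C". All-"C" gives each player 3.
def toy_payoff(profile):
    c = sum(1 for a in profile if a == "C")
    return (c, c, c)


toy_actions = [["C", "D"]] * 3
toy_punish = ("D", "D", "D")
toy_eq = (3, 3, 3)

ok_for_two = is_punishment_strategy(toy_payoff, toy_actions, toy_punish, toy_eq, 2)
ok_for_three = is_punishment_strategy(toy_payoff, toy_actions, toy_punish, toy_eq, 3)
```

In this toy game, all-"D" satisfies the condition when at most two players deviate, but not when all three may deviate, since they can then jointly restore the equilibrium payoff.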

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game

Author: Guo Jun
Li Simin
Liu Aishan
Liu Xianglong
Wang Jiakai
Xiu Jingqiao
Xu Ruixiao
Yang Yaodong
Yu Xin
Publication date: 15/10/2023

In this study, we explore the robustness of cooperative multi-agent reinforcement learning (c-MARL) against Byzantine failures, where any agent can enact arbitrary, worst-case actions due to malfunction or adversarial attack. To address the uncertainty that any agent can be adversarial, we propose a Bayesian Adversarial Robust Dec-POMDP (BARDec-POMDP) framework, which views Byzantine adversaries as nature-dictated types, represented by a separate transition. This allows agents to learn policies grounded on their posterior beliefs about the type of other agents, fostering collaboration with identified allies and minimizing vulnerability to adversarial manipulation. We define the optimal solution to the BARDec-POMDP as an ex post robust Bayesian Markov perfect equilibrium, which we prove exists and weakly dominates the equilibrium of previous robust MARL approaches. To realize this equilibrium, we put forward a two-timescale actor-critic algorithm with almost sure convergence under specific conditions. Experiments on matrix games, level-based foraging and StarCraft II indicate that, even under worst-case perturbations, our method successfully acquires intricate micromanagement skills and adaptively aligns with allies, demonstrating resilience against non-oblivious adversaries, random allies, observation-based attacks, and transfer-based attacks.

arXiv.org e-Print Archive
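
The abstract above refers to agents acting on posterior beliefs about the type of other agents. As a rough illustration of what such a belief update can look like, the sketch below applies plain Bayes' rule to a two-type case (adversary or ally). It is not the BARDec-POMDP algorithm from the paper. The function name `update_belief` and the likelihood values are hypothetical.

```python
def update_belief(prior, lik_adversary, lik_ally):
    """Posterior probability that another agent is adversarial.

    prior         -> belief, before the observation, that the agent is adversarial
    lik_adversary -> probability of the observed action if the agent is adversarial
    lik_ally      -> probability of the observed action if the agent is an ally
    (All three inputs are assumed values supplied by the caller.)
    """
    numerator = prior * lik_adversary
    evidence = numerator + (1.0 - prior) * lik_ally
    if evidence == 0.0:
        # The observation is impossible under both types; keep the prior.
        return prior
    return numerator / evidence


# Hypothetical numbers: start undecided, then observe an action that an
# adversary would take four times as often as an ally would.
belief = update_belief(0.5, 0.8, 0.2)

# Repeated observations of the same kind push the belief further.
belief_after_two = update_belief(belief, 0.8, 0.2)
```

With these assumed likelihoods the belief rises from 0.5 to about 0.8 after one observation and increases further after a second one.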