Stochastic partial observability poses a major challenge for decentralized
coordination in multi-agent reinforcement learning but is largely neglected in
state-of-the-art research due to a strong focus on state-based centralized
training for decentralized execution (CTDE) and benchmarks that lack sufficient
stochasticity, such as the StarCraft Multi-Agent Challenge (SMAC). In this paper, we
propose Attention-based Embeddings of Recurrence In multi-Agent Learning
(AERIAL) to approximate value functions under stochastic partial observability.
AERIAL replaces the true state with a learned representation of multi-agent
recurrence, considering more accurate information about decentralized agent
decisions than state-based CTDE. We then introduce MessySMAC, a modified
version of SMAC with stochastic observations and higher variance in initial
states, to provide a more general and configurable benchmark for
stochastic partial observability. We evaluate AERIAL in Dec-Tiger as well as in
a variety of SMAC and MessySMAC maps, and compare the results with state-based
CTDE. Furthermore, we evaluate the robustness of AERIAL and state-based CTDE
against various stochasticity configurations in MessySMAC.
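To make the core idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of a centralized value function that conditions on an attention-based embedding of the agents' recurrent states rather than on the true state. The per-agent GRU encoder, the multi-head self-attention layer, and all dimensions are illustrative assumptions.

```python
# Hypothetical sketch: value estimation from attention over per-agent recurrence,
# replacing the global state used by state-based CTDE. Not the authors' code.
import torch
import torch.nn as nn

class RecurrenceAttentionValue(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden_dim=64, n_heads=4):
        super().__init__()
        # Each agent encodes its own action-observation history with a GRU.
        self.encoder = nn.GRU(obs_dim + act_dim, hidden_dim, batch_first=True)
        # Self-attention over the agents' hidden states stands in for the true state.
        self.attn = nn.MultiheadAttention(hidden_dim, n_heads, batch_first=True)
        self.value_head = nn.Linear(hidden_dim, 1)

    def forward(self, histories):
        # histories: (batch, n_agents, time, obs_dim + act_dim)
        b, n, t, d = histories.shape
        _, h = self.encoder(histories.reshape(b * n, t, d))  # h: (1, b*n, hidden)
        h = h.squeeze(0).reshape(b, n, -1)                   # (batch, n_agents, hidden)
        z, _ = self.attn(h, h, h)                            # attention-based embedding
        return self.value_head(z.mean(dim=1))                # (batch, 1) joint value

# Toy usage: batch of 3, 2 agents, 8-step histories
v = RecurrenceAttentionValue(obs_dim=10, act_dim=4)
x = torch.randn(3, 2, 8, 14)
print(v(x).shape)  # torch.Size([3, 1])
```

The design choice this illustrates is that the embedding is computed from the agents' decentralized recurrent representations, so the value estimate reflects the information the agents actually act on under stochastic partial observability, rather than a privileged global state.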