In many multi-agent and high-dimensional robotic tasks, the controller can be
designed in either a centralized or decentralized way. Correspondingly, it is
possible to use either single-agent reinforcement learning (SARL) or
multi-agent reinforcement learning (MARL) methods to learn such controllers.
However, the relationship between these two paradigms remains under-studied in
the literature. This work compares the robustness and performance of SARL and
MARL approaches applied to the same task, in order to gain insight into which
methods are most suitable. We start by analytically showing the
equivalence between these two paradigms under the full-state observation
assumption. Then, we identify a broad subclass of \textit{Dec-POMDP} tasks
where the agents are weakly or partially interacting. In these tasks, we show
that partial observations of each agent are sufficient for near-optimal
decision-making. Furthermore, we propose to exploit such partially observable
MARL to improve the robustness of robots when joint or agent failures occur.
Our experiments on both simulated multi-agent tasks and a real robot task with
a mobile manipulator validate the presented insights and the effectiveness of
the proposed robust robot learning method via partially observable MARL.

Comment: 8 pages, 8 figures