Decentralization of Multiagent Policies by Learning What to Communicate
Effective communication is required for teams of robots to solve
sophisticated collaborative tasks. In practice, it is typical for both the
encoding and semantics of communication to be manually defined by an expert;
this is true regardless of whether the behaviors themselves are bespoke,
optimization-based, or learned. We present an agent architecture and training
methodology using neural networks to learn task-oriented communication
semantics based on the example of a communication-unaware expert policy. A
perimeter defense game illustrates the system's ability to handle dynamically
changing numbers of agents and its graceful degradation in performance as
communication constraints are tightened or the expert's observability
assumptions are broken.
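As a rough illustration of the idea, the sketch below has each agent encode its local observation into a learned message, act on its own observation plus the mean of its teammates' messages, and train by imitating a communication-unaware expert. This is a minimal sketch, not the paper's architecture; all module names, sizes, and the expert interface are assumptions.

import torch
import torch.nn as nn

class CommAgent(nn.Module):
    # Encoder maps a local observation to a message; the policy consumes the
    # local observation plus mean-pooled teammate messages. Mean pooling keeps
    # the policy valid for a dynamically changing number of agents.
    def __init__(self, obs_dim, msg_dim, act_dim, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, msg_dim))
        self.policy = nn.Sequential(
            nn.Linear(obs_dim + msg_dim, hidden), nn.ReLU(), nn.Linear(hidden, act_dim))

    def message(self, obs):
        return self.encoder(obs)

    def act(self, obs, teammate_msgs):
        pooled = teammate_msgs.mean(dim=0)
        return self.policy(torch.cat([obs, pooled], dim=-1))

def imitation_loss(agent, team_obs, expert_actions):
    # team_obs: (n_agents, obs_dim) local observations for one time step;
    # expert_actions: (n_agents,) actions chosen by a communication-unaware
    # expert with richer observability (the supervision signal).
    msgs = agent.message(team_obs)
    losses = []
    for i in range(team_obs.shape[0]):
        others = torch.cat([msgs[:i], msgs[i + 1:]])   # incoming teammate messages
        logits = agent.act(team_obs[i], others)
        losses.append(nn.functional.cross_entropy(
            logits.unsqueeze(0), expert_actions[i:i + 1]))
    return torch.stack(losses).mean()

Tightening communication constraints could then be modeled, for example, by shrinking msg_dim or quantizing the messages.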
Factorized Q-Learning for Large-Scale Multi-Agent Systems
Deep Q-learning has achieved significant success in single-agent decision-making
tasks. However, it is challenging to extend Q-learning to large-scale
multi-agent scenarios, due to the explosion of action space resulting from the
complex dynamics between the environment and the agents. In this paper, we
propose to make the computation of multi-agent Q-learning tractable by treating
the Q-function (w.r.t. state and joint action) as a high-order, high-dimensional
tensor and then approximating it with factorized pairwise interactions.
Furthermore, we utilize a composite deep neural network architecture for
computing the factorized Q-function, share the model parameters among all the
agents within the same group, and estimate the agents' optimal joint actions
through a coordinate-descent-type algorithm. All these simplifications greatly
reduce the model complexity and accelerate the learning process. Extensive
experiments on two different multi-agent problems demonstrate the performance
gain of our proposed approach in comparison with strong baselines, particularly
when there are a large number of agents.
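A minimal sketch of the pairwise factorization and the coordinate-descent action selection might look like the following; the pairwise term g and the greedy scoring loop are illustrative assumptions rather than the paper's composite network.

import numpy as np

def pairwise_q(state_emb, actions, g):
    # Q(s, a) approximated as a sum of pairwise terms g(s, a_i, a_j),
    # with g assumed symmetric in its two action arguments.
    n = len(actions)
    return sum(g(state_emb, actions[i], actions[j])
               for i in range(n) for j in range(i + 1, n))

def coordinate_descent_actions(state_emb, n_agents, n_actions, g, sweeps=5):
    # Improve one agent's action at a time while holding the others fixed,
    # sweeping over agents until no single-agent change raises the factorized Q.
    actions = np.zeros(n_agents, dtype=int)
    for _ in range(sweeps):
        changed = False
        for i in range(n_agents):
            scores = [sum(g(state_emb, a, actions[j])
                          for j in range(n_agents) if j != i)
                      for a in range(n_actions)]
            best = int(np.argmax(scores))
            if best != actions[i]:
                actions[i], changed = best, True
        if not changed:
            break
    return actions

For a toy g one could use a lookup table over action pairs; in the paper the pairwise terms come from a composite neural network whose parameters are shared among the agents within the same group.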
Coordinated Multi-Agent Imitation Learning
We study the problem of imitation learning from demonstrations of multiple
coordinating agents. One key challenge in this setting is that learning a good
model of coordination can be difficult, since coordination is often implicit in
the demonstrations and must be inferred as a latent variable. We propose a
joint approach that simultaneously learns a latent coordination model along
with the individual policies. In particular, our method integrates unsupervised
structure learning with conventional imitation learning. We illustrate the
power of our approach on a difficult problem of learning multiple policies for
fine-grained behavior modeling in team sports, where different players occupy
different roles in the coordinated team strategy. We show that having a
coordination model to infer the roles of players yields a substantially lower
imitation loss than conventional baselines.
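A much-simplified stand-in for the latent coordination model is an alternating scheme that infers a role assignment per demonstration and then refits per-role models. In the sketch below, roles are matched to prototypes with a Hungarian assignment (assuming one role per player); this is an assumption in place of the paper's unsupervised structure learning, and the demo format is hypothetical.

import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_roles(player_feats, role_prototypes):
    # One-to-one matching of players to roles (assumes n_players == n_roles),
    # minimizing total feature distance to the role prototypes.
    cost = np.linalg.norm(
        player_feats[:, None, :] - role_prototypes[None, :, :], axis=-1)
    players, roles = linear_sum_assignment(cost)
    return roles[np.argsort(players)]          # role index for each player

def coordinated_imitation(demos, n_roles, n_iters=10, lr=0.1):
    # Alternate an assignment step (infer latent roles) with an update step
    # (refit role prototypes; the full method would also refit per-role
    # imitation policies on the trajectories assigned to each role).
    feat_dim = demos[0]["features"].shape[1]
    prototypes = np.random.randn(n_roles, feat_dim)
    assignments = []
    for _ in range(n_iters):
        assignments = [assign_roles(d["features"], prototypes) for d in demos]
        for k in range(n_roles):
            feats = np.concatenate(
                [d["features"][a == k] for d, a in zip(demos, assignments)])
            if len(feats):
                prototypes[k] += lr * (feats.mean(axis=0) - prototypes[k])
    return prototypes, assignments

The per-role policies themselves could then be trained by conventional imitation learning on the trajectories grouped by inferred role.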