
    Counterfactual Multi-Agent Policy Gradients

    Cooperative multi-agent systems can be naturally used to model many real-world problems, such as network packet routing and the coordination of autonomous vehicles. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent's action, while keeping the other agents' actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best-performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.
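    The core of COMA is the counterfactual advantage: for each agent, the centralised critic's Q-value of the joint action is compared against a baseline that marginalises out that agent's action under its own policy, while the other agents' actions stay fixed. Below is a minimal NumPy sketch of that computation; the function name, shapes, and toy numbers are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def coma_advantage(q_values, policy_probs, chosen_action):
    """Counterfactual advantage for a single agent a (illustrative sketch).

    q_values:      Q(s, (u^-a, u'^a)) for every candidate action u'^a of
                   agent a, with the other agents' actions held fixed;
                   shape (n_actions,). COMA's critic representation emits
                   this whole vector in one forward pass.
    policy_probs:  agent a's policy pi^a(. | tau^a); shape (n_actions,).
    chosen_action: index of the action agent a actually executed.
    """
    # Counterfactual baseline: expected Q under agent a's own policy,
    # marginalising out only agent a's action.
    baseline = np.dot(policy_probs, q_values)
    # Advantage used in the decentralised actor's policy gradient.
    return q_values[chosen_action] - baseline

# Toy example: 3 candidate actions, agent executed action 1.
q = np.array([1.0, 2.5, 0.5])
pi = np.array([0.2, 0.5, 0.3])
print(coma_advantage(q, pi, chosen_action=1))  # 2.5 - 1.6 = 0.9
```

    Because the baseline is an expectation under the agent's own policy and does not depend on the action it actually sampled, subtracting it leaves the policy gradient unbiased while reducing variance and attributing credit to each agent individually.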

    Sharing diverse information gets driver agents to learn faster: an application in en route trip building

    With the increase in the use of private transportation, developing more efficient ways to distribute routes in a traffic network has become more and more important. Several attempts to address this issue have already been proposed, either by using a central authority to assign routes to the vehicles, or by means of a learning process in which drivers select their best routes based on their previous experiences. The present work addresses a way to connect reinforcement learning to new technologies such as car-to-infrastructure communication in order to augment the drivers' knowledge in an attempt to accelerate the learning process. Our method was compared both to a classical, iterative approach and to standard reinforcement learning without communication; results show that our method outperforms both. Further, we performed robustness tests by allowing messages to be lost and by reducing the storage capacity of the communication devices. We were able to show that our method is not only tolerant to information loss, but also points to improved performance when not all agents receive the same information. Hence, we stress that, before deploying communication in urban scenarios, it is necessary to take into consideration that the quality and diversity of the information shared are key aspects.
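    One way to picture the scheme described above is as stateless Q-learning over a fixed set of routes, with an infrastructure node that collects and rebroadcasts observed travel times. The sketch below follows that reading; the names, update rule, and message-loss model are all illustrative assumptions rather than the authors' implementation.

```python
import random

def learning_step(q_tables, routes, travel_time,
                  alpha=0.1, eps=0.1, p_loss=0.2, inbox_cap=50):
    """One round of route choice with car-to-infrastructure sharing.

    q_tables:    one dict per driver mapping every route -> estimated cost.
    travel_time: callable returning the experienced cost of a route.
    p_loss:      probability a shared message is lost (robustness test).
    inbox_cap:   storage capacity of the communication device.
    """
    inbox = []  # observations collected by the infrastructure this round
    for q in q_tables:
        # Epsilon-greedy choice over the driver's own cost estimates.
        if random.random() < eps:
            route = random.choice(routes)
        else:
            route = min(routes, key=q.get)
        cost = travel_time(route)
        # Learn from the driver's own experience.
        q[route] += alpha * (cost - q[route])
        # Share the observation; messages may be lost, and the
        # infrastructure's storage is bounded.
        if random.random() >= p_loss and len(inbox) < inbox_cap:
            inbox.append((route, cost))
    # The infrastructure broadcasts what it stored; every driver also
    # learns, with a smaller step size, from the others' experiences.
    for q in q_tables:
        for route, cost in inbox:
            q[route] += (alpha / 2) * (cost - q[route])
```

    The p_loss and inbox_cap knobs mirror the robustness tests mentioned in the abstract (lost messages and reduced storage capacity of the communication devices).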