481 research outputs found

Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games

Authors: Long Haitao, Peng Peng, Tang Zhenkun, Wang Jun, Wen Ying, Yang Yaodong, Yuan Quan
Publication date: 29/03/2017

Many artificial intelligence (AI) applications require multiple intelligent agents to work collaboratively. Efficient learning for intra-agent communication and coordination is an indispensable step towards general AI. In this paper, we take the StarCraft combat game as a case study, where the task is to coordinate multiple agents as a team to defeat their enemies. To maintain a scalable yet effective communication protocol, we introduce a Multiagent Bidirectionally-Coordinated Network (BiCNet ['bIknet]) with a vectorised extension of the actor-critic formulation. We show that BiCNet can handle different types of combat with arbitrary numbers of AI agents on both sides. Our analysis demonstrates that, without any supervision such as human demonstrations or labelled data, BiCNet can learn various types of advanced coordination strategies commonly used by experienced game players. In our experiments, we evaluate our approach against multiple baselines under different scenarios; it shows state-of-the-art performance and has potential value for large-scale real-world applications.

Comment: 10 pages, 10 figures. Previously titled "Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games", Mar 201

arXiv.org e-Print Archive

UCL Discovery

CLEANing the Reward: Counterfactual Actions to Remove Exploratory Action Noise in Multiagent Learning

Authors: Agogino Adrian, HolmesParker Chris, Taylor Mathew E., Tumer Kagan

Learning in multiagent systems can be slow because agents must learn both how to behave in a complex environment and how to account for the actions of other agents. The inability of an agent to distinguish between the true environmental dynamics and those caused by the stochastic exploratory actions of other agents creates noise in each agent's reward signal. This learning noise can have unforeseen and often undesirable effects on the resultant system performance. We define such noise as exploratory action noise, demonstrate the critical impact it can have on the learning process in multiagent settings, and introduce a reward structure to effectively remove such noise from each agent's reward signal. In particular, we introduce Coordinated Learning without Exploratory Action Noise (CLEAN) rewards and empirically demonstrate their benefit

NASA Technical Reports Server

Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence

Authors: Aryan Abi, Ding Zihan, Lukasiewicz Thomas, Song Yuhang, Wang Jianyi, Wojcicki Andrzej, Wu Lianlong, Xu Mai, Xu Zhenghua
Publication date: 27/11/2019

Learning agents that are capable not only of taking tests but also of innovating is becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for the others. However, existing evaluation platforms are either incompatible with multi-agent settings or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at a stage where many problems remain unexplored. Therefore, we provide a building toolkit that lets researchers easily invent and build novel multi-agent problems from the provided game set, based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanying tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/

arXiv.org e-Print Archive

Oxford University Research Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Resource Abstraction for Reinforcement Learning in Multiagent Congestion Problems

Authors: Devlin Sam, Kudenko Daniel, Malialis Kleanthis
Publication date: 09/05/2016

White Rose Research Online