8,229 research outputs found
Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games
Many artificial intelligence (AI) applications often require multiple
intelligent agents to work in a collaborative effort. Efficient learning for
intra-agent communication and coordination is an indispensable step towards
general AI. In this paper, we take StarCraft combat game as a case study, where
the task is to coordinate multiple agents as a team to defeat their enemies. To
maintain a scalable yet effective communication protocol, we introduce a
Multiagent Bidirectionally-Coordinated Network (BiCNet ['bIknet]) with a
vectorised extension of actor-critic formulation. We show that BiCNet can
handle different types of combats with arbitrary numbers of AI agents for both
sides. Our analysis demonstrates that without any supervisions such as human
demonstrations or labelled data, BiCNet could learn various types of advanced
coordination strategies that have been commonly used by experienced game
players. In our experiments, we evaluate our approach against multiple
baselines under different scenarios; it shows state-of-the-art performance, and
possesses potential values for large-scale real-world applications.Comment: 10 pages, 10 figures. Previously as title: "Multiagent
Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat
Games", Mar 201
Socially Aware Motion Planning with Deep Reinforcement Learning
For robotic vehicles to navigate safely and efficiently in pedestrian-rich
environments, it is important to model subtle human behaviors and navigation
rules (e.g., passing on the right). However, while instinctive to humans,
socially compliant navigation is still difficult to quantify due to the
stochasticity in people's behaviors. Existing works are mostly focused on using
feature-matching techniques to describe and imitate human paths, but often do
not generalize well since the feature values can vary from person to person,
and even run to run. This work notes that while it is challenging to directly
specify the details of what to do (precise mechanisms of human navigation), it
is straightforward to specify what not to do (violations of social norms).
Specifically, using deep reinforcement learning, this work develops a
time-efficient navigation policy that respects common social norms. The
proposed method is shown to enable fully autonomous navigation of a robotic
vehicle moving at human walking speed in an environment with many pedestrians.Comment: 8 page
- …