Multi-agent Deep Covering Option Discovery
The use of options can greatly accelerate exploration in reinforcement
learning, especially when only sparse reward signals are available. While
option discovery methods have been proposed for individual agents, in
multi-agent reinforcement learning settings, discovering collaborative options
that can coordinate the behavior of multiple agents and encourage them to visit
the under-explored regions of their joint state space has not been considered.
To address this, we propose Multi-agent Deep Covering Option Discovery, which
constructs multi-agent options by minimizing the expected cover time of the
agents' joint state space. We further propose a novel framework for adopting
these multi-agent options in the MARL process. In practice, a
multi-agent task can usually be divided into some sub-tasks, each of which can
be completed by a sub-group of the agents. Therefore, our algorithm framework
first leverages an attention mechanism to find collaborative agent sub-groups
that would benefit most from coordinated actions. Then, a hierarchical
algorithm, namely HA-MSAC, is developed to first learn the multi-agent options
for each sub-group to complete its sub-task, and then to integrate them
through a high-level policy as the solution to the whole task. This
hierarchical option construction allows our framework to strike a balance
between scalability and effective collaboration among the agents. The
evaluation based on multi-agent collaborative tasks shows that the proposed
algorithm can effectively capture the agent interactions with the attention
mechanism, successfully identify multi-agent options, and significantly
outperform prior works that use single-agent options or no options, in terms
of both faster exploration and higher task rewards.

Comment: This paper was presented in part at the ICML Reinforcement Learning
for Real Life Workshop, July 202
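The cover-time idea behind covering options can be illustrated in the single-agent case: build the state-transition graph, and use the Fiedler vector (the eigenvector of the graph Laplacian with the second-smallest eigenvalue) to place an option between the two most weakly connected regions. The 6-state chain below is a toy assumption for illustration, not the paper's multi-agent construction.

```python
import numpy as np

# Adjacency matrix of a toy 6-state chain MDP's state-transition graph.
A = np.zeros((6, 6))
for i in range(5):
    A[i, i + 1] = A[i + 1, i] = 1.0

# Graph Laplacian L = D - A; its second-smallest eigenvector (the
# Fiedler vector) locates the bottleneck that slows exploration.
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
fiedler = eigvecs[:, 1]

# A covering option connects the maximizer and minimizer of the Fiedler
# vector, i.e. the two most weakly connected regions; adding this
# "shortcut" reduces the expected cover time of the state space.
init_state = int(np.argmax(fiedler))
term_state = int(np.argmin(fiedler))
```

For the chain, the option endpoints land on the two extreme states, the ones a random walk takes longest to reach. The paper's contribution is to perform an analogous construction over the agents' joint state space rather than a single agent's.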
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
We consider the problem of multiple agents sensing and acting in environments
with the goal of maximising their shared utility. In these environments, agents
must learn communication protocols in order to share information that is needed
to solve the tasks. By embracing deep neural networks, we are able to
demonstrate end-to-end learning of protocols in complex environments inspired
by communication riddles and multi-agent computer vision problems with partial
observability. We propose two approaches for learning in these domains:
Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning
(DIAL). The former uses deep Q-learning, while the latter exploits the fact
that, during learning, agents can backpropagate error derivatives through
(noisy) communication channels. Hence, this approach uses centralised learning
but decentralised execution. Our experiments introduce new environments for
studying the learning of communication protocols and present a set of
engineering innovations that are essential for success in these domains
- …
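The channel that makes DIAL's cross-agent backpropagation possible can be sketched with a discretise/regularise unit: during centralised training a message is passed through a noisy sigmoid, which is differentiable, so error derivatives flow from the receiving agent back into the sender; at decentralised execution the message is hard-thresholded to a discrete bit. The noise scale and function name below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def dru(message, sigma=2.0, training=True, rng=None):
    """Discretise/regularise unit for a 1-bit communication channel.

    training=True : noisy sigmoid (continuous, differentiable) so
                    gradients can cross the channel during learning.
    training=False: hard threshold to a discrete message for execution.
    sigma is an illustrative noise scale (a hyperparameter).
    """
    if training:
        rng = rng or np.random.default_rng(0)
        noisy = message + rng.normal(0.0, sigma)
        return 1.0 / (1.0 + np.exp(-noisy))  # value in (0, 1)
    return float(message > 0.0)  # discrete 0/1 message

soft = dru(1.5, training=True)    # continuous message used in training
hard = dru(1.5, training=False)   # discrete message used at execution
```

The noise serves two roles: it regularises the channel and pushes the sender toward saturated (near-binary) messages, so that the discretised execution-time behaviour matches what was learned centrally.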