Correcting Experience Replay for Multi-Agent Communication
We consider the problem of learning to communicate using multi-agent
reinforcement learning (MARL). A common approach is to learn off-policy, using
data sampled from a replay buffer. However, messages received in the past may
not accurately reflect the current communication policy of each agent, and this
complicates learning. We therefore introduce a 'communication correction' which
accounts for the non-stationarity of observed communication induced by
multi-agent learning. It works by relabelling the received message to make it
likely under the communicator's current policy, and thus be a better reflection
of the receiver's current environment. To account for cases in which agents are
both senders and receivers, we introduce an ordered relabelling scheme. Our
correction is computationally efficient and can be integrated with a range of
off-policy algorithms. It substantially improves the ability of communicating
MARL systems to learn across a variety of cooperative and competitive tasks.
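The relabelling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `relabel_messages`, the scalar observations and messages, and the assumption that agents can be ordered so every sender is relabelled before its receivers are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical policy signature: (own observation, incoming messages) -> outgoing message.
Policy = Callable[[float, Sequence[float]], float]


def relabel_messages(
    observations: Dict[int, float],
    policies: Dict[int, Policy],
    order: List[int],
    senders_of: Dict[int, List[int]],
) -> Dict[int, float]:
    """Recompute the messages of one replayed transition with the current policies.

    Agents are processed in `order`. Each agent's message is regenerated from
    its stored observation and the already relabelled messages of its senders,
    so agents that both receive and send see corrected inputs.
    Assumption: `order` lists every sender before the agents it sends to.
    """
    relabelled: Dict[int, float] = {}
    for agent in order:
        incoming = [relabelled[sender] for sender in senders_of[agent]]
        relabelled[agent] = policies[agent](observations[agent], incoming)
    return relabelled


# Toy current policies: agent 0 only sends, agent 1 receives from agent 0 and sends.
current_policies: Dict[int, Policy] = {
    0: lambda obs, incoming: 2.0 * obs,
    1: lambda obs, incoming: obs + sum(incoming),
}
stored_observations = {0: 1.0, 1: 3.0}
fresh_messages = relabel_messages(
    stored_observations, current_policies, order=[0, 1], senders_of={0: [], 1: [0]}
)
```

In an off-policy learner, the relabelled messages would replace the stale ones stored in the replay buffer before the update is computed.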
CORE: Cooperative Reconstruction for Multi-Agent Perception
This paper presents CORE, a conceptually simple, effective and
communication-efficient model for multi-agent cooperative perception. It
addresses the task from a novel perspective of cooperative reconstruction,
based on two key insights: 1) cooperating agents together provide a more
holistic observation of the environment, and 2) the holistic observation can
serve as valuable supervision to explicitly guide the model learning how to
reconstruct the ideal observation based on collaboration. CORE instantiates the
idea with three major components: a compressor for each agent to create more
compact feature representation for efficient broadcasting, a lightweight
attentive collaboration component for cross-agent message aggregation, and a
reconstruction module to reconstruct the observation based on aggregated
feature representations. This learning-to-reconstruct idea is task-agnostic,
and offers clear and reasonable supervision to inspire more effective
collaboration, eventually promoting perception tasks. We validate CORE on
OPV2V, a large-scale multi-agent perception dataset, in two tasks, i.e., 3D
object detection and semantic segmentation. Results demonstrate that the model
achieves state-of-the-art performance on both tasks, and is more
communication-efficient.
Comment: Accepted to ICCV 2023; Code: https://github.com/zllxot/COR
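The three components named in this abstract (compressor, attentive collaboration, reconstruction) can be sketched as a toy pipeline. This is only an illustration under strong assumptions, not the CORE model: features are plain lists of floats, average pooling stands in for the learned compressor, softmax-weighted dot-product similarity stands in for the attention module, value repetition stands in for the learned decoder, and all function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def compress(features: Vector, factor: int = 2) -> Vector:
    """Average-pool groups of `factor` values (stand-in for a learned compressor)."""
    chunks = [features[i:i + factor] for i in range(0, len(features), factor)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def attentive_aggregate(ego: Vector, messages: List[Vector]) -> Vector:
    """Fuse ego and received features, weighted by softmax of similarity to ego."""
    candidates = [ego] + messages
    scores = [sum(a * b for a, b in zip(ego, cand)) for cand in candidates]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - peak) for score in scores]
    total = sum(weights)
    return [
        sum((w / total) * cand[i] for w, cand in zip(weights, candidates))
        for i in range(len(ego))
    ]


def reconstruct(aggregated: Vector, factor: int = 2) -> Vector:
    """Upsample by repeating each value (stand-in for a learned decoder)."""
    return [value for value in aggregated for _ in range(factor)]


def reconstruction_loss(reconstructed: Vector, holistic: Vector) -> float:
    """Mean squared error against the holistic observation used as supervision."""
    return sum((r - h) ** 2 for r, h in zip(reconstructed, holistic)) / len(holistic)


# Toy example: two agents observe the same 4-value scene.
ego_features = compress([1.0, 3.0, 5.0, 7.0])
neighbour_features = compress([1.0, 3.0, 5.0, 7.0])
fused = attentive_aggregate(ego_features, [neighbour_features])
loss = reconstruction_loss(reconstruct(fused), [2.0, 2.0, 6.0, 6.0])
```

Broadcasting the compressed features rather than the raw ones is what would reduce communication cost in this sketch; the reconstruction loss supplies the supervision signal the abstract describes.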