35 research outputs found

Multi-Agent Reinforcement Learning as a Rehearsal for Decentralized Planning

Authors: Aras; Auer; Bernstein; Bikramjit Banerjee; Busoniu; Farahmand; Landon Kraemer; Mataric; Oliehoek; Price; Sutton
Publication venue: The Aquila Digital Community
Publication date: 19/05/2016

Decentralized partially observable Markov decision processes (Dec-POMDPs) are a powerful tool for modeling multi-agent planning and decision-making under uncertainty. Prevalent Dec-POMDP solution techniques require centralized computation given full knowledge of the underlying model. Multi-agent reinforcement learning (MARL) based approaches have been recently proposed for distributed solution of Dec-POMDPs without full prior knowledge of the model, but these methods assume that conditions during learning and policy execution are identical. In some practical scenarios this may not be the case. We propose a novel MARL approach in which agents are allowed to rehearse with information that will not be available during policy execution. The key is for the agents to learn policies that do not explicitly rely on these rehearsal features. We also establish a weak convergence result for our algorithm, RLaR, demonstrating that RLaR converges in probability when certain conditions are met. We show experimentally that incorporating rehearsal features can enhance the learning rate compared to non-rehearsal-based learners, and demonstrate fast, (near) optimal performance on many existing benchmark Dec-POMDP problems. We also compare RLaR against an existing approximate Dec-POMDP solver which, like RLaR, does not assume a priori knowledge of the model. While RLaR's policy representation is not as scalable, we show that RLaR produces higher quality policies for most problems and horizons studied.

Aquila Digital Community (University of Southern Mississippi, USM)

Crossref

Decentralization of Multiagent Policies by Learning What to Communicate

Authors: Chen Steven W.; Kumar Vijay; Paulos James; Shishika Daigo
Publication date: 25/03/2019

Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice it is typical for both the encoding and semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization based, or learned. We present an agent architecture and training methodology using neural networks to learn task-oriented communication semantics based on the example of a communication-unaware expert policy. A perimeter defense game illustrates the system's ability to handle dynamically changing numbers of agents and its graceful degradation in performance as communication constraints are tightened or the expert's observability assumptions are broken.
Comment: 7 pages

arXiv.org e-Print Archive

Crossref

Learning to Communicate with Deep Multi-Agent Reinforcement Learning

Authors: Assael Yannis M.; de Freitas Nando; Foerster Jakob N.; Whiteson Shimon
Publication date: 01/01/2016

We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility. In these environments, agents must learn communication protocols in order to share information that is needed to solve the tasks. By embracing deep neural networks, we are able to demonstrate end-to-end learning of protocols in complex environments inspired by communication riddles and multi-agent computer vision problems with partial observability. We propose two approaches for learning in these domains: Reinforced Inter-Agent Learning (RIAL) and Differentiable Inter-Agent Learning (DIAL). The former uses deep Q-learning, while the latter exploits the fact that, during learning, agents can backpropagate error derivatives through (noisy) communication channels. Hence, this approach uses centralised learning but decentralised execution. Our experiments introduce new environments for studying the learning of communication protocols and present a set of engineering innovations that are essential for success in these domains.

arXiv.org e-Print Archive

Oxford University Research Archive

Distributed Deep Reinforcement Learning Resource Allocation Scheme For Industry 4.0 Device-To-Device Scenarios

Authors: Adeogun Ramoni Ojekunle; Barco Raquel; Bruun Rasmus; de-la-Bandera Isabel; Morejon Santiago; Romero Jesus Burgueno
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2021

VBN