Multiagent Deep Reinforcement Learning: Challenges and Directions Towards Human-Like Approaches
This paper surveys the field of multiagent deep reinforcement learning. The
combination of deep neural networks with reinforcement learning has gained
increased traction in recent years and is slowly shifting the focus from
single-agent to multiagent environments. Dealing with multiple agents is
inherently more complex as (a) the future rewards depend on the joint actions
of multiple players and (b) the computational complexity of functions
increases. We present the most common multiagent problem representations and
their main challenges, and identify five research areas that address one or
more of these challenges: centralised training and decentralised execution,
opponent modelling, communication, efficient coordination, and reward shaping.
We find that many computational studies rely on unrealistic assumptions or are
not generalisable to other settings; they struggle to overcome the curse of
dimensionality or nonstationarity. Approaches from psychology and sociology
capture promising relevant behaviours such as communication and coordination.
We suggest that, for multiagent reinforcement learning to be successful, future
research addresses these challenges with an interdisciplinary approach to open
up new possibilities for more human-oriented solutions in multiagent
reinforcement learning. Comment: 37 pages, 6 figures
Learning in Cooperative Multiagent Systems Using Cognitive and Machine Models
Developing effective Multi-Agent Systems (MAS) is critical for many
applications requiring collaboration and coordination with humans. Despite the
rapid advance of Multi-Agent Deep Reinforcement Learning (MADRL) in cooperative
MAS, one major challenge is the simultaneous learning and interaction of
independent agents in dynamic environments in the presence of stochastic
rewards. State-of-the-art MADRL models struggle to perform well in Coordinated
Multi-agent Object Transportation Problems (CMOTPs), wherein agents must
coordinate with each other and learn from stochastic rewards. In contrast,
humans often learn rapidly to adapt to nonstationary environments that require
coordination among people. In this paper, motivated by the demonstrated ability
of cognitive models based on Instance-Based Learning Theory (IBLT) to capture
human decisions in many dynamic decision making tasks, we propose three
variants of Multi-Agent IBL models (MAIBL). The idea of these MAIBL algorithms
is to combine the cognitive mechanisms of IBLT and the techniques of MADRL
models to deal with coordination MAS in stochastic environments from the
perspective of independent learners. We demonstrate that the MAIBL models
exhibit faster learning and achieve better coordination in a dynamic CMOTP task
with various settings of stochastic rewards compared to current MADRL models.
We discuss the benefits of integrating cognitive insights into MADRL models. Comment: 22 pages, 5 figures, 2 tables
Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems.
In the framework of fully cooperative multi-agent systems, independent (non-communicative) agents that learn by reinforcement must overcome several difficulties in order to coordinate. This paper identifies several challenges responsible for the non-coordination of independent agents: Pareto-selection, nonstationarity, stochasticity, alter-exploration and shadowed equilibria. A selection of multi-agent domains is classified according to those challenges: matrix games, Boutilier's coordination game, predators pursuit domains and a special multi-state game. Moreover, the performance of a range of algorithms for independent reinforcement learners is evaluated empirically. Those algorithms are Q-learning variants: decentralized Q-learning, distributed Q-learning, hysteretic Q-learning, recursive FMQ and WoLF PHC. An overview of the learning algorithms' strengths and weaknesses against each challenge concludes the paper and can serve as a basis for choosing the appropriate algorithm for a new domain. Furthermore, the distilled challenges may assist in the design of new learning algorithms that overcome these problems and achieve higher performance in multi-agent applications.
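As an illustration of how two of the Q-learning variants named above differ, the following minimal sketch contrasts a plain decentralized Q-learning update with a hysteretic one, which uses a smaller learning rate when the temporal-difference error is negative. The function names, the scalar interface and the default values of alpha, beta and gamma are assumptions chosen for this example; they are not taken from the paper.

```python
def decentralized_update(q, reward, next_max, alpha=0.1, gamma=0.9):
    """Plain Q-learning step applied by one independent agent.

    q        -- current estimate Q_i(s, a) of this agent
    reward   -- reward observed after the joint action
    next_max -- max over a' of Q_i(s', a')
    """
    delta = reward + gamma * next_max - q
    return q + alpha * delta


def hysteretic_update(q, reward, next_max, alpha=0.1, beta=0.01, gamma=0.9):
    """Hysteretic step: increases use alpha, decreases use the smaller beta.

    With beta < alpha the agent is slow to lower an estimate, so a low
    reward caused by a teammate's exploration is penalised only weakly.
    """
    delta = reward + gamma * next_max - q
    rate = alpha if delta >= 0 else beta
    return q + rate * delta


if __name__ == "__main__":
    # Same bad outcome (reward 0) seen by both learners from estimate 1.0.
    print(decentralized_update(1.0, 0.0, 0.0))
    print(hysteretic_update(1.0, 0.0, 0.0))
```

Setting beta equal to alpha makes the hysteretic step coincide with the plain one, which is one way to see the hysteretic rule as a tunable degree of optimism.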
Coordination of independent learners in cooperative Markov games.
In the framework of fully cooperative multi-agent systems, independent agents that learn by reinforcement must overcome several difficulties, such as coordination and the impact of exploration. Studying these issues first allows us to synthesize the characteristics of existing decentralized reinforcement learning methods for independent learners in cooperative Markov games. Then, given the difficulties encountered by these approaches, we focus on two main skills: optimistic agents, which manage coordination in deterministic environments, and the detection of the stochasticity of a game. Indeed, the key difficulty in stochastic environments is to distinguish between the various causes of noise. We therefore introduce the SOoN algorithm, standing for “Swing between Optimistic or Neutral”, in which independent learners adapt automatically to the stochasticity of the environment. Empirical results on various cooperative Markov games show that SOoN overcomes the main factors of non-coordination and is robust to the exploration of other agents.
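To make concrete why optimism suits deterministic games but not stochastic ones, here is a minimal sketch of an optimistic update in the style of distributed Q-learning, where an estimate is only ever raised. It is not the SOoN algorithm, whose details the abstract does not give. The function name, the reward sequences and the stateless setting (next_max fixed at 0) are assumptions made for this example, and the sketch presumes non-negative rewards with estimates initialised to zero.

```python
def optimistic_update(q, reward, next_max=0.0, gamma=0.9):
    """Keep the best return seen so far; never lower the estimate."""
    return max(q, reward + gamma * next_max)


def run(rewards):
    """Apply the optimistic update over a sequence of observed rewards."""
    q = 0.0
    for r in rewards:
        q = optimistic_update(q, r)
    return q


# Deterministic game: the action pays 10 when the teammate cooperates and
# 0 when the teammate explores. Optimism recovers the true value 10.
deterministic = [0.0, 10.0, 0.0, 0.0]

# Noisy game (hypothetical samples with mean 5): optimism locks onto the
# luckiest sample and overestimates the action.
noisy = [2.0, 14.0, 1.0, 3.0]

if __name__ == "__main__":
    print(run(deterministic))
    print(run(noisy))
```

In the second sequence the estimate settles on the largest sample rather than the mean, which illustrates why an agent needs some way to tell noise in the environment apart from the effect of other agents' exploration.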