7 research outputs found

Policy gradient with value function approximation for collective multiagent planning

Author: KUMAR Akshat
LAU Hoong Chuin
NGUYEN Duc Thien
Publication venue: NIPS Foundation
Publication date: 01/12/2017

National Research Foundation (NRF) Singapore under Corp Lab @ University scheme; Fujitsu Lt

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Apprendre à agir dans un Dec-POMDP (Learning to act in a Dec-POMDP)

Author: Buffet Olivier
Dibangoye Jilles
Publication venue: HAL CCSD
Publication date: 07/06/2018

We address a long-standing open problem of reinforcement learning in decentralized partially observable Markov decision processes. Previous attempts focussed on different forms of generalized policy iteration, which at best led to local optima. In this paper, we restrict attention to plans, which are simpler to store and update than policies. We derive, under certain conditions, the first near-optimal cooperative multi-agent reinforcement learning algorithm. To achieve significant scalability gains, we replace the greedy maximization by mixed-integer linear programming. Experiments show our approach can learn to act near-optimally in many finite domains from the literature.

INRIA a CCSD electronic archive server