SMIX($\lambda$): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Author: Tan Xiaoyang
Wang Yuhui
Wen Chao
Yao Xinghu
Publication date: 03/04/2020

Learning a stable and generalizable centralized value function (CVF) is a crucial but challenging task in multi-agent reinforcement learning (MARL), as it has to deal with the issue that the joint action space increases exponentially with the number of agents in such scenarios. This paper proposes an approach, named SMIX($\lambda$), to address the issue using an efficient off-policy centralized training method within a flexible learner search space. As importance sampling for such off-policy training is both computationally costly and numerically unstable, we propose to use the $\lambda$-return as a proxy to compute the TD error. With this new loss function objective, we adopt a modified QMIX network structure as the base to train our model. By further connecting it with the $Q(\lambda)$ approach from a unified expectation correction viewpoint, we show that the proposed SMIX($\lambda$) is equivalent to $Q(\lambda)$ and hence shares its convergence properties, without suffering from the aforementioned curse of dimensionality inherent in MARL. Experiments on the StarCraft Multi-Agent Challenge (SMAC) benchmark demonstrate that our approach not only outperforms several state-of-the-art MARL methods by a large margin, but also can be used as a general tool to improve the overall performance of other CTDE-type algorithms by enhancing their CVFs.
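
As a rough illustration of the $\lambda$-return mentioned in the abstract above, the following sketch computes $\lambda$-returns for one finite episode by backward recursion. It is not the authors' implementation: the function name `lambda_returns`, its parameters, and the default values of `gamma` and `lam` are assumptions made for illustration, and the value estimates are taken as plain numbers rather than outputs of a QMIX-style network.

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.8):
    """Lambda-returns for one finite episode (hypothetical helper).

    rewards[t]     : reward received after step t
    next_values[t] : value estimate of the state reached after step t
                     (use 0.0 when that state is terminal)
    """
    assert len(rewards) == len(next_values)
    returns = [0.0] * len(rewards)
    # Seeding with the last bootstrap value makes the final step's
    # lambda-return equal to the one-step target r + gamma * V.
    next_return = next_values[-1] if rewards else 0.0
    for t in reversed(range(len(rewards))):
        # G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1})
        returns[t] = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        next_return = returns[t]
    return returns
```

With `lam=0` each entry reduces to the one-step TD target, and with `lam=1` it becomes the discounted return bootstrapped only at the end of the episode. A TD error in this sketch would be `returns[t]` minus the current value estimate for step `t`.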

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

The 1990 progress report and future plans

Author: Compton Michael
Friedland Peter
Zweben Monte

This document describes the progress and plans of the Artificial Intelligence Research Branch (RIA) at ARC in 1990. Activities span a range from basic scientific research to engineering development and to fielded NASA applications, particularly those applications that are enabled by basic research carried out at RIA. Work is conducted in-house and through collaborative partners in academia and industry. Our major focus is on a limited number of research themes with a dual commitment to technical excellence and proven applicability to NASA short, medium, and long-term problems. RIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at JPL and AI applications groups at all NASA centers

NASA Technical Reports Server

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

Author: Anthony Thomas
Bachrach Yoram
Bhoopchand Avishkar
Bullard Kalesha
Connor Jerome
Dasagi Vibhavari
De Vylder Bart
Duenez-Guzman Edgar
Elie Romuald
Everett Richard
Gemp Ian
Hennes Daniel
Hughes Edward
Khan Mina
Lanctot Marc
Larson Kate
Lever Guy
Liu Siqi
Marris Luke
McKee Kevin R.
Muller Paul
Perolat Julien
Strub Florian
Tacchetti Andrea
Tarassov Eugene
Tuyls Karl
Wang Zhe
Publication date: 22/09/2022

The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning ranging from computing approximations to fundamental concepts in game theory to simulating social dilemmas in rich spatial environments and training 3-d humanoids in difficult team coordination tasks. A signature aim of our group is to use the resources and expertise made available to us at DeepMind in deep reinforcement learning to explore multi-agent systems in complex environments and use these benchmarks to advance our understanding. Here, we summarise the recent work of our team and present a taxonomy that we feel highlights many important open challenges in multi-agent research. Comment: Published in AI Communications 202

arXiv.org e-Print Archive

On Transforming Reinforcement Learning by Transformer: The Development Trajectory

Author: Chen Yixin
Hu Shengchao
Shen Li
Tao Dacheng
Zhang Ya
Publication date: 19/01/2023

The Transformer, originally devised for natural language processing, has also achieved significant success in computer vision. Thanks to its strong expressive power, researchers are investigating ways to deploy transformers in reinforcement learning (RL), and transformer-based models have shown their potential on representative RL benchmarks. In this paper, we collect and dissect recent advances on transforming RL by transformer (transformer-based RL or TRL), in order to explore its development trajectory and future trend. We group existing developments into two categories, architecture enhancement and trajectory optimization, and examine the main applications of TRL in robotic manipulation, text-based games, navigation and autonomous driving. For architecture enhancement, these methods consider how to apply the powerful transformer structure to RL problems under the traditional RL framework; they model agents and environments much more precisely than deep RL methods, but they are still limited by the inherent defects of traditional RL algorithms, such as bootstrapping and the "deadly triad". For trajectory optimization, these methods treat RL problems as sequence modeling and train a joint state-action model over entire trajectories under the behavior cloning framework; they are able to extract policies from static datasets and fully use the long-sequence modeling capability of the transformer. Given these advancements, extensions and challenges in TRL are reviewed and proposals about future directions are discussed. We hope that this survey can provide a detailed introduction to TRL and motivate future research in this rapidly developing field. Comment: 26 page

arXiv.org e-Print Archive

Life Long Learning In Sparse Learning Environments

Author: Reeder John
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2013

Life long learning is a machine learning technique that deals with learning sequential tasks over time. It seeks to transfer knowledge from previous learning tasks to new learning tasks in order to increase generalization performance and learning speed. Real-time learning environments in which many agents are participating may provide learning opportunities but they are spread out in time and space outside of the geographical scope of a single learning agent. This research seeks to provide an algorithm and framework for life long learning among a network of agents in a sparse real-time learning environment. This work will utilize the robust knowledge representation of neural networks, and make use of both functional and representational knowledge transfer to accomplish this task. A new generative life long learning algorithm utilizing cascade correlation and reverberating pseudo-rehearsal and incorporating a method for merging divergent life long learning paths will be implemented

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Large network multi-level control for CAV and Smart Infrastructure: AI-based Fog-Cloud collaboration

Author: Chen Sikai
Dong Jiqian
Du Runjia
Ha Paul (Young Joun)
Labi Samuel
Publication venue: 'Purdue University (bepress)'
Publication date: 01/06/2022

Purdue E-Pubs

On the Combination of Game-Theoretic Learning and Multi Model Adaptive Filters

Author: Bauso Dario
Qu Hongyang
Smyrnakis Michalis
Veres Sandor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

This paper casts coordination of a team of robots within the framework of game theoretic learning algorithms. In particular a novel variant of fictitious play is proposed, by considering multi-model adaptive filters as a method to estimate other players’ strategies. The proposed algorithm can be used as a coordination mechanism between players when they should take decisions under uncertainty. Each player chooses an action after taking into account the actions of the other players and also the uncertainty. Uncertainty can occur either in terms of noisy observations or various types of other players. In addition, in contrast to other game-theoretic and heuristic algorithms for distributed optimisation, it is not necessary to find the optimal parameters a priori. Various parameter values can be used initially as inputs to different models. Therefore, the resulting decisions will be aggregate results of all the parameter values. Simulations are used to test the performance of the proposed methodology against other game-theoretic learning algorithms.
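
For readers unfamiliar with the baseline that the abstract above builds on, here is a minimal sketch of classical fictitious play in a two-player matrix game. It deliberately omits the paper's multi-model adaptive filters, which the abstract does not specify in detail: each player simply estimates the other's strategy from empirical action counts and best-responds to it. The function name `fictitious_play`, the payoff-matrix layout, and the choice of initial actions are assumptions made for illustration.

```python
def fictitious_play(payoff_a, payoff_b, rounds=200):
    """Classical fictitious play for a two-player matrix game (hypothetical helper).

    payoff_a[i][j], payoff_b[i][j]: payoffs to players A and B when
    A plays action i and B plays action j.
    Returns the empirical action frequencies of both players.
    """
    n_a, n_b = len(payoff_a), len(payoff_a[0])
    counts_a = [0] * n_a
    counts_b = [0] * n_b
    a, b = 0, 0  # arbitrary initial actions (an assumption of this sketch)
    for _ in range(rounds):
        counts_a[a] += 1
        counts_b[b] += 1
        # Each player best-responds to the opponent's empirical action counts;
        # both responses use the counts recorded so far (simultaneous update).
        a = max(range(n_a),
                key=lambda i: sum(payoff_a[i][j] * counts_b[j] for j in range(n_b)))
        b = max(range(n_b),
                key=lambda j: sum(payoff_b[i][j] * counts_a[i] for i in range(n_a)))
    freq_a = [c / rounds for c in counts_a]
    freq_b = [c / rounds for c in counts_b]
    return freq_a, freq_b
```

In a pure coordination game, where both players are rewarded only for choosing the same action, this baseline locks onto whichever joint action is played first. Ties in the best response are broken toward the lowest action index, which is a simplification of this sketch.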

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen