Simple Coalitional Games with Beliefs
We introduce coalitional games with beliefs (CGBs), a natural generalization of coalitional games to environments where agents possess private beliefs regarding the capabilities (or types) of others. We put forward a model to capture such agent-type uncertainty, and study coalitional stability in this setting. Specifically, we introduce a notion of the core for CGBs, both with and without coalition structures. For simple games without coalition structures, we then provide a characterization of the core that matches the one for the full-information case, and use it to derive a polynomial-time algorithm to check core nonemptiness. In contrast, we demonstrate that in games with coalition structures, allowing beliefs increases the computational complexity of stability-related problems. In doing so, we introduce and analyze weighted voting games with beliefs, which may be of independent interest. Finally, we discuss connections between our model and other classes of coalitional games.
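The polynomial-time core-nonemptiness test the abstract mentions can be illustrated on a weighted voting game. A minimal sketch, relying on the standard full-information result that the core of a simple game is nonempty exactly when a veto player exists (an agent present in every winning coalition); the function names and the example instances are illustrative, not from the paper:

```python
def veto_players(quota, weights):
    """Agents whose removal makes the grand coalition lose.

    In the weighted voting game [quota; w1, ..., wn], agent i is a
    veto player iff the remaining weight sum(w) - w[i] is below quota.
    """
    total = sum(weights)
    return [i for i, w in enumerate(weights) if total - w < quota]

def core_nonempty(quota, weights):
    # Polynomial-time test: the core of a simple game is nonempty
    # exactly when at least one veto player exists.
    return len(veto_players(quota, weights)) > 0

# [5; 3, 2, 1]: agents 0 and 1 are veto players (6-3 < 5, 6-2 < 5),
# so the core is nonempty; [2; 1, 1, 1] has no veto player.
```

Any division of the payoff among the veto players is then a core outcome, which is what makes the check this simple.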
TransfQMix: transformers for leveraging the graph structure of multi-agent reinforcement learning problems
Coordination is one of the most difficult aspects of multi-agent reinforcement learning (MARL). One reason is that agents normally choose their actions independently of one another. In order for coordination strategies to emerge from the combination of independent policies, recent research has focused on the use of a centralized function (CF) that learns each agent's contribution to the team reward. However, the structure in which the environment is presented to the agents and to the CF is typically overlooked. We have observed that the features used to describe the coordination problem can be represented as vertex features of a latent graph structure. Here, we present TransfQMix, a new approach that uses transformers to leverage this latent structure and learn better coordination policies. Our transformer agents perform graph reasoning over the state of the observable entities. Our transformer Q-mixer learns a monotonic mixing function from a larger graph that includes the internal and external states of the agents. TransfQMix is designed to be entirely transferable, meaning that the same parameters can be used to control and train larger or smaller teams of agents. This enables the deployment of promising approaches for saving training time and deriving general policies in MARL, such as transfer learning, zero-shot transfer, and curriculum learning. We report TransfQMix's performance in the Spread and StarCraft II environments. In both settings, it outperforms state-of-the-art Q-learning models, and it demonstrates effectiveness in solving problems that other methods cannot solve. This project has received funding from the EU's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 893089. This work acknowledges the "Severo Ochoa Centre of Excellence" accreditation (CEX2019-000928-S). We gratefully acknowledge the David and Lucile Packard Foundation.
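The monotonic mixing function mentioned above can be sketched in the QMIX family's standard way: per-agent utilities are combined with non-negative weights, so the joint value is monotone in every agent's Q-value and greedy per-agent action selection remains consistent with the joint greedy action. This is a minimal illustration only; TransfQMix's transformer hypernetwork that produces these weights is omitted, and the weights below are fixed placeholders:

```python
import numpy as np

def monotonic_mix(agent_qs, w, b):
    # Taking the absolute value enforces non-negative mixing weights,
    # the standard trick for guaranteeing monotonicity of the joint
    # value in each agent's individual Q-value.
    return float(np.abs(w) @ agent_qs + b)

qs_low  = np.array([1.0, 2.0, 3.0])
qs_high = np.array([1.5, 2.0, 3.0])   # agent 0's utility increased
w, b = np.array([-0.5, 0.2, 0.1]), 0.3
assert monotonic_mix(qs_high, w, b) >= monotonic_mix(qs_low, w, b)
```

Because raising any single agent's Q-value can never lower the mixed value, each agent can act greedily on its own utility without breaking joint optimality.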
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
We consider the problem of multi-agent navigation and collision avoidance
when observations are limited to the local neighborhood of each agent. We
propose InforMARL, a novel architecture for multi-agent reinforcement learning
(MARL) which uses local information intelligently to compute paths for all the
agents in a decentralized manner. Specifically, InforMARL aggregates
information about the local neighborhood of agents for both the actor and the
critic using a graph neural network and can be used in conjunction with any
standard MARL algorithm. We show that (1) in training, InforMARL has better
sample efficiency and performance than baseline approaches, despite using less
information, and (2) in testing, it scales well to environments with arbitrary
numbers of agents and obstacles.
Comment: 11 pages, 5 figures, 2 tables, 3-page appendix. Code: https://github.com/nsidn98/InforMAR
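The kind of local-neighborhood aggregation described above can be sketched simply: each agent summarizes the features of entities within its sensing radius into a fixed-size vector, so the input dimension does not depend on how many agents or obstacles the environment contains. InforMARL uses a learned graph neural network for this; the mean pooling and all names below are an illustrative stand-in:

```python
import numpy as np

def aggregate_local(agent_pos, entities, radius):
    """Mean feature vector of entities within `radius` of the agent.

    `entities` is a list of (position, feature) pairs; the output has
    a fixed size regardless of how many entities are observed.
    """
    feats = [feat for pos, feat in entities
             if np.linalg.norm(pos - agent_pos) <= radius]
    if not feats:                      # nothing observed locally
        return np.zeros_like(entities[0][1])
    return np.mean(feats, axis=0)

# Two entities: an obstacle nearby and a goal far away; only the
# nearby one falls inside the sensing radius.
entities = [(np.array([0.0, 1.0]), np.array([1.0, 0.0])),
            (np.array([5.0, 5.0]), np.array([0.0, 1.0]))]
obs = aggregate_local(np.zeros(2), entities, radius=2.0)
```

Feeding this aggregate to both the actor and the critic is what lets the same policy scale to arbitrary numbers of agents and obstacles at test time.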
Learning Transferable Cooperative Behavior in Multi-Agent Teams
While multi-agent interactions can be naturally modeled as a graph, the environment has traditionally been treated as a black box. To better utilize the inherent structure of the environment, we propose a shared agent-entity graph, in which agents and environmental entities form vertices and edges connect the vertices that can communicate with each other. This allows agents to selectively attend to different parts of the environment, while also introducing invariance to the number of agents or entities present in the system, as well as permutation invariance. We present state-of-the-art results on coverage, formation, and line-control tasks for multi-agent teams in a fully decentralized execution framework.
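The permutation invariance claimed above can be demonstrated with a small sketch: pooling attention-weighted entity features yields the same embedding no matter how the entities are ordered, since reordering them permutes scores and features consistently. A softmax attention over a query vector stands in for the paper's learned graph attention; all shapes and weights here are illustrative:

```python
import numpy as np

def attend(query, entity_feats):
    # Score each entity against the query, normalize with a softmax,
    # and pool; reordering the rows of entity_feats permutes scores
    # and features together, so the pooled output is unchanged.
    scores = entity_feats @ query
    w = np.exp(scores - scores.max())
    w = w / w.sum()
    return w @ entity_feats            # order-independent summary

rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))            # 4 entities, 3-dim features
q = rng.normal(size=3)
perm = rng.permutation(4)
assert np.allclose(attend(q, E), attend(q, E[perm]))
```

The same mechanism gives invariance to the number of entities: adding or removing rows of `E` changes only the softmax support, not the output dimension.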
Cooperative Games with Overlapping Coalitions
In the usual models of cooperative game theory, the outcome of a coalition
formation process is either the grand coalition or a coalition structure that
consists of disjoint coalitions. However, in many domains where coalitions are
associated with tasks, an agent may be involved in executing more than one
task, and thus may distribute his resources among several coalitions. To tackle
such scenarios, we introduce a model for cooperative games with overlapping
coalitions--or overlapping coalition formation (OCF) games. We then explore the
issue of stability in this setting. In particular, we introduce a notion of the
core, which generalizes the corresponding notion in the traditional
(non-overlapping) scenario. Then, under some quite general conditions, we
characterize the elements of the core, and show that any element of the core
maximizes the social welfare. We also introduce a concept of balancedness for
overlapping coalitional games, and use it to characterize coalition structures
that can be extended to elements of the core. Finally, we generalize the notion
of convexity to our setting, and show that under some natural assumptions
convex games have a non-empty core. Moreover, we introduce two alternative
notions of stability in OCF that allow a wider range of deviations, and explore
the relationships among the corresponding definitions of the core, as well as
the classic (non-overlapping) core and the Aubin core. We illustrate the
general properties of the three cores, and also study them from a computational
perspective, thus obtaining additional insights into their fundamental
structure
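A tiny example makes the overlapping-coalition setting concrete: each agent splits one unit of resource across several tasks, a partial coalition's value depends on the resources contributed to it, and the social welfare is the sum over tasks. The data structures and the linear value function below are illustrative, not the paper's model:

```python
def social_welfare(allocation, task_value):
    """allocation[agent][task] = fraction of that agent's resource.

    Agents may contribute to several (overlapping) coalitions, but
    each agent's fractions must not exceed its total resource of 1.
    """
    for fractions in allocation.values():
        assert sum(fractions.values()) <= 1 + 1e-9
    tasks = {t for fr in allocation.values() for t in fr}
    return sum(task_value(t, sum(fr.get(t, 0.0)
                                 for fr in allocation.values()))
               for t in tasks)

# Agent a splits 60/40 between tasks t1 and t2; agent b works only
# on t1, so the coalitions for t1 and t2 overlap in agent a.
alloc = {"a": {"t1": 0.6, "t2": 0.4}, "b": {"t1": 1.0}}
welfare = social_welfare(alloc, lambda t, r: 2.0 * r)  # linear values
```

An outcome in an OCF game is such an allocation plus a division of each coalition's value, and the core question is whether any set of agents can profitably redirect their fractions; the abstract's result is that core outcomes maximize exactly this welfare quantity.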
Embodied Evolution in Collective Robotics: A Review
This paper provides an overview of evolutionary robotics techniques applied
to on-line distributed evolution for robot collectives -- namely, embodied
evolution. It provides a definition of embodied evolution as well as a thorough
description of the underlying concepts and mechanisms. The paper also presents
a comprehensive summary of research published in the field since its inception
(1999-2017), providing various perspectives to identify the major trends. In
particular, we identify a shift from considering embodied evolution as a
parallel search method within small robot collectives (fewer than 10 robots) to
embodied evolution as an on-line distributed learning method for designing
collective behaviours in swarm-like collectives. The paper concludes with a
discussion of applications and open questions, providing a milestone for past
and an inspiration for future research.
Comment: 23 pages, 1 figure, 1 table
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Multi-agent settings in the real world often involve tasks with varying types
and quantities of agents and non-agent entities; however, common patterns of
behavior often emerge among these agents/entities. Our method aims to leverage
these commonalities by asking the question: "What is the expected utility of
each agent when only considering a randomly selected sub-group of its observed
entities?" By posing this counterfactual question, we can recognize
state-action trajectories within sub-groups of entities that we may have
encountered in another task and use what we learned in that task to inform our
prediction in the current one. We then reconstruct a prediction of the full
returns as a combination of factors considering these disjoint groups of
entities and train this "randomly factorized" value function as an auxiliary
objective for value-based multi-agent reinforcement learning. By doing so, our
model can recognize and leverage similarities across tasks to improve learning
efficiency in a multi-task setting. Our approach, Randomized Entity-wise
Factorization for Imagined Learning (REFIL), outperforms all strong baselines
by a significant margin in challenging multi-task StarCraft micromanagement
settings.
Comment: ICML 2021 Camera Ready
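The counterfactual question REFIL poses can be sketched in a few lines: randomly partition the observed entities into disjoint sub-groups, score each sub-group with a per-group utility, and treat the sum of the factors as a reconstruction of the full return. The stand-in utility below replaces the paper's attention-masked Q-network, and all names are illustrative:

```python
import random

def randomly_factorized_value(entities, utility, rng):
    # Randomly split the entities into two disjoint sub-groups.
    group_a = [e for e in entities if rng.random() < 0.5]
    group_b = [e for e in entities if e not in group_a]
    # Auxiliary target: the utilities of disjoint entity sub-groups
    # should add up to a prediction of the full return.
    return utility(group_a) + utility(group_b)

rng = random.Random(0)
ents = ["ally1", "ally2", "enemy1"]
# With an additive utility, the factorization recovers the full
# value exactly, whichever random split is drawn.
val = randomly_factorized_value(ents, lambda g: float(len(g)), rng)
```

Training this factorized estimate as an auxiliary objective is what lets value estimates learned for a sub-group in one task transfer to other tasks where the same sub-group pattern recurs.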