
    Simple Coalitional Games with Beliefs

    We introduce coalitional games with beliefs (CGBs), a natural generalization of coalitional games to environments where agents possess private beliefs regarding the capabilities (or types) of others. We put forward a model to capture such agent-type uncertainty, and study coalitional stability in this setting. Specifically, we introduce a notion of the core for CGBs, both with and without coalition structures. For simple games without coalition structures, we then provide a characterization of the core that matches the one for the full information case, and use it to derive a polynomial-time algorithm to check core nonemptiness. In contrast, we demonstrate that in games with coalition structures, allowing beliefs increases the computational complexity of stability-related problems. In doing so, we introduce and analyze weighted voting games with beliefs, which may be of independent interest. Finally, we discuss connections between our model and other classes of coalitional games.
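
    The full-information characterization referenced above is the classical veto-player result: a simple game has a nonempty core if and only if some player belongs to every winning coalition. As a minimal sketch of that classical check (not the paper's CGB algorithm), here is the polynomial-time core-nonemptiness test for a weighted voting game [q; w_1, ..., w_n]:

```python
# A minimal sketch of the classical full-information check (not the paper's
# CGB algorithm): in a simple game, the core is nonempty iff a veto player
# exists, i.e. a player who belongs to every winning coalition.

def veto_players(quota, weights):
    """Player i is a veto player iff the coalition of all the other
    players loses: sum(weights) - weights[i] < quota."""
    total = sum(weights)
    return {i for i, w in enumerate(weights) if total - w < quota}

def core_is_nonempty(quota, weights):
    """The core of a simple game is nonempty iff it has a veto player."""
    return bool(veto_players(quota, weights))

# [5; 3, 2, 1]: players 0 and 1 appear in every winning coalition.
assert veto_players(5, [3, 2, 1]) == {0, 1}
# [3; 2, 2, 2]: any two players win without the third, so no veto player.
assert not core_is_nonempty(3, [2, 2, 2])
```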

    TransfQMix: transformers for leveraging the graph structure of multi-agent reinforcement learning problems

    Coordination is one of the most difficult aspects of multi-agent reinforcement learning (MARL). One reason is that agents normally choose their actions independently of one another. In order to see coordination strategies emerge from the combination of independent policies, recent research has focused on the use of a centralized function (CF) that learns each agent's contribution to the team reward. However, the structure in which the environment is presented to the agents and to the CF is typically overlooked. We have observed that the features used to describe the coordination problem can be represented as vertex features of a latent graph structure. Here, we present TransfQMix, a new approach that uses transformers to leverage this latent structure and learn better coordination policies. Our transformer agents perform graph reasoning over the state of the observable entities. Our transformer Q-mixer learns a monotonic mixing function from a larger graph that includes the internal and external states of the agents. TransfQMix is designed to be entirely transferable, meaning that the same parameters can be used to control and train larger or smaller teams of agents. This makes it possible to deploy promising approaches for saving training time and deriving general policies in MARL, such as transfer learning, zero-shot transfer, and curriculum learning. We report TransfQMix's performance in the Spread and StarCraft II environments. In both settings, it outperforms state-of-the-art Q-Learning models and demonstrates effectiveness in solving problems that other methods cannot solve. This project has received funding from the EU’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 893089. This work acknowledges the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S). We gratefully acknowledge the David and Lucile Packard Foundation.
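
    A minimal sketch of the entity-transformer idea (layer sizes and head counts are assumptions for illustration, not the authors' exact architecture): each agent embeds the entities it observes as vertex features and runs a transformer encoder over them. Because self-attention is agnostic to the number of input vertices, the same parameters transfer across team sizes.

```python
import torch
import torch.nn as nn

class TransformerAgent(nn.Module):
    """Toy transformer agent reasoning over observed entities."""
    def __init__(self, entity_dim, n_actions, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(entity_dim, d_model)   # per-entity vertex embedding
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.q_head = nn.Linear(d_model, n_actions)   # Q-values from the agent's own slot

    def forward(self, entities):
        # entities: (batch, n_entities, entity_dim); slot 0 is the agent itself
        h = self.encoder(self.embed(entities))
        return self.q_head(h[:, 0])                   # (batch, n_actions)

# The same parameters handle 5 or 50 observed entities:
agent = TransformerAgent(entity_dim=8, n_actions=6)
print(agent(torch.randn(32, 5, 8)).shape)   # torch.Size([32, 6])
print(agent(torch.randn(32, 50, 8)).shape)  # torch.Size([32, 6])
```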

    Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation

    We consider the problem of multi-agent navigation and collision avoidance when observations are limited to the local neighborhood of each agent. We propose InforMARL, a novel architecture for multi-agent reinforcement learning (MARL) which uses local information intelligently to compute paths for all the agents in a decentralized manner. Specifically, InforMARL aggregates information about the local neighborhood of agents for both the actor and the critic using a graph neural network and can be used in conjunction with any standard MARL algorithm. We show that (1) in training, InforMARL has better sample efficiency and performance than baseline approaches, despite using less information, and (2) in testing, it scales well to environments with arbitrary numbers of agents and obstacles. Comment: 11 pages, 5 figures, 2 tables, 3 pages appendix, Code: https://github.com/nsidn98/InforMAR
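
    A minimal sketch of the neighborhood-aggregation idea (illustrative, not the released InforMARL implementation): each agent's observation is augmented with an aggregated embedding of the entities in its local neighborhood, and the same aggregated features can feed both the actor and the critic of any standard MARL algorithm.

```python
import torch
import torch.nn as nn

class NeighborhoodAggregator(nn.Module):
    """One round of graph message passing with mean aggregation."""
    def __init__(self, node_dim, hidden_dim=64):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(node_dim, hidden_dim), nn.ReLU())
        self.update = nn.Linear(node_dim + hidden_dim, hidden_dim)

    def forward(self, nodes, adj):
        # nodes: (n_nodes, node_dim); adj: (n_nodes, n_nodes) 0/1 neighborhood mask
        messages = self.msg(nodes)                     # (n_nodes, hidden_dim)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        agg = adj @ messages / deg                     # mean over local neighbors
        return torch.relu(self.update(torch.cat([nodes, agg], dim=-1)))

# Four agents/obstacles with a sensing-radius adjacency mask:
nodes = torch.randn(4, 10)
adj = torch.tensor([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]],
                   dtype=torch.float)
print(NeighborhoodAggregator(10)(nodes, adj).shape)   # torch.Size([4, 64])
```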

    Learning Transferable Cooperative Behavior in Multi-Agent Teams

    While multi-agent interactions can be naturally modeled as a graph, the environment has traditionally been considered a black box. To better utilize the inherent structure of our environment, we propose to create a shared agent-entity graph, where agents and environmental entities form vertices, and edges connect vertices that can communicate with each other. This allows agents to selectively attend to different parts of the environment, while also introducing invariance to the number of agents or entities present in the system, as well as permutation invariance. We present state-of-the-art results on coverage, formation, and line control tasks for multi-agent teams in a fully decentralized execution framework.
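
    A minimal sketch of agents attending over a shared agent-entity graph (dimensions are assumptions for illustration): each agent queries entity embeddings via attention, so the module is indifferent to the number and ordering of agents and entities.

```python
import torch
import torch.nn as nn

class EntityAttention(nn.Module):
    """Agents (queries) selectively attend to environmental entities (keys/values)."""
    def __init__(self, dim=32, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, agents, entities):
        # agents: (batch, n_agents, dim); entities: (batch, n_entities, dim)
        out, weights = self.attn(agents, entities, entities)
        return out, weights   # weights show which entities each agent attends to

attn = EntityAttention()
out, w = attn(torch.randn(1, 3, 32), torch.randn(1, 7, 32))
print(out.shape, w.shape)   # torch.Size([1, 3, 32]) torch.Size([1, 3, 7])
```

    The same module accepts any number of agents or entities without retraining, which is one way to obtain the permutation and population-size invariance described above.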

    Cooperative Games with Overlapping Coalitions

    In the usual models of cooperative game theory, the outcome of a coalition formation process is either the grand coalition or a coalition structure that consists of disjoint coalitions. However, in many domains where coalitions are associated with tasks, an agent may be involved in executing more than one task, and thus may distribute his resources among several coalitions. To tackle such scenarios, we introduce a model for cooperative games with overlapping coalitions--or overlapping coalition formation (OCF) games. We then explore the issue of stability in this setting. In particular, we introduce a notion of the core, which generalizes the corresponding notion in the traditional (non-overlapping) scenario. Then, under some quite general conditions, we characterize the elements of the core, and show that any element of the core maximizes the social welfare. We also introduce a concept of balancedness for overlapping coalitional games, and use it to characterize coalition structures that can be extended to elements of the core. Finally, we generalize the notion of convexity to our setting, and show that under some natural assumptions convex games have a non-empty core. Moreover, we introduce two alternative notions of stability in OCF that allow a wider range of deviations, and explore the relationships among the corresponding definitions of the core, as well as the classic (non-overlapping) core and the Aubin core. We illustrate the general properties of the three cores, and also study them from a computational perspective, thus obtaining additional insights into their fundamental structure.
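
    A minimal sketch of the overlapping representation (illustrative, not the paper's formal model): an overlapping coalition can be encoded as a vector of fractional resource contributions, one per agent, and an outcome as a list of such vectors in which each agent's total contribution is at most its full resource.

```python
# Each coalition is a tuple of per-agent resource fractions; an agent may
# split its (normalized) resource of 1.0 across several coalitions.

def is_feasible(coalitions, n_agents):
    """Each agent may spread at most its full resource across coalitions."""
    return all(sum(c[i] for c in coalitions) <= 1.0 + 1e-9
               for i in range(n_agents))

def social_welfare(coalitions, value):
    """Total value generated; `value` maps a contribution vector to a payoff."""
    return sum(value(c) for c in coalitions)

# Agent 1 splits its resource between two task coalitions:
coalitions = [(1.0, 0.5, 0.0), (0.0, 0.5, 1.0)]
assert is_feasible(coalitions, n_agents=3)
print(social_welfare(coalitions, value=lambda c: sum(c) ** 2))  # 2.25 + 2.25 = 4.5
```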

    Embodied Evolution in Collective Robotics: A Review

    This paper provides an overview of evolutionary robotics techniques applied to on-line distributed evolution for robot collectives -- namely, embodied evolution. It provides a definition of embodied evolution as well as a thorough description of the underlying concepts and mechanisms. The paper also presents a comprehensive summary of research published in the field since its inception (1999-2017), providing various perspectives to identify the major trends. In particular, we identify a shift from considering embodied evolution as a parallel search method within small robot collectives (fewer than 10 robots) to embodied evolution as an on-line distributed learning method for designing collective behaviours in swarm-like collectives. The paper concludes with a discussion of applications and open questions, providing a milestone for past research and an inspiration for future work. Comment: 23 pages, 1 figure, 1 table
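
    A minimal sketch of the canonical embodied-evolution loop (a toy abstraction, not any specific surveyed algorithm): each robot carries its own genome, broadcasts it to robots within communication range, and occasionally replaces its genome with a mutated copy of a higher-fitness genome it has received, so evolution runs on-line and fully on-board.

```python
import random

def embodied_evolution_step(robots, links, mutation_sigma=0.1):
    """One distributed step. robots: list of dicts with 'genome'
    (list of floats) and 'fitness' (float). links: pairs in comms range."""
    received = {i: [] for i in range(len(robots))}
    for i, j in links:                           # local genome broadcast
        received[j].append(robots[i])
        received[i].append(robots[j])
    for i, robot in enumerate(robots):
        if not received[i]:
            continue                             # isolated robot keeps evolving alone
        best = max(received[i], key=lambda r: r["fitness"])
        if best["fitness"] > robot["fitness"]:   # on-board selection + mutation
            robot["genome"] = [g + random.gauss(0, mutation_sigma)
                               for g in best["genome"]]

robots = [{"genome": [random.random()], "fitness": random.random()}
          for _ in range(4)]
embodied_evolution_step(robots, links=[(0, 1), (1, 2), (2, 3), (3, 0)])
```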

    Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

    Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities. Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?" By posing this counterfactual question, we can recognize state-action trajectories within sub-groups of entities that we may have encountered in another task and use what we learned in that task to inform our prediction in the current one. We then reconstruct a prediction of the full returns as a combination of factors considering these disjoint groups of entities and train this "randomly factorized" value function as an auxiliary objective for value-based multi-agent reinforcement learning. By doing so, our model can recognize and leverage similarities across tasks to improve learning efficiency in a multi-task setting. Our approach, Randomized Entity-wise Factorization for Imagined Learning (REFIL), outperforms all strong baselines by a significant margin in challenging multi-task StarCraft micromanagement settings. Comment: ICML 2021 Camera Ready
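
    A minimal sketch of the randomized factorization idea (illustrative; the masking and mixing details of the actual REFIL model differ): entities are randomly partitioned into two disjoint sub-groups, a utility is computed from each sub-group alone, and their combination serves as an auxiliary prediction of the full return.

```python
import torch

def random_entity_partition(n_entities, batch):
    """Boolean masks for two disjoint, randomly chosen entity sub-groups."""
    group_a = torch.rand(batch, n_entities) < 0.5
    return group_a, ~group_a

def factorized_q(q_fn, entities):
    # entities: (batch, n_entities, dim); q_fn(entities, mask) -> (batch,)
    mask_a, mask_b = random_entity_partition(entities.shape[1], entities.shape[0])
    return q_fn(entities, mask_a) + q_fn(entities, mask_b)  # auxiliary estimate

# Toy q_fn: masked entity features through a fixed projection, then summed.
proj = torch.nn.Linear(8, 1)
q_fn = lambda e, m: proj(e * m.unsqueeze(-1).float()).sum(dim=(1, 2))
aux_q = factorized_q(q_fn, torch.randn(16, 5, 8))
print(aux_q.shape)   # torch.Size([16])
```

    Training this randomly factorized estimate alongside the full value function is what lets the model reuse sub-group patterns it has seen in other tasks.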
