Learning to Coordinate Efficiently: A Model-based Approach
In common-interest stochastic games all players receive an identical payoff.
Players participating in such games must learn to coordinate with each other in
order to receive the highest possible value. A number of reinforcement learning
algorithms have been proposed for this problem, and some have been shown to
converge to good solutions in the limit. In this paper we show that using very
simple model-based algorithms, much better (i.e., polynomial) convergence rates
can be attained. Moreover, our model-based algorithms are guaranteed to
converge to the optimal value, unlike many of the existing algorithms.
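The model-based idea described in this abstract can be sketched minimally under illustrative assumptions (a 3x3 common-interest matrix game, Gaussian payoff noise, and a fixed sample budget, none of which are taken from the paper): both players estimate the shared payoff of every joint action from samples, then play the greedy joint action of the learned model under a common deterministic tie-break.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3 common-interest game: both players receive the same payoff.
true_payoff = np.array([[1.0, 0.0, 0.0],
                        [0.0, 0.5, 0.0],
                        [0.0, 0.0, 0.9]])

# Model-based approach: estimate the mean payoff of every joint action
# from noisy samples, then act greedily on the shared model.
counts = np.zeros_like(true_payoff)
sums = np.zeros_like(true_payoff)

# Explore each joint action a fixed number of times (payoffs are noisy).
for a in range(3):
    for b in range(3):
        for _ in range(50):
            r = true_payoff[a, b] + rng.normal(0, 0.1)
            counts[a, b] += 1
            sums[a, b] += r

model = sums / counts
# A shared deterministic tie-break (lexicographic argmax) lets both players
# commit to the same joint action without any communication.
a_star, b_star = np.unravel_index(np.argmax(model), model.shape)
print(a_star, b_star)  # greedy joint action under the learned model
```

Because both players build the same model from the same joint-action statistics and break ties identically, they coordinate on one optimal joint action rather than miscoordinating across several.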
Learning to coordinate in a complex and non-stationary world
We study analytically and by computer simulations a complex system of
adaptive agents with finite memory. Borrowing the framework of the Minority
Game and using the replica formalism we show the existence of an equilibrium
phase transition as a function of the ratio between the memory and
the learning rates of the agents. We show that, starting from a random
configuration, a dynamic phase transition also exists, which prevents the
system from reaching any Nash equilibria. Furthermore, in a non-stationary
environment, we show by numerical simulations that agents with infinite memory
play worse than others with less memory and that the dynamic transition
naturally arises independently of the initial conditions. Comment: 4 pages, 3 figures
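The Minority Game referenced above has a standard minimal form that can be simulated directly. The sketch below uses illustrative parameters (101 agents, 3 memory bits, 2 strategies per agent) and is not the replica-formalism analysis of the paper; it only reproduces the basic inductive dynamics.

```python
import numpy as np

rng = np.random.default_rng(1)

N, m, S, T = 101, 3, 2, 2000   # agents, memory bits, strategies/agent, rounds
P = 2 ** m                     # number of distinct m-bit histories

# Each strategy maps every possible history to an action in {-1, +1}.
strategies = rng.choice([-1, 1], size=(N, S, P))
scores = np.zeros((N, S))      # virtual scores of each agent's strategies
history = int(rng.integers(P)) # current m-bit history encoded as an integer

attendance = []
for _ in range(T):
    # Each agent plays its currently best-scoring strategy.
    best = scores.argmax(axis=1)
    actions = strategies[np.arange(N), best, history]
    A = actions.sum()          # never zero, since N is odd
    attendance.append(int(A))
    # The minority side wins: strategies predicting -sign(A) gain a point.
    scores += -np.sign(A) * strategies[:, :, history]
    # Shift the winning (minority) outcome into the history.
    history = (2 * history + int(A < 0)) % P

# Volatility sigma^2 / N is the usual measure of coordination quality:
# lower than 1 means the agents coordinate better than random play.
sigma2 = np.var(attendance) / N
print(round(sigma2, 2))
```

Sweeping `P / N` (the ratio of history space to agent number) in this toy setup traces out the well-known volatility curve whose minimum marks the phase transition discussed in the abstract.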
Learning to Coordinate with Anyone
In open multi-agent environments, the agents may encounter unexpected
teammates. Classical multi-agent learning approaches train agents that can only
coordinate with seen teammates. Recent studies attempted to generate diverse
teammates to enhance the generalizable coordination ability, but were
restricted by pre-defined teammates. In this work, our aim is to train agents
with strong coordination ability by generating teammates that fully cover the
teammate policy space, so that agents can coordinate with any teammates. Since
the teammate policy space is too large to enumerate, we seek only dissimilar
teammates that are incompatible with controllable agents, which greatly reduces
the number of teammates that need to be trained with. However, it is hard to
determine the number of such incompatible teammates beforehand. We therefore
introduce a continual multi-agent learning process, in which the agent learns
to coordinate with different teammates until no more incompatible teammates can
be found. The above idea is implemented in the proposed Macop (Multi-agent
compatible policy learning) algorithm. We conduct experiments in 8 scenarios
from 4 environments that have distinct coordination patterns. Experiments show
that Macop generates training teammates with much lower compatibility than
previous methods. As a result, in all scenarios Macop achieves the best overall
coordination ability while never being significantly worse than the baselines,
demonstrating strong generalization.
Stabilize to Act: Learning to Coordinate for Bimanual Manipulation
Key to rich, dexterous manipulation in the real world is the ability to
coordinate control across two hands. However, while the promise afforded by
bimanual robotic systems is immense, constructing control policies for dual arm
autonomous systems brings inherent difficulties. One such difficulty is the
high-dimensionality of the bimanual action space, which adds complexity to both
model-based and data-driven methods. We counteract this challenge by drawing
inspiration from humans to propose a novel role assignment framework: a
stabilizing arm holds an object in place to simplify the environment while an
acting arm executes the task. We instantiate this framework with BimanUal
Dexterity from Stabilization (BUDS), which uses a learned restabilizing
classifier to alternate between updating a learned stabilization position to
keep the environment unchanged, and accomplishing the task with an acting
policy learned from demonstrations. We evaluate BUDS on four bimanual tasks of
varying complexities on real-world robots, such as zipping jackets and cutting
vegetables. Given only 20 demonstrations, BUDS achieves 76.9% task success
across our task suite, and generalizes to out-of-distribution objects within a
class with a 52.7% success rate. Owing to the precision these complex tasks
require, BUDS is 56.0% more successful than an unstructured baseline that
instead learns a BC stabilizing policy. Supplementary material and videos
can be found at https://sites.google.com/view/stabilizetoact. Comment: Conference on Robot Learning, 202
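The stabilize/act alternation that BUDS describes can be illustrated with a toy 1-D control loop. Everything here is a hypothetical stand-in, not the paper's method: the restabilizing classifier becomes a drift threshold, the stabilizing arm a reset to a fixed pose, and the acting policy a step of task progress that perturbs the object.

```python
import numpy as np

rng = np.random.default_rng(3)

object_pos = 0.0       # 1-D object pose (stand-in for the real object state)
task_progress = 0.0    # accumulated task completion
STABLE_TOL = 0.05      # classifier threshold: "environment unchanged"

def restabilize_needed(pos, target=0.0, tol=STABLE_TOL):
    # Toy restabilizing classifier: fire when the object has drifted.
    return abs(pos - target) > tol

for step in range(200):
    if restabilize_needed(object_pos):
        # Stabilizing arm: pin the object back at the stabilization pose,
        # simplifying the environment for the acting policy.
        object_pos = 0.0
    else:
        # Acting arm: make task progress, which also perturbs the object.
        task_progress += 0.01
        object_pos += rng.normal(0, 0.03)

print(round(task_progress, 2))
```

The point of the structure is visible even in this sketch: the acting policy only ever runs against a near-stationary environment, which is what makes learning it from few demonstrations tractable.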
Learning to coordinate in sparse asymmetric multiagent systems
Multiagent learning offers a rich framework to address challenging real-world problems such as remote exploration and healthcare coordination, which require autonomous agents to express elaborate interactions. To be effective in such systems, agents must collectively reason about and pursue high-level, long-term, and possibly nebulous objectives while adapting their strategy to changing environments, inter-agent relationships, and team dynamics.
This work introduces six contributions that address this multifaceted problem through the lens of two distinct perspectives: reward structures for high-level objectives that allow agents to consider behaviors before pursuing them, and diversity structures that incentivize asymmetric agents (agents with distinct capabilities and egocentric objectives) to discover complementary specializations required for robust teamwork. The first contribution, Asymmetric D++, distills sparse team feedback into dense informative rewards by encouraging agents to create asymmetric counterfactuals based on their likelihood to cooperate. The second contribution introduces an uncertainty-aware reward approximation that enables the application of Asymmetric D++ for exploration and learning in sparse reward settings. The third contribution, Behavior Refinement, presents a hierarchical framework that shifts focus from optimizing a single behavior to learning a repertoire of diverse behaviors required to complete variegated tasks. Behavior Refinement allows systematic exploration of the policy space via a combination of diversity search and team-objective maximization. The fourth contribution introduces the Island Model, a computational framework that builds on Behavior Refinement for informed behavior space exploration and team balancing for asymmetric agents. The final two contributions expand upon the Island Model to develop an asynchronous learning framework that allows asymmetric agents to explore diverse environment-agnostic inter-agent relationships to balance multiple potentially conflicting objectives.
Together, these contributions enable asymmetric agents to learn diverse specializations, express complex trade-offs, and discover the robust inter-agent relationships required to solve challenging coordination problems. Additionally, the techniques introduced in this work aid in investigating the rich tapestry of agent synergies that evolve in response to changes in the environment and team objectives. Keywords: Multiagent Reinforcement Learning, Multiagent Coordination, Asymmetric Multiagent Systems, Multiagent Evolutionary Learning
Communicative Bottlenecks Lead to Maximal Information Transfer
This paper presents a new analytic and numerical analysis of signalling games that give rise to informational bottlenecks (that is, signalling games with more state/act pairs than available signals to communicate information about the world). I show via simulation that agents learning to coordinate tend to favour partitions of nature which provide maximal information transfer. This is true despite the fact that nothing in an initial analysis of the stability properties of the underlying signalling game suggests that this should be the case. As a first pass at explaining this, I note that the underlying structure of our model favours maximal information transfer in virtue of the simple combinatorial properties of how the agents might partition nature into kinds. However, I suggest that this does not perfectly capture the empirical results; thus, several open questions remain.
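A bottlenecked signalling game of this kind can be simulated with simple Roth-Erev reinforcement, a standard learning rule for signalling games (the paper does not specify its learning dynamics, so this is only an illustrative setup: 4 states, 2 signals, success-based urn reinforcement).

```python
import numpy as np

rng = np.random.default_rng(2)

n_states, n_signals = 4, 2        # bottleneck: fewer signals than states
n_acts = n_states                 # exactly one correct act per state

# Roth-Erev reinforcement: choose proportionally to accumulated rewards.
sender = np.ones((n_states, n_signals))   # sender urns
receiver = np.ones((n_signals, n_acts))   # receiver urns

payoffs = []
for t in range(20000):
    s = int(rng.integers(n_states))
    sig = rng.choice(n_signals, p=sender[s] / sender[s].sum())
    act = rng.choice(n_acts, p=receiver[sig] / receiver[sig].sum())
    r = 1.0 if act == s else 0.0  # success only on the matching act
    sender[s, sig] += r
    receiver[sig, act] += r
    payoffs.append(r)

# With 2 signals for 4 uniformly distributed states, expected success is
# capped at 0.5; random play yields 0.25. Partitions of the states into
# equal-sized kinds realize maximal information transfer.
print(round(float(np.mean(payoffs[-2000:])), 2))
```

Inspecting the learned `sender` urns after a run shows which partition of nature the agents settled on, which is the quantity the abstract's simulations track.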
Measuring collaborative emergent behavior in multi-agent reinforcement learning
Multi-agent reinforcement learning (RL) has important implications for the
future of human-agent teaming. We show that improved performance with
multi-agent RL is not a guarantee of the collaborative behavior thought to be
important for solving multi-agent tasks. To address this, we present a novel
approach for quantitatively assessing collaboration in continuous spatial tasks
with multi-agent RL. Such a metric is useful for measuring collaboration
between computational agents and may serve as a training signal for
collaboration in future RL paradigms involving humans. Comment: 1st International Conference on Human Systems Engineering and Design,
6 pages, 2 figures, 1 table
- …