Q-CP: Learning Action Values for Cooperative Planning
Research on multi-robot systems has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainty, and large state dimensionality (e.g., hyper-redundant robots and groups of robots). To alleviate this problem, we present Q-CP, a cooperative model-based reinforcement learning algorithm that exploits action values to both (1) guide the exploration of the state space and (2) generate effective policies. Specifically, we exploit Q-learning to attack the curse of dimensionality in the iterations of a Monte-Carlo Tree Search. We implement and evaluate Q-CP on different stochastic cooperative (general-sum) games: (1) a simple cooperative navigation problem among 3 robots, (2) a cooperation scenario between a pair of KUKA YouBots performing hand-overs, and (3) a coordination task between two mobile robots entering a door. The obtained results show the effectiveness of Q-CP in the chosen applications, where action values drive the exploration and reduce the computational demand of the planning process while achieving good performance.
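The central mechanism this abstract describes, learned action values steering the node selection of a Monte-Carlo Tree Search, can be sketched as a UCT-style rule in which a Q-function replaces the usual zero-initialized value estimate. This is an illustrative sketch under stated assumptions, not the paper's implementation: `Q_TABLE`, the function names, and the exploration constant are all hypothetical.

```python
import math

# Hypothetical table of learned action values; in the Q-CP setting these
# would come from Q-learning and bias the tree search toward promising actions.
Q_TABLE = {}  # (state, action) -> estimated value

def q_value(state, action):
    """Return the learned Q-value, defaulting to 0.0 for unseen pairs."""
    return Q_TABLE.get((state, action), 0.0)

def select_action(state, actions, visits, child_visits, c=1.4):
    """UCT-style selection: the learned Q-value serves as the value estimate,
    so exploration concentrates on actions the Q-function already rates well,
    reducing the effective branching the search must cover."""
    def score(action):
        n = child_visits.get((state, action), 0)
        exploration_bonus = c * math.sqrt(math.log(visits + 1) / (n + 1))
        return q_value(state, action) + exploration_bonus
    return max(actions, key=score)
```

In a full search, `child_visits` would be updated as simulations complete; here the point is only that the argmax is shifted by the learned values rather than starting from a uniform prior.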
Tasks for Agent-Based Negotiation Teams: Analysis, Review, and Challenges
An agent-based negotiation team is a group of interdependent agents that join together as a single negotiation party due to their shared interests in the negotiation at hand. The reasons to employ an agent-based negotiation team may vary: (i) greater computation and parallelization capabilities, (ii) the ability to unite agents with different expertise and skills whose joint work makes it possible to tackle complex negotiation domains, (iii) the necessity to represent different stakeholders or different preferences in the same party (e.g., organizations, countries, and married couples). The topic of agent-based negotiation teams has recently been introduced in multi-agent research. Therefore, it is necessary to identify good practices, challenges, and related research that may help in advancing the state of the art in agent-based negotiation teams. For that reason, in this article we review the tasks to be carried out by agent-based negotiation teams. Each task is analyzed and related to current advances in different research areas. The analysis aims to identify special challenges that may arise due to the particularities of agent-based negotiation teams.
Comment: Engineering Applications of Artificial Intelligence, 201
Towards a Better Understanding of Learning with Multiagent Teams
While it has long been recognized that a team of individual learning agents can be greater than the sum of its parts, recent work has shown that larger teams are not necessarily more effective than smaller ones. In this paper, we study why and under which conditions certain team structures promote effective learning for a population of individual learning agents. We show that, depending on the environment, some team structures help agents learn to specialize into specific roles, resulting in more favorable global results. However, large teams create credit assignment challenges that reduce coordination, leading to large teams performing poorly compared to smaller ones. We support our conclusions with both theoretical analysis and empirical results.
Comment: 15 pages, 11 figures, published at the International Joint Conference on Artificial Intelligence (IJCAI) in 202
Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions
This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. Previous methods which aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots and long planning horizons exist. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment.
Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017)
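The setting this abstract describes, learning only from trajectory data of observations, macro-actions, and rewards with no model or simulator, can be illustrated with a minimal first-visit-style Monte-Carlo value estimate over (observation, macro-action) pairs. This is not the iSEM algorithm itself, only a sketch of the data-driven setting; the trajectory layout is an assumption for illustration.

```python
from collections import defaultdict

def mc_macro_action_values(trajectories, gamma=0.95):
    """Estimate values of (observation, macro-action) pairs from logged data.

    trajectories: list of episodes, each a list of
    (observation, macro_action, reward) tuples, mirroring the kind of
    trajectory data the abstract says is the only learning input.
    """
    returns = defaultdict(list)
    for episode in trajectories:
        g = 0.0
        # Walk the episode backwards, accumulating the discounted return
        # earned from each step onward.
        for obs, ma, reward in reversed(episode):
            g = reward + gamma * g
            returns[(obs, ma)].append(g)
    # Average the sampled returns for each pair.
    return {key: sum(gs) / len(gs) for key, gs in returns.items()}
```

A real Dec-POMDP learner must additionally handle decentralization and asynchronous macro-action termination, which is where the paper's EM machinery comes in; this sketch only shows that value information is recoverable from logged trajectories alone.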
Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams
Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of self-interested decision-making frameworks. Agents engaged in individual decision making in multiagent settings face the task of having to reason about other agents' actions, which may in turn involve reasoning about others. An established approximation that operationalizes this approach is to bound the infinite nesting from below by introducing level 0 models. For the purposes of this study, individual, self-interested decision making in multiagent settings is modeled using interactive dynamic influence diagrams (I-DIDs). These are graphical models with the benefit that they naturally offer a factored representation of the problem, allowing agents to ascribe dynamic models to others and reason about them. We demonstrate that an implication of bounded, finitely-nested reasoning by a self-interested agent is that, when the agent is part of a team, we may not obtain optimal team solutions in cooperative settings. We address this limitation by including models at level 0 whose solutions involve reinforcement learning. We show how the learning is integrated into planning in the context of I-DIDs. This facilitates optimal teammate behavior, and we demonstrate its applicability to ad hoc teamwork on several problem domains and configurations.
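The fix this abstract proposes, replacing the fixed level-0 model at the bottom of the reasoning hierarchy with a reinforcement learner, can be sketched in a simplified form: the level-0 teammate model is a tabular Q-learner, and the level-1 agent best-responds to its predicted action. This is a toy sketch under stated assumptions, not the I-DID machinery; the class, the payoff table, and all names are illustrative.

```python
import random
from collections import defaultdict

class Level0QLearner:
    """Tabular Q-learner standing in for the level-0 model of a teammate,
    in place of a fixed (e.g., uniformly random) level-0 policy."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration rate

    def act(self, state):
        """Epsilon-greedy action choice."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error

def best_response(payoff, predicted_teammate_action, my_actions):
    """Level-1 agent: best-respond to the action the learned level-0
    model predicts for the teammate. payoff maps (my_action, their_action)
    to this agent's utility (an illustrative stand-in for I-DID solving)."""
    return max(my_actions, key=lambda a: payoff[(a, predicted_teammate_action)])
```

The point of the sketch is the interface: because the level-0 model improves with experience rather than staying fixed, the level-1 best response can converge toward behavior consistent with an optimal teammate, which is the limitation of purely finitely-nested reasoning that the paper addresses.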