Q-CP: Learning Action Values for Cooperative Planning
Research on multi-robot systems has demonstrated promising results in manifold applications and domains. Still, efficiently learning effective robot behaviors is very difficult, due to unstructured scenarios, high uncertainties, and large state dimensionality (e.g., hyper-redundant robots and groups of robots). To alleviate this problem, we present Q-CP, a cooperative model-based reinforcement learning algorithm that exploits action values to both (1) guide the exploration of the state space and (2) generate effective policies. Specifically, we exploit Q-learning to attack the curse of dimensionality in the iterations of a Monte-Carlo Tree Search. We implement and evaluate Q-CP on different stochastic cooperative (general-sum) games: (1) a simple cooperative navigation problem among 3 robots, (2) a cooperation scenario between a pair of KUKA YouBots performing hand-overs, and (3) a coordination task between two mobile robots entering a door. The obtained results show the effectiveness of Q-CP in the chosen applications, where action values drive the exploration and reduce the computational demand of the planning process while achieving good performance.
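The abstract's core idea, using learned action values to guide the selection step of Monte-Carlo Tree Search, can be sketched as follows. This is an illustrative reconstruction, not the paper's actual algorithm: the class and function names, the Q-table representation, and the exact scoring rule (a UCT bonus plus a learned-value prior) are all assumptions.

```python
import math

# Illustrative sketch: bias MCTS action selection with learned Q-values.
# All names (Node, select_action, q_table) are hypothetical, not from Q-CP.

class Node:
    def __init__(self, state):
        self.state = state
        self.visits = 0
        self.value = 0.0       # accumulated rollout return
        self.children = {}     # action -> Node

def select_action(node, q_table, actions, c=1.4):
    """UCT-style selection seeded with learned Q-values as priors,
    so exploration is drawn toward actions the Q-function rates highly."""
    best, best_score = None, -float("inf")
    for a in actions:
        child = node.children.get(a)
        prior = q_table.get((node.state, a), 0.0)
        if child is None or child.visits == 0:
            # Unvisited action: fall back on the learned value plus a bonus.
            score = prior + c
        else:
            score = (child.value / child.visits
                     + prior
                     + c * math.sqrt(math.log(node.visits + 1) / child.visits))
        if score > best_score:
            best, best_score = a, score
    return best
```

Under this sketch, an action with a high learned Q-value is tried before an equally unexplored alternative, which is one way action values can cut the effective branching factor of the search.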
Influence of Team Interactions on Multi-Robot Cooperation: A Relational Network Perspective
Relational networks within a team play a critical role in the performance of
many real-world multi-robot systems. To successfully accomplish tasks that
require cooperation and coordination, different agents (e.g., robots)
necessitate different priorities based on their positioning within the team.
Yet, many of the existing multi-robot cooperation algorithms regard agents as
interchangeable and lack a mechanism to guide the type of cooperation strategy
the agents should exhibit. To account for the team structure in cooperative
tasks, we propose a novel algorithm that uses a relational network comprising
inter-agent relationships to prioritize certain agents over others. Through
appropriate design of the team's relational network, we can guide the
cooperation strategy, resulting in the emergence of new behaviors that
accomplish the specified task. We conducted six experiments in a multi-robot
setting with a cooperative task. Our results demonstrate that the proposed
method can effectively influence the type of solution that the algorithm
converges to by specifying the relationships between the agents, making it a
promising approach for tasks that require cooperation among agents with a
specified team structure.
Comment: Accepted to Multi-Robot and Multi-Agent Systems (IEEE MRS 2023).
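One toy illustration of how a relational network could induce agent priorities (this is a sketch under assumed semantics, not the paper's method): encode "defers-to" edges between agents and derive each agent's rank from its depth in that directed graph. The agent names and the ranking rule are hypothetical.

```python
# Toy sketch: derive priority ranks from a directed relational network.
# defers_to[a] is the set of agents that a yields priority to; the network
# is assumed acyclic. Rank 0 = highest priority (defers to no one).

def priority_ranks(defers_to):
    agents = set(defers_to)
    for superiors in defers_to.values():
        agents |= superiors
    # Agents with no superiors start at rank 0.
    rank = {a: 0 for a in agents if not defers_to.get(a)}
    changed = True
    while changed:  # propagate ranks until a fixpoint is reached
        changed = False
        for a in agents:
            superiors = defers_to.get(a, set())
            if superiors and all(s in rank for s in superiors):
                r = 1 + max(rank[s] for s in superiors)
                if rank.get(a) != r:
                    rank[a] = r
                    changed = True
    return rank
```

For example, with robot_2 deferring to robot_1, and robot_3 deferring to both, the ranks come out 0, 1, 2, giving the kind of prioritization the abstract describes.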
Deep Reinforcement Learning for Swarm Systems
Recently, deep reinforcement learning (RL) methods have been applied
successfully to multi-agent scenarios. Typically, these methods rely on a
concatenation of agent states to represent the information content required for
decentralized decision making. However, concatenation scales poorly to swarm
systems with a large number of homogeneous agents as it does not exploit the
fundamental properties inherent to these systems: (i) the agents in the swarm
are interchangeable and (ii) the exact number of agents in the swarm is
irrelevant. Therefore, we propose a new state representation for deep
multi-agent RL based on mean embeddings of distributions. We treat the agents
as samples of a distribution and use the empirical mean embedding as input for
a decentralized policy. We define different feature spaces of the mean
embedding using histograms, radial basis functions and a neural network learned
end-to-end. We evaluate the representation on two well known problems from the
swarm literature (rendezvous and pursuit evasion), in a globally and locally
observable setup. For the local setup we furthermore introduce simple
communication protocols. Of all approaches, the mean embedding representation
using neural network features enables the richest information exchange between
neighboring agents facilitating the development of more complex collective
strategies.
Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20).
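The mean-embedding representation described above can be sketched concretely: map each observed neighbor's state through a feature function and average the results, yielding a fixed-size, permutation-invariant input regardless of swarm size. This is an illustrative sketch, not the paper's code; the RBF feature map, the centers, and the bandwidth are arbitrary choices standing in for the histogram / RBF / learned-network variants the abstract mentions.

```python
import numpy as np

# Illustrative sketch of a mean embedding over neighbor states.
# Feature choice (RBF), centers, and bandwidth are assumptions.

def rbf_features(x, centers, bandwidth=1.0):
    """Radial-basis-function feature map for a single agent state."""
    diffs = centers - x  # (num_centers, state_dim)
    return np.exp(-np.sum(diffs**2, axis=1) / (2 * bandwidth**2))

def mean_embedding(neighbor_states, centers):
    """Empirical mean of per-neighbor features: the output size depends
    only on the number of centers, not on how many neighbors there are,
    and is invariant to the order of the neighbors."""
    feats = np.stack([rbf_features(s, centers) for s in neighbor_states])
    return feats.mean(axis=0)
```

Because the embedding is an average, it directly encodes the two swarm properties the abstract highlights: agents are interchangeable (permutation invariance) and the exact agent count does not change the input dimensionality.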