Efficient Distributed Task Allocation based on Autonomous Organizational Formation using Cooperative and Reciprocal Relationships
Waseda University diploma number: Shin 7797, Waseda Univ.
Decentralised Coordination in RoboCup Rescue
Emergency responders are faced with a number of significant challenges when managing major disasters. First, the number of rescue tasks posed is usually larger than the number of responders (or agents) and the resources available to them. Second, each task is likely to require a different level of effort in order to be completed by its deadline. Third, new tasks may continually appear or disappear from the environment, thus requiring the responders to quickly recompute their allocation of resources. Fourth, forming teams or coalitions of multiple agents from different agencies is vital since no single agency will have all the resources needed to save victims, unblock roads, and extinguish the fires which might erupt in the disaster space. Given this, coalitions have to be efficiently selected and scheduled to work across the disaster space so as to maximise the number of lives and the portion of the infrastructure saved. In particular, it is important that the selection of such coalitions should be performed in a decentralised fashion in order to avoid a single point of failure in the system. Moreover, it is critical that responders communicate only locally given they are likely to have limited battery power or minimal access to long range communication devices. Against this background, we provide a novel decentralised solution to the coalition formation process that pervades disaster management. More specifically, we model the emergency management scenario defined in the RoboCup Rescue disaster simulation platform as a Coalition Formation with Spatial and Temporal constraints (CFST) problem where agents form coalitions in order to complete tasks, each with different demands. In order to design a decentralised algorithm for CFST we formulate it as a Distributed Constraint Optimisation problem and show how to solve it using the state-of-the-art Max-Sum algorithm that provides a completely decentralised message-passing solution.
We then provide a novel algorithm (F-Max-Sum) that avoids sending redundant messages and efficiently adapts to changes in the environment. In empirical evaluations, our algorithm is shown to generate better solutions than other decentralised algorithms used for this problem.
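To make the CFST setting in the abstract above concrete, here is a minimal Python sketch of a toy instance. It is a centralised brute-force baseline, not Max-Sum or F-Max-Sum, and everything in it is an assumption made for illustration: the one-dimensional positions, the `speed` and `rate` parameters, and the scoring rule (a task counts as completed when the work its coalition delivers before the deadline meets the task's workload).

```python
from itertools import product

def completed_tasks(assignment, agents, tasks):
    """Count tasks whose workload is met by their coalition before the deadline.

    agents: {agent_id: (position, speed, work_rate)}   (hypothetical model)
    tasks:  {task_id: (position, workload, deadline)}  (hypothetical model)
    assignment: {agent_id: task_id}
    """
    done = 0
    for t_id, (t_pos, workload, deadline) in tasks.items():
        work = 0.0
        for a_id, (a_pos, speed, rate) in agents.items():
            if assignment[a_id] != t_id:
                continue
            travel = abs(a_pos - t_pos) / speed          # spatial constraint
            work += rate * max(0.0, deadline - travel)   # temporal constraint
        if work >= workload:
            done += 1
    return done

def best_assignment(agents, tasks):
    """Enumerate every agent-to-task assignment and keep the best one."""
    agent_ids = sorted(agents)
    task_ids = sorted(tasks)
    best, best_score = None, -1
    for choice in product(task_ids, repeat=len(agent_ids)):
        assignment = dict(zip(agent_ids, choice))
        score = completed_tasks(assignment, agents, tasks)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

# Toy instance: t1 needs a coalition of two agents, t2 needs one.
agents = {"a1": (0, 1.0, 1.0), "a2": (10, 1.0, 1.0), "a3": (5, 1.0, 1.0)}
tasks = {"t1": (0, 12.0, 10.0), "t2": (10, 8.0, 10.0)}
assignment, score = best_assignment(agents, tasks)
```

The enumeration grows exponentially with the number of agents, and this cost, together with the single point of failure of a central solver, is what motivates a decentralised message-passing formulation.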
Efficient Inter-Team Task Allocation in RoboCup Rescue
The coordination of cooperative agents involved in rescue missions is an important open research problem. We consider the RoboCup Rescue Simulation (RCS) challenge, where teams of agents perform urban rescue operations. Previous approaches typically cast such problems as separate single-team allocation problems. However, different teams have complementary capabilities, and therefore some kind of inter-team coordination is desirable for high-quality solutions. Our contribution considers inter-team coordination using Max-Sum (MS). We present a methodology that allows teams in RCS to efficiently assess joint allocations. Furthermore, we show how to reduce the algorithm's computational complexity from exponential to polynomial time by using Tractable High Order Potentials (THOPs). To the best of our knowledge, this is the first time it has been shown that MS can be run in polynomial time in the RCS challenge without relaxing the problem. Experiments with fire brigades and police agents show that teams employing inter-team coordination are significantly more effective than uncoordinated teams. Moreover, the evaluation shows that our BMS and THOPs method achieves up to 2.5 times better results than other state-of-the-art methods. Copyright © 2015, International Foundation for Autonomous Agents and Multiagent Systems. Work funded by projects DAMAS (TIN2013-45732-C4-4-P), COR (TIN2012-38876-C02-01), the Generalitat of Catalunya grant 2009-SGR-1434, and the Ministry of Economy and Competitivity grant BES-2010-030466. Peer reviewed.
Shapley Q-value: A Local Reward Approach to Solve Global Reward Games
Cooperative games are a critical research area in multi-agent reinforcement
learning (MARL). The global reward game is a subclass of cooperative games in which
all agents aim to maximize the global reward. Credit assignment is an important
problem studied in the global reward game. Most previous works adopt a
non-cooperative game-theoretic framework with the shared reward
approach, i.e., each agent is directly assigned the shared global reward.
This, however, may give each agent an inaccurate reward for its contribution to
the group, which could cause inefficient learning. To deal with this problem,
we i) introduce a cooperative-game theoretical framework called extended convex
game (ECG) that is a superset of global reward game, and ii) propose a local
reward approach called Shapley Q-value. Shapley Q-value is able to distribute
the global reward, reflecting each agent's own contribution in contrast to the
shared reward approach. Moreover, we derive an MARL algorithm called Shapley
Q-value deep deterministic policy gradient (SQDDPG), using Shapley Q-value as
the critic for each agent. We evaluate SQDDPG on Cooperative Navigation,
Prey-and-Predator and Traffic Junction, compared with the state-of-the-art
algorithms, e.g., MADDPG, COMA, Independent DDPG and Independent A2C. In the
experiments, SQDDPG shows a significant improvement in the convergence rate.
Finally, we plot Shapley Q-value and validate the property of fair credit
assignment.
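The credit-assignment idea in the abstract above builds on the classical Shapley value, which the following Python sketch computes exactly for a toy game. This is not the SQDDPG critic from the paper. The function name `shapley_values`, the example weights, and the characteristic function `toy_value` are all assumptions chosen for illustration.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution,
    averaged over every order in which the players could join."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}

# Hypothetical convex game: a coalition earns the square of its total weight.
weights = {"a": 1, "b": 2}

def toy_value(coalition):
    return sum(weights[p] for p in coalition) ** 2

credit = shapley_values(list(weights), toy_value)
```

In this toy game the grand coalition earns 9. A shared reward would hand the same signal to both agents, whereas the Shapley split assigns 3 to `a` and 6 to `b`, and the two shares sum to the global reward. Exact enumeration is factorial in the number of agents, so it is only practical for very small games.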
LSAR: Multi-UAV Collaboration for Search and Rescue Missions
In this paper, we consider the use of a team of multiple unmanned aerial vehicles (UAVs) to
accomplish a search and rescue (SAR) mission in the minimum time possible while saving the maximum
number of people. A novel technique for the SAR problem is proposed and referred to as the layered search
and rescue (LSAR) algorithm. The novelty of LSAR involves simulating real disasters to distribute SAR
tasks among UAVs. The performance of LSAR is compared, in terms of percentage of rescued survivors
and rescue and execution times, with the max-sum, auction-based, and locust-inspired approaches for
multi-UAV task allocation (LIAM) and opportunistic task allocation (OTA) schemes. The simulation results show
that the UAVs running the LSAR algorithm on average rescue approximately 74% of the survivors, which
is 8% higher than the next best algorithm (LIAM). Moreover, this percentage increases with the number
of UAVs, almost linearly with the least slope, which means more scalability and coverage is obtained
in comparison to other algorithms. In addition, the empirical cumulative distribution function of LSAR
results shows that the percentages of rescued survivors clustered around the 78% to 100% range under an
exponential curve, meaning most results are above 50%. In comparison, all the other algorithms have almost
equal distributions of their percentage of rescued survivor results. Furthermore, because the LSAR algorithm
focuses on the center of the disaster, it finds more survivors and rescues them faster than the other algorithms,
with an average of 55% to 77%. Moreover, most registered times to rescue survivors by LSAR are bounded
by a time of 04:50:02 with 95% confidence for a one-month mission time.
Task Allocation into a Foraging Task with a Series of Subtasks in Swarm Robotic System
This is the final version. Available from IEEE via the DOI in this record. In swarm robotic systems, task allocation is a challenging problem aiming to decompose complex tasks into a series of subtasks. We propose a self-organizing method to allocate a swarm of robots to perform a foraging task consisting of sequentially dependent subtasks. The method regulates the proportion of robots to meet the task demands for given tasks. Our proposed method is based on the response threshold model, mapping the intensity of task demands to the probability of responding to candidate tasks depending on the response threshold. Each robot is suitable for all tasks, but some robots have a higher probability of taking certain tasks and a lower probability of taking others. In our task allocation method, each robot updates its response threshold depending on the associated task demand as well as the number of neighbouring robots performing the task. It relies neither on a centralized mechanism nor on information exchange amongst robots. Repetitive and continuous task allocations lead to the desired task distribution at a swarm level. We also analyzed the mathematical convergence of the task distribution among a swarm of robots. We demonstrate that the method is effective and robust for a foraging task under various conditions on the number of robots, the number of tasks and the size of the arena. Our simulation results may support the hypothesis that social insects use a task allocation method to handle the foraging task required for a colony’s survival.
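The response threshold model mentioned in the abstract above can be sketched in Python as follows. The probability function uses the sigmoid form common in the response-threshold literature, s^n / (s^n + θ^n); that the paper uses exactly this form is an assumption. The `update_threshold` rule, its parameters `learn` and `crowd`, and the clipping bounds are hypothetical, chosen only to illustrate a threshold that reacts to task demand and to the number of neighbours already on the task.

```python
def response_probability(stimulus, threshold, steepness=2):
    """Probability that a robot takes up a task, given the task's demand
    intensity (stimulus) and the robot's own response threshold."""
    if stimulus <= 0:
        return 0.0
    s = stimulus ** steepness
    return s / (s + threshold ** steepness)

def update_threshold(threshold, demand, neighbours_on_task,
                     learn=0.1, crowd=0.05, lo=0.01, hi=1.0):
    """Hypothetical update rule: high demand lowers the threshold (the robot
    becomes more willing), many neighbours on the task raise it."""
    new = threshold - learn * demand + crowd * neighbours_on_task
    return min(hi, max(lo, new))

# A robot whose threshold equals the stimulus responds half of the time.
p_equal = response_probability(1.0, 1.0)
# High unmet demand with no neighbours on the task makes the robot more responsive.
theta = update_threshold(0.5, demand=1.0, neighbours_on_task=0)
p_after = response_probability(0.5, theta)
```

Both functions use only quantities a robot can sense locally, which matches the abstract's claim that the method needs neither a central controller nor information exchange between robots.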