21 research outputs found

    Decentralized Coordination in RoboCup Rescue

    Decentralised Coordination in RoboCup Rescue

    Emergency responders are faced with a number of significant challenges when managing major disasters. First, the number of rescue tasks posed is usually larger than the number of responders (or agents) and the resources available to them. Second, each task is likely to require a different level of effort in order to be completed by its deadline. Third, new tasks may continually appear or disappear from the environment, thus requiring the responders to quickly recompute their allocation of resources. Fourth, forming teams or coalitions of multiple agents from different agencies is vital since no single agency will have all the resources needed to save victims, unblock roads, and extinguish the fires which might erupt in the disaster space. Given this, coalitions have to be efficiently selected and scheduled to work across the disaster space so as to maximise the number of lives and the portion of the infrastructure saved. In particular, it is important that the selection of such coalitions should be performed in a decentralised fashion in order to avoid a single point of failure in the system. Moreover, it is critical that responders communicate only locally given they are likely to have limited battery power or minimal access to long range communication devices. Against this background, we provide a novel decentralised solution to the coalition formation process that pervades disaster management. More specifically, we model the emergency management scenario defined in the RoboCup Rescue disaster simulation platform as a Coalition Formation with Spatial and Temporal constraints (CFST) problem where agents form coalitions in order to complete tasks, each with different demands. In order to design a decentralised algorithm for CFST we formulate it as a Distributed Constraint Optimisation problem and show how to solve it using the state-of-the-art Max-Sum algorithm that provides a completely decentralised message-passing solution. We then provide a novel algorithm (F-Max-Sum) that avoids sending redundant messages and efficiently adapts to changes in the environment. In empirical evaluations, our algorithm is shown to generate better solutions than other decentralised algorithms used for this problem.

    Efficient Inter-Team Task Allocation in RoboCup Rescue

    The coordination of cooperative agents involved in rescue missions is an important open research problem. We consider the RoboCup Rescue Simulation (RCS) challenge, where teams of agents perform urban rescue operations. Previous approaches typically cast such problems as separate single-team allocation problems. However, different teams have complementary capabilities, and therefore some kind of inter-team coordination is desirable for high-quality solutions. Our contribution considers inter-team coordination using Max-Sum (MS). We present a methodology that allows teams in RCS to efficiently assess joint allocations. Furthermore, we show how to reduce the algorithm's computational complexity from exponential to polynomial time by using Tractable High Order Potentials (THOPs). To the best of our knowledge, this is the first time it has been shown that MS can be run in polynomial time in the RCS challenge without relaxing the problem. Experiments with fire brigades and police agents show that teams employing inter-team coordination are significantly more effective than uncoordinated teams. Moreover, the evaluation shows that our BMS and THOPs method achieves up to 2.5 times better results than other state-of-the-art methods. Copyright © 2015, International Foundation for Autonomous Agents and Multiagent Systems. Work funded by projects DAMAS (TIN2013-45732-C4-4-P), COR (TIN2012-38876-C02-01), the Generalitat of Catalunya grant 2009-SGR-1434, and the Ministry of Economy and Competitivity grant BES-2010-030466. Peer reviewed.

    Shapley Q-value: A Local Reward Approach to Solve Global Reward Games

    Cooperative games are a critical research area in multi-agent reinforcement learning (MARL). The global reward game is a subclass of cooperative games in which all agents aim to maximize the global reward. Credit assignment is an important problem studied in the global reward game. Most previous works adopted a non-cooperative game-theoretic framework with the shared reward approach, i.e., each agent is directly assigned the shared global reward. This, however, may give each agent an inaccurate reward for its contribution to the group, which could cause inefficient learning. To deal with this problem, we i) introduce a cooperative game-theoretic framework called the extended convex game (ECG), which is a superset of the global reward game, and ii) propose a local reward approach called Shapley Q-value. In contrast to the shared reward approach, Shapley Q-value distributes the global reward so that it reflects each agent's own contribution. Moreover, we derive an MARL algorithm called Shapley Q-value deep deterministic policy gradient (SQDDPG), using Shapley Q-value as the critic for each agent. We evaluate SQDDPG on Cooperative Navigation, Prey-and-Predator and Traffic Junction, and compare it with state-of-the-art algorithms, e.g., MADDPG, COMA, Independent DDPG and Independent A2C. In the experiments, SQDDPG shows a significant improvement in convergence rate. Finally, we plot Shapley Q-value and validate the property of fair credit assignment.
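
The abstract above builds on the Shapley value as a way to split a shared reward according to each agent's contribution. As a rough illustration only, the sketch below computes the classical, exact Shapley value for a tiny game by averaging marginal contributions over all agent orderings. It is not the SQDDPG algorithm or the paper's Shapley Q-value; the function names `shapley_values` and `toy_value` and the three-agent game are hypothetical.

```python
from itertools import permutations
from math import factorial


def shapley_values(agents, value):
    """Exact Shapley values: each agent's marginal contribution,
    averaged over every ordering in which agents can join the coalition."""
    totals = {a: 0.0 for a in agents}
    for order in permutations(agents):
        coalition = frozenset()
        for a in order:
            with_a = coalition | {a}
            # Marginal contribution of `a` to the agents that joined before it.
            totals[a] += value(with_a) - value(coalition)
            coalition = with_a
    n_orders = factorial(len(agents))
    return {a: t / n_orders for a, t in totals.items()}


def toy_value(coalition):
    """Hypothetical global reward: individual payoffs plus a bonus
    when agents 'a' and 'b' cooperate."""
    base = {"a": 1.0, "b": 2.0, "c": 3.0}
    total = sum(base[x] for x in coalition)
    if {"a", "b"} <= coalition:
        total += 2.0  # synergy bonus, shared between 'a' and 'b'
    return total


phi = shapley_values(["a", "b", "c"], toy_value)
for agent, share in sorted(phi.items()):
    print(agent, round(share, 3))
```

The shares sum to the reward of the full coalition, which is the property that makes the Shapley value a candidate for credit assignment. Exact enumeration grows factorially with the number of agents, so it is only practical for very small games.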

    LSAR: Multi-UAV Collaboration for Search and Rescue Missions

    In this paper, we consider the use of a team of multiple unmanned aerial vehicles (UAVs) to accomplish a search and rescue (SAR) mission in the minimum time possible while saving the maximum number of people. A novel technique for the SAR problem is proposed and referred to as the layered search and rescue (LSAR) algorithm. The novelty of LSAR involves simulating real disasters to distribute SAR tasks among UAVs. The performance of LSAR is compared, in terms of percentage of rescued survivors and rescue and execution times, with the max-sum, auction-based, and locust-inspired approaches for multi-UAV task allocation (LIAM) and opportunistic task allocation (OTA) schemes. The simulation results show that the UAVs running the LSAR algorithm on average rescue approximately 74% of the survivors, which is 8% higher than the next best algorithm (LIAM). Moreover, this percentage increases with the number of UAVs, almost linearly with the least slope, which means more scalability and coverage is obtained in comparison to other algorithms. In addition, the empirical cumulative distribution function of LSAR results shows that the percentages of rescued survivors clustered around the [78%-100%] range under an exponential curve, meaning most results are above 50%. In comparison, all the other algorithms have almost equal distributions of their percentage of rescued survivor results. Furthermore, because the LSAR algorithm focuses on the center of the disaster, it finds more survivors and rescues them faster than the other algorithms, with an average of 55%-77%. Moreover, most registered times to rescue survivors by LSAR are bounded by a time of 04:50:02 with 95% confidence for a one-month mission time.

    Task Allocation into a Foraging Task with a Series of Subtasks in Swarm Robotic System

    This is the final version. Available from IEEE via the DOI in this record. In swarm robotic systems, task allocation is a challenging problem aiming to decompose complex tasks into a series of subtasks. We propose a self-organizing method to allocate a swarm of robots to perform a foraging task consisting of sequentially dependent subtasks. The method regulates the proportion of robots to meet the task demands for given tasks. Our proposed method is based on the response threshold model, mapping the intensity of task demands to the probability of responding to candidate tasks depending on the response threshold. Each robot is suitable for all tasks, but some robots have a higher probability of taking certain tasks and a lower probability of taking others. In our task allocation method, each robot updates its response threshold depending on the associated task demand as well as the number of neighbouring robots performing the task. It relies neither on a centralized mechanism nor on information exchange amongst robots. Repetitive and continuous task allocations lead to the desired task distribution at a swarm level. We also analyzed the mathematical convergence of the task distribution among a swarm of robots. We demonstrate that the method is effective and robust for a foraging task under various conditions on the number of robots, the number of tasks and the size of the arena. Our simulation results may support the hypothesis that social insects use a task allocation method to handle the foraging task required for a colony’s survival.
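
To make the response threshold idea in the abstract above more concrete, here is a minimal sketch. The probability function uses the classical sigmoid form of the response threshold model from the social insect literature, which may differ from the exact form used in the paper. The update rule, its parameters (`rate`, `lo`, `hi`) and both function names are hypothetical and only illustrate the direction of the adjustment the abstract describes: demand lowers the threshold, and neighbours already on the task raise it.

```python
def response_probability(stimulus, threshold, steepness=2):
    """Classical response threshold form: s^n / (s^n + theta^n).
    A robot with a low threshold responds readily even to weak demand."""
    if stimulus <= 0:
        return 0.0
    s = stimulus ** steepness
    return s / (s + threshold ** steepness)


def update_threshold(threshold, demand, neighbours_on_task,
                     rate=0.1, lo=0.01, hi=1.0):
    """Hypothetical local update (an assumption, not the paper's rule):
    high task demand lowers the threshold, while neighbours already
    performing the task raise it. The result is clamped to [lo, hi]."""
    new = threshold - rate * demand + rate * neighbours_on_task
    return min(hi, max(lo, new))


# One robot, one task: demand is high and no neighbour is working on it.
theta = 0.5
before = response_probability(0.4, theta)
theta = update_threshold(theta, demand=1.0, neighbours_on_task=0)
after = response_probability(0.4, theta)
print(before, after)  # the robot becomes more likely to take the task
```

Both functions use only quantities a robot can sense locally, which matches the abstract's claim that no central mechanism or message exchange is required.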