94 research outputs found

    An Energy-aware, Fault-tolerant, and Robust Deep Reinforcement Learning based approach for Multi-agent Patrolling Problems

    Autonomous vehicles are suited for continuous area patrolling problems. However, finding an optimal patrolling strategy can be challenging for many reasons. Firstly, patrolling environments are often complex and can include unknown environmental factors. Secondly, autonomous vehicles can have failures or hardware constraints, such as limited battery life. Importantly, patrolling large areas often requires multiple agents that need to collectively coordinate their actions. In this work, we consider these limitations and propose an approach based on model-free, deep multi-agent reinforcement learning. In this approach, the agents are trained to automatically recharge themselves when required, to support continuous collective patrolling. A distributed homogeneous multi-agent architecture is proposed, where all patrolling agents execute identical policies locally based on their local observations and shared information. This architecture provides a fault-tolerant and robust patrolling system that can tolerate agent failures and allow supplementary agents to be added to replace failed agents or to increase the overall patrol performance. The solution is validated through simulation experiments from multiple perspectives, including the overall patrol performance, the efficiency of battery recharging strategies, and the overall fault tolerance and robustness.
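    The recharge-when-needed behaviour this abstract describes can be sketched in a few lines. This is a minimal illustration under assumed names and constants (`PatrolAgent`, `BATTERY_THRESHOLD`, a fixed per-leg energy cost), not the authors' learned policy:

```python
# Minimal sketch of recharge-aware patrolling. All names and constants
# below are illustrative assumptions, not the paper's implementation.

BATTERY_THRESHOLD = 0.2   # recharge when battery falls below 20%
CHARGER = (0, 0)          # shared charging-station location

class PatrolAgent:
    def __init__(self, waypoints):
        self.waypoints = list(waypoints)  # patrol route to cycle through
        self.battery = 1.0
        self.idx = 0

    def step(self):
        """Return the next target: a waypoint, or the charger if low on battery."""
        if self.battery < BATTERY_THRESHOLD:
            self.battery = 1.0            # abstract away the recharge process
            return CHARGER
        target = self.waypoints[self.idx]
        self.idx = (self.idx + 1) % len(self.waypoints)
        self.battery -= 0.15              # fixed per-leg energy cost (assumed)
        return target
```

    In the paper, the recharge decision is learned by the shared policy rather than hard-coded; the sketch only shows the interface such a policy would realise.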

    Mobile agent path planning under uncertain environment using reinforcement learning and probabilistic model checking

    The major challenge in mobile agent path planning within an uncertain environment is effectively determining an optimal control model to discover the target location as quickly as possible, and evaluating the control system's reliability. To address this challenge, we introduce a learning-verification integrated mobile agent path planning method to achieve both effectiveness and reliability. More specifically, we first propose a modified Q-learning algorithm (a popular reinforcement learning algorithm), called the Q_EA-learning algorithm, to find the best Q-table in the environment. We then determine the location transition probability matrix and establish a probability model under the assumption that the agent selects a location with a higher Q-value. Secondly, the learnt behaviour of the mobile agent based on the Q_EA-learning algorithm is formalized as a Discrete-time Markov Chain (DTMC) model. Thirdly, the required reliability requirements of the mobile agent control system are specified using Probabilistic Computation Tree Logic (PCTL). In addition, the DTMC model and the specified properties are taken as the input of the probabilistic model checker PRISM for automatic verification. This is performed to evaluate and verify the control system's reliability. Finally, a case study of a mobile agent walking in a grid map is used to illustrate the proposed learning algorithm, with a special focus on the modelling approach, demonstrating how PRISM can be used to analyse and evaluate the reliability of the mobile agent control system learnt via the proposed algorithm. The results show that the path identified using the proposed integrated method yields the largest expected reward.
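    The step that turns a learnt Q-table into DTMC transition probabilities can be illustrated with a softmax over Q-values, reflecting the stated assumption that the agent prefers locations with higher Q-values. The function name and the softmax choice are illustrative assumptions; the paper's exact probability model may differ:

```python
import math

def q_row_to_dtmc_row(q_values, temperature=1.0):
    """Turn one state's Q-values into a DTMC transition row via a softmax,
    encoding the assumption that higher-Q successor locations are more
    likely to be chosen. (Illustrative sketch, not the paper's model.)"""
    exps = [math.exp(q / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]
```

    Rows built this way sum to one, so stacking one row per state yields a stochastic matrix that PRISM can check against PCTL properties.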

    An Auction-based Coordination Strategy for Task-Constrained Multi-Agent Stochastic Planning with Submodular Rewards

    In many domains, such as transportation and logistics, search and rescue, or cooperative surveillance, tasks are pending to be allocated with consideration of possible execution uncertainties. Existing task coordination algorithms either ignore the stochastic process or suffer from computational intensity. Taking advantage of the weakly coupled feature of the problem and the opportunity for coordination in advance, we propose a decentralized auction-based coordination strategy using a newly formulated score function, which is generated by formulating the problem as task-constrained Markov decision processes (MDPs). The proposed method guarantees convergence and at least 50% optimality under the premise of a submodular reward function. Furthermore, for implementation in large-scale applications, an approximate variant of the proposed method, namely Deep Auction, is also suggested with the use of neural networks, which avoids the burden of constructing MDPs. Inspired by the well-known actor-critic architecture, two Transformers are used to map observations to action probabilities and cumulative rewards respectively. Finally, we demonstrate the performance of the two proposed approaches in the context of drone deliveries, where the stochastic planning for the drone league is cast as a stochastic prize-collecting Vehicle Routing Problem (VRP) with time windows. Simulation results are compared with state-of-the-art methods in terms of solution quality, planning efficiency and scalability.
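    The core of a decentralized auction with a submodular score can be sketched as a sequential greedy assignment of marginal gains; greedy schemes of this kind are what yield the 50% optimality bound for submodular rewards. All names here are illustrative, not the paper's API:

```python
def greedy_auction(agents, tasks, marginal_gain):
    """Sequential single-item auction: each round, award the task whose
    current marginal gain is largest to its best bidder. With a submodular
    reward, such greedy assignment attains at least half the optimal value.
    (Generic sketch; names and signatures are assumptions.)"""
    assignment = {a: [] for a in agents}
    remaining = set(tasks)
    while remaining:
        gain, winner, task = max(
            ((marginal_gain(a, t, assignment[a]), a, t)
             for a in agents for t in remaining),
            key=lambda x: x[0])
        if gain <= 0:
            break                      # no task adds value any more
        assignment[winner].append(task)
        remaining.discard(task)
    return assignment
```

    The `marginal_gain` callback is where the paper's MDP-derived (or Deep Auction approximated) score function would plug in.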

    Decentralized task allocation for multiple UAVs with task execution uncertainties

    This work builds on a robust decentralized task allocation algorithm to address the multiple unmanned aerial vehicle (UAV) surveillance problem under task duration uncertainties. Considering that the existing robust task allocation algorithm is computationally intensive and has no optimality guarantees, this paper proposes a new robust task assignment formulation that reduces the calculation of robust scores and provides a theoretical guarantee of optimality. In the proposed method, a Markov model is introduced to describe the impact of uncertain parameters on task rewards, and the expected score function is reformulated as the utility function of the states in the Markov model. By providing high-precision expected marginal gains of tasks, the task assignment achieves a better cumulative score than state-of-the-art robust algorithms do. Moreover, this algorithm is proven to be convergent and reaches an a priori optimality guarantee of at least 50%. Numerical simulations demonstrate the performance improvement of the proposed method compared with basic CBBA, the robust extension to CBBA, and a cost-benefit greedy algorithm.
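    The reformulated expected score can be illustrated as a probability-weighted, time-discounted task reward, where a duration distribution stands in for the Markov model's state distribution. A hypothetical sketch, not the paper's exact utility function:

```python
def expected_task_score(duration_probs, reward):
    """Expected score of a task with uncertain duration: weight the
    (typically time-discounted) reward by the probability of each
    possible duration. `duration_probs` maps duration -> probability,
    a stand-in for the Markov model's states. (Illustrative assumption.)"""
    return sum(p * reward(d) for d, p in duration_probs.items())
```

    Differences of such expected scores, with and without a candidate task in an agent's bundle, give the expected marginal gains that drive the bidding.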

    Planning Algorithms for Multi-Robot Active Perception

    A fundamental task of robotic systems is to use on-board sensors and perception algorithms to understand high-level semantic properties of an environment. These semantic properties may include a map of the environment, the presence of objects, or the parameters of a dynamic field. Observations are highly viewpoint dependent and, thus, the performance of perception algorithms can be improved by planning the motion of the robots to obtain high-value observations. This motivates the problem of active perception, where the goal is to plan the motion of robots to improve perception performance. This fundamental problem is central to many robotics applications, including environmental monitoring, planetary exploration, and precision agriculture. The core contribution of this thesis is a suite of planning algorithms for multi-robot active perception. These algorithms are designed to improve system-level performance on many fronts: online and anytime planning, addressing uncertainty, optimising over a long time horizon, decentralised coordination, robustness to unreliable communication, predicting plans of other agents, and exploiting characteristics of perception models. We first propose the decentralised Monte Carlo tree search algorithm as a generally-applicable, decentralised algorithm for multi-robot planning. We then present a self-organising map algorithm designed to find paths that maximally observe points of interest. Finally, we consider the problem of mission monitoring, where a team of robots monitors the progress of a robotic mission. A spatiotemporal optimal stopping algorithm is proposed, along with a generalisation for decentralised monitoring. Experimental results are presented for a range of scenarios, such as marine operations and object recognition. Our analytical and empirical results demonstrate theoretically-interesting and practically-relevant properties that support the use of the approaches in practice.
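    The decentralised Monte Carlo tree search mentioned above builds on standard UCT child selection, which can be sketched generically (this is textbook UCT, not the thesis's Dec-MCTS implementation):

```python
import math

def uct_select(children, exploration=1.4):
    """UCT child selection, the core of (decentralised) MCTS: balance a
    child's average reward against how rarely it has been visited.
    `children` is a list of dicts with 'visits' and 'value' keys
    (an assumed, minimal node representation)."""
    total = sum(c['visits'] for c in children)
    def uct(c):
        if c['visits'] == 0:
            return float('inf')       # always try unvisited children first
        return (c['value'] / c['visits']
                + exploration * math.sqrt(math.log(total) / c['visits']))
    return max(children, key=uct)
```

    In a decentralised variant, each robot runs its own tree search like this while exchanging compressed distributions over its likely plans with teammates.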

    Multi-Robot Coverage Path Planning for Inspection of Offshore Wind Farms: A Review

    Offshore wind turbine (OWT) inspection research is receiving increasing interest as the sector grows worldwide. Wind farms are far from emergency services and experience extreme weather and winds. This hazardous environment lends itself to unmanned approaches, reducing human exposure to risk. Increasing automation in inspections can reduce human effort and financial costs. Despite the benefits, research on automating inspection is sparse. This work proposes that OWT inspection can be described as a multi-robot coverage path planning problem. Reviews of multi-robot coverage exist, but to the best of our knowledge, none captures the domain-specific aspects of an OWT inspection. In this paper, we present a review of the current state of the art of multi-robot coverage to identify gaps in research relating to coverage for OWT inspection. To perform a qualitative study, the PICo (population, intervention, and context) framework was used. The retrieved works are analysed according to three aspects of coverage approaches: environmental modelling, decision making, and coordination. Based on the reviewed studies and the conducted analysis, candidate approaches are proposed for the structural coverage of an OWT. Future research should involve the adaptation of voxel-based ray-tracing pose generation to UAVs and exploration, applying semantic labels to tasks to facilitate heterogeneous coverage, and semantic online task decomposition to identify the coverage target at run time.