46 research outputs found

    Robot Patrolling for Stochastic and Adversarial Events

    In this thesis, we present and analyze two robot patrolling problems. The first problem discusses stochastic patrolling strategies in adversarial environments where intruders use the information about a patrolling path to increase chances of successful attacks on the environment. We use Markov chains to design the random patrolling paths on graphs. We present four different intruder models, each of which uses the information about patrolling paths in a different manner. We characterize the expected rewards for those intruder models as a function of the Markov chain that is being used for patrolling. We show that minimizing the reward functions is a nonconvex constrained optimization problem in general. We then discuss the application of different numerical optimization methods to minimize the expected reward for any given type of intruder and propose a pattern search algorithm to determine a locally optimal patrolling strategy. We also show that for a certain type of intruder, a deterministic patrolling policy given by the orienteering tour of the graph is the optimal patrolling strategy. The second problem that we define and analyze is the Event Detection and Confirmation Problem, in which events arrive randomly on the vertices of a graph and stay active for a random amount of time. Events that stay longer than a certain amount of time are defined to be true events. The monitoring robot can traverse the graph to detect newly arrived events and can revisit these events in order to classify them as true events. The goal is to maximize the number of true events that are correctly classified by the robot. We show that the off-line version of the problem is NP-hard. We then consider a simple patrolling policy based on the TSP tour of the graph and characterize the probability of correctly classifying a true event. 
We investigate the problem when multiple robots follow the same path, and show that the optimal spacing between the robots in that case can be nonuniform.
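
As a rough illustration of the Markov-chain patrolling idea in this abstract, the following minimal sketch computes the long-run visit frequencies of a random patrol and samples one patrol path. It is not the thesis's method: the 3-vertex graph, the transition matrix `P`, and the helper names `stationary_distribution` and `sample_patrol` are all assumptions made for the example.

```python
import random

def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Power iteration for the stationary distribution of a row-stochastic matrix P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # One step of pi <- pi * P
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def sample_patrol(P, start, steps, rng):
    """Sample a random patrol path of `steps` moves from the chain P."""
    path = [start]
    for _ in range(steps):
        row = P[path[-1]]
        path.append(rng.choices(range(len(row)), weights=row)[0])
    return path

# Hypothetical path graph 0 - 1 - 2; self-loops make the chain aperiodic.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)            # long-run fraction of time at each vertex
path = sample_patrol(P, 0, 20, random.Random(0))
```

An intruder model would then be evaluated against `P` (for example, through how long a vertex stays unvisited); choosing `P` to minimize the intruder's expected reward is the optimization problem the abstract describes.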

    An Energy-aware, Fault-tolerant, and Robust Deep Reinforcement Learning based approach for Multi-agent Patrolling Problems

    Autonomous vehicles are suited for continuous area patrolling problems. However, finding an optimal patrolling strategy can be challenging for many reasons. Firstly, patrolling environments are often complex and can include unknown environmental factors. Secondly, autonomous vehicles can have failures or hardware constraints, such as limited battery life. Importantly, patrolling large areas often requires multiple agents that need to collectively coordinate their actions. In this work, we consider these limitations and propose an approach based on model-free, deep multi-agent reinforcement learning. In this approach, the agents are trained to automatically recharge themselves when required, to support continuous collective patrolling. A distributed homogeneous multi-agent architecture is proposed, where all patrolling agents execute identical policies locally based on their local observations and shared information. This architecture provides a fault-tolerant and robust patrolling system that can tolerate agent failures and allow supplementary agents to be added to replace failed agents or to increase the overall patrol performance. The solution is validated through simulation experiments from multiple perspectives, including the overall patrol performance, the efficiency of battery recharging strategies, and the overall fault tolerance and robustness

    Effective Cooperation and Scalability in Multi-Robot Teams for Automatic Patrolling of Infrastructures

    Doctoral thesis in Electrical and Computer Engineering, presented to the Department of Electrical and Computer Engineering of the Faculty of Sciences and Technology of the University of Coimbra. In the digital era that we live in, advances in technology have proliferated throughout our society, quickening the completion of tasks that were painful in the old days, improving solutions to the everyday problems that we face, and generally assisting human beings both in their professional and personal life. Robotics is a clear example of a broad technological field that evolves every day. In fact, scientists predict that in the upcoming few decades, robots will naturally interact and coexist alongside human beings. While it is true that robots already have a strong presence in industrial environments, e.g., robotic arms for manufacturing, the average person still looks upon robots with suspicion, being unacquainted with this type of technology. In this thesis, the author deploys teams of mobile robots in indoor scenarios to cooperatively perform patrolling missions, which represents an effort to bring robots closer to humans and assist them in monotonous or repetitive tasks, such as supervising and monitoring indoor infrastructures or simply cooperatively cleaning floors. In this context, the team of robots should be able to sense the environment, localize and navigate autonomously between waypoints while avoiding obstacles, incorporate any number of robots, communicate actions in a distributed way, and be robust not only to agent failures but also to communication failures, so as to effectively coordinate to achieve optimal collective performance. These capabilities are evidence that such systems can only prove their reliability in real-world environments if robots are endowed with intelligence and autonomy. 
Thus, the author follows a line of research where patrolling units have the necessary tools for intelligent decision-making, according to the information of the mission, the environment and teammates' actions, using distributed coordination architectures. An incremental approach is followed. Firstly, the problem is presented and the literature is studied in depth in order to identify potential weaknesses and research opportunities, backing up the objectives and contributions proposed in this thesis. Then, problem fundamentals are described and benchmarking of multi-robot patrolling algorithms in realistic conditions is conducted. In these earlier stages, the role of different parameters of the problem, like environment connectivity, team size and strategy philosophy, becomes evident through extensive empirical results and statistical analysis. In addition, scalability is analyzed in depth and tied to the inter-robot interference and coordination imposed by each patrolling strategy. After gaining insight into the problem, preliminary models for multi-robot patrol with special focus on real-world application are presented, using a Bayesian-inspired formalism. Based on these, distributed strategies that lead to superior team performance are described. Interference between autonomous agents is explicitly dealt with, and the approaches are shown to scale to large teams of robots. Additionally, the robustness to agent and communication failures is demonstrated, as well as the flexibility of the model proposed. In fact, by later generalizing the model with learning agents and maintaining memory of past events, it is then shown that these capabilities can be inherited, while at the same time increasing team performance even further and fostering adaptability. This is verified in simulation experiments and real-world results in a large indoor scenario. 
Furthermore, since team scalability is a central focus of this thesis, a method for estimating the optimal team size in a patrolling mission, according to the environment topology, is proposed. Upper bounds for team performance prior to the mission start are provided, supporting the choice of the number of robots to be used so that temporal constraints can be satisfied. All methods developed in this thesis are tested and corroborated by experimental results, showing the usefulness of employing cooperative teams of robots in real-world environments and the potential for similar systems to emerge in our society. FCT - SFRH/BD/64426/200

    UAV-UGV-UMV Multi-Swarms for Cooperative Surveillance

    In this paper we present a surveillance system for early detection of escapers from a restricted area based on a new swarming mobility model called CROMM-MS (Chaotic Rössler Mobility Model for Multi-Swarms). CROMM-MS is designed for controlling the trajectories of heterogeneous multi-swarms of aerial, ground and marine unmanned vehicles with important features such as prioritising early detections and success rate. A new Competitive Coevolutionary Genetic Algorithm (CompCGA) is proposed to optimise the vehicles’ parameters and escapers’ evasion ability using a predator-prey approach. Our results show that CROMM-MS is not only viable for surveillance tasks but also that its results are competitive in regard to the state-of-the-art approaches

    Multi-Robot Path Planning for Persistent Monitoring in Stochastic and Adversarial Environments

    In this thesis, we study multi-robot path planning problems for persistent monitoring tasks. The goal of such persistent monitoring tasks is to deploy a team of cooperating mobile robots in an environment to continually observe locations of interest in the environment. Robots patrol the environment in order to detect events arriving at the locations of the environment. The events stay at those locations for a certain amount of time before leaving and can only be detected if one of the robots visits the location of an event while the event is there. In order to detect all possible events arriving at a vertex, the maximum time spent by the robots between visits to that vertex should be less than the duration of the events arriving at that vertex. We consider the problem of finding the minimum number of robots to satisfy these revisit time constraints, also called latency constraints. The decision version of this problem is PSPACE-complete. We provide an O(log p) approximation algorithm for this problem where p is the ratio of the maximum and minimum latency constraints. We also present heuristic algorithms to solve the problem and show through simulations that a proposed orienteering-based heuristic algorithm gives better solutions than the approximation algorithm. We additionally provide an algorithm for the problem of minimizing the maximum weighted latency given a fixed number of robots. In case the event stay durations are not fixed but are drawn from a known distribution, we consider the problem of maximizing the expected number of detected events. We motivate randomized patrolling paths for such scenarios and use Markov chains to represent those random patrolling paths. We characterize the expected number of detected events as a function of the Markov chains used for patrolling and show that the objective function is submodular for randomly arriving events. 
We propose an approximation algorithm for the case where the event durations for all the vertices are constant. We also propose a centralized and an online distributed algorithm to find the random patrolling policies for the robots. We also consider the case where the events are adversarial and can choose where and when to appear in order to maximize their chances of remaining undetected. The last problem we study in this thesis considers events triggered by a learning adversary. The adversary has a limited time to observe the patrolling policy before it decides when and where events should appear. We study the single-robot version of this problem and model it as a multi-stage two-player game. The adversary observes the patroller’s actions for a finite amount of time to learn the patroller’s strategy and then either chooses a location for the event to appear or reneges based on its confidence in the learned strategy. We characterize the expected payoffs for the players and propose a search algorithm to find a patrolling policy in such scenarios. We illustrate the trade-off between hard-to-learn and hard-to-attack strategies through simulations.
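
The revisit-time (latency) constraint described in this abstract can be checked mechanically for a given patrol. The sketch below is a single-robot simplification, not the thesis's algorithm: it assumes a hypothetical cyclic tour repeated forever, a `travel_time` dictionary keyed by directed edge, and a `latency` bound per vertex, and it follows the abstract's wording that the gap between visits must be strictly less than the bound.

```python
def max_revisit_gaps(tour, travel_time):
    """Largest time between consecutive visits to each vertex when the
    cyclic `tour` (which returns from its last vertex to its first) repeats forever."""
    n = len(tour)
    times = [0.0]  # arrival time at each tour position
    for k in range(n):
        u, v = tour[k], tour[(k + 1) % n]
        times.append(times[-1] + travel_time[(u, v)])
    period = times[-1]  # duration of one full cycle

    visits = {}
    for k, v in enumerate(tour):
        visits.setdefault(v, []).append(times[k])

    gaps = {}
    for v, ts in visits.items():
        inner = [b - a for a, b in zip(ts, ts[1:])]
        wrap = ts[0] + period - ts[-1]  # gap across the cycle boundary
        gaps[v] = max(inner + [wrap])
    return gaps

def satisfies_latency(tour, travel_time, latency):
    """True if every constrained vertex is revisited in strictly less than its bound."""
    gaps = max_revisit_gaps(tour, travel_time)
    return all(gaps.get(v, float("inf")) < bound for v, bound in latency.items())

# Hypothetical star graph: hub 0 with leaves 1 and 2.
travel_time = {(0, 1): 1.0, (1, 0): 1.0, (0, 2): 2.0, (2, 0): 2.0}
tour = [0, 1, 0, 2]
gaps = max_revisit_gaps(tour, travel_time)
feasible = satisfies_latency(tour, travel_time, {0: 5.0, 1: 7.0, 2: 7.0})
```

Finding the minimum number of robots whose walks jointly satisfy all bounds is the hard problem the abstract studies; a checker like this only verifies a candidate tour.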

    Patrolling security games: Definition and algorithms for solving large instances with single patroller and single intruder

    Security games are gaining significant interest in artificial intelligence. They are characterized by two players (a defender and an attacker) and by a set of targets the defender tries to protect from the attacker's intrusions by committing to a strategy. To reach their goals, players use resources such as patrollers and intruders. Security games are Stackelberg games where the appropriate solution concept is the leader-follower equilibrium. Current algorithms for solving these games are applicable when the underlying game is in normal form (i.e., each player has a single decision node). In this paper, we define and study security games with an extensive-form infinite-horizon underlying game, where decision nodes are potentially infinite. We introduce a novel scenario where the attacker can undertake actions during the execution of the defender's strategy. We call this new game class patrolling security games (PSGs), since its most prominent application is patrolling environments against intruders. We show that PSGs cannot be reduced to security games studied so far and we highlight their generality in tackling adversarial patrolling on arbitrary graphs. We then design algorithms to solve large instances with single patroller and single intruder.

    Semi-Informed Multi-Agent Patrol Strategies

    The adversarial multi-agent patrol problem is an active research topic with many real-world applications such as physical robots guarding an area and software agents protecting a computer network. In it, agents patrol a graph looking for so-called critical vertices that are subject to attack by adversaries. The agents are unaware of which vertices are subject to attack by adversaries and when they encounter such a vertex they attempt to protect it from being compromised (an adversary must occupy the vertex it targets a certain amount of time for the attack to succeed). Even though the terms adversary and attack are used, the problem domain extends to patrolling a graph for other interesting noncompetitive contexts such as search and rescue. The problem statement adopted in this work is formulated such that agents obtain knowledge of local graph topology and critical vertices over the course of their travels via an API ; there is no global knowledge of the graph or communication between agents. The challenge is to balance exploration, necessary to discover critical vertices, with exploitation, necessary to protect critical vertices from attack. Four types of adversaries were used for experiments, three from previous research – waiting, random, and statistical - and the fourth, a hybrid of those three. Agent strategies for countering each of these adversaries are designed and evaluated. Benchmark graphs and parameter settings from related research will be employed. The proposed research culminates in the design and evaluation of agents to counter these various types of adversaries under a range of conditions. The results of this work are agent strategies in which each agent becomes solely responsible for protecting those critical vertices it discovers. The agents use emergent behavior to minimize successful attacks and maximize the discovery of new critical vertices. 
A set of seven edge-choosing primitives (ECPs) is defined; these are combined in different ways to yield a range of agent strategies using the chain-of-responsibility OOP design pattern. Every permutation of them was tested and measured in order to identify those strategies that perform well. One strategy performed particularly well across all adversaries, graph topologies, and other experimental variables. This particular strategy combines the following ECPs: a hard-deadline return to covered vertices to counter the random adversary, efficient checking of vertices to see if they are being attacked by the waiting adversary, and random movement to impede the statistical adversary.
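
To make the chain-of-responsibility idea concrete, here is a minimal sketch of how edge-choosing primitives might be composed. The two primitives shown, the `state` dictionary layout, and the function names are hypothetical stand-ins, not the seven ECPs defined in the work: each primitive either picks a neighboring vertex or passes by returning `None`, and the first primitive in the chain that commits decides the move.

```python
import random

def deadline_return(state):
    """Hypothetical ECP: move to an adjacent covered vertex whose
    revisit deadline expires within the next time step."""
    for v in state["neighbors"]:
        if state["deadline"].get(v, float("inf")) <= state["time"] + 1:
            return v
    return None  # pass to the next primitive in the chain

def make_random_move(rng):
    """Hypothetical ECP factory: fall back to a uniformly random neighbor."""
    def random_move(state):
        return rng.choice(state["neighbors"]) if state["neighbors"] else None
    return random_move

def choose_edge(chain, state):
    """Chain of responsibility: the first ECP that returns a vertex wins."""
    for ecp in chain:
        choice = ecp(state)
        if choice is not None:
            return choice
    return None

# One strategy = one ordering of primitives.
strategy = [deadline_return, make_random_move(random.Random(0))]

urgent = {"time": 4, "neighbors": [1, 2, 3], "deadline": {2: 5}}
relaxed = {"time": 0, "neighbors": [1, 2, 3], "deadline": {2: 5}}
urgent_choice = choose_edge(strategy, urgent)    # deadline_return commits to vertex 2
relaxed_choice = choose_edge(strategy, relaxed)  # falls through to random_move
```

Reordering or swapping entries in `strategy` yields a different agent behavior without touching the primitives themselves, which is what makes testing many permutations practical.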

    Dynamic Coverage Control and Estimation in Collaborative Networks of Human-Aerial/Space Co-Robots

    In this dissertation, the author presents a set of control, estimation, and decision making strategies to enable small unmanned aircraft systems and free-flying space robots to act as intelligent mobile wireless sensor networks. These agents are primarily tasked with gathering information from their environments in order to increase the situational awareness of both the network as well as human collaborators. This information is gathered through an abstract sensing model, a forward facing anisotropic spherical sector, which can be generalized to various sensing models through adjustment of its tuning parameters. First, a hybrid control strategy is derived whereby a team of unmanned aerial vehicles can dynamically cover (i.e., sweep their sensing footprints through all points of a domain over time) a designated airspace. These vehicles are assumed to have finite power resources; therefore, an agent deployment and scheduling protocol is proposed that allows for agents to return periodically to a charging station while covering the environment. Rules are also prescribed with respect to energy-aware domain partitioning and agent waypoint selection so as to distribute the coverage load across the network with increased priority on those agents whose remaining power supply is larger. This work is extended to consider the coverage of 2D manifolds embedded in 3D space that are subject to collision by stochastic intruders. Formal guarantees are provided with respect to collision avoidance, timely convergence upon charging stations, and timely interception of intruders by friendly agents. This chapter concludes with a case study in which a human acts as a dynamic coverage supervisor, i.e., they use hand gestures so as to direct the selection of regions which ought to be surveyed by the robot. Second, the concept of situational awareness is extended to networks consisting of humans working in close proximity with aerial or space robots. 
In this work, the robot acts as an assistant to a human attempting to complete a set of interdependent and spatially separated multitasking objectives. The human wears an augmented reality display and the robot must learn the human's task locations online and broadcast camera views of these tasks to the human. The locations of tasks are learned using a parallel implementation of expectation maximization of Gaussian mixture models. The selection of tasks from this learned set is executed by a Markov Decision Process, which the human trains using Q-learning. This method for robot task selection is compared against a supervised method in IRB-approved (HUM00145810) experimental trials with 24 human subjects. This dissertation concludes by discussing an additional case study, by the author, in Bayesian inferred path planning. In addition, open problems in dynamic coverage and human-robot interaction are discussed so as to present an avenue forward for future work. PhD, Aerospace Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155147/1/wbentz_1.pd