    Multi-agent persistent surveillance under temporal logic constraints

    This thesis proposes algorithms for the deployment of multiple autonomous agents for persistent surveillance missions requiring repeated, periodic visits to regions of interest. Such problems arise in a variety of domains, such as monitoring ocean conditions like temperature and algae content, performing crowd security during public events, tracking wildlife in remote or dangerous areas, or watching traffic patterns and road conditions. Using robots for surveillance is an attractive solution in scenarios where fixed sensors are not sufficient to maintain situational awareness. Multi-agent solutions are particularly promising, because they allow for improved spatial and temporal resolution of sensor information. In this work, we consider persistent monitoring by teams of agents that are tasked with satisfying missions specified using temporal logic (TL) formulas. Such formulas allow rich, complex tasks to be specified, such as "visit regions A and B infinitely often, and if region C is visited then go to region D, and always avoid obstacles." The agents must determine how to satisfy such missions according to fuel, communication, and other constraints. Such problems are inherently difficult due to the typically infinite horizon, state space explosion from planning for multiple agents, communication constraints, and other issues. Therefore, computing an optimal solution to these problems is often infeasible. Instead, a balance must be struck between computational complexity and optimality. This thesis describes solution methods for two main classes of multi-agent persistent surveillance problems. First, it considers the class of problems in which persistent surveillance goals are captured entirely by TL constraints. Such problems require agents to repeatedly visit a set of surveillance regions in order to satisfy their mission. We present results for agents solving such missions with charging constraints, with noisy observations, and in the presence of adversaries.
The second class of problems includes an additional optimality criterion, such as minimizing uncertainty about the location of a target or maximizing sensor information among the team of agents. We present solution methods and results for such missions with a variety of optimality criteria based on information metrics. For both classes of problems, the proposed algorithms are implemented and evaluated via simulation, experiments with robots in a motion capture environment, or both.
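To make the quoted example mission concrete, the following minimal sketch checks it against an infinite run given as a finite prefix followed by a cycle that repeats forever. The lasso representation, the label names ("A", "B", "C", "D", "obs"), and the function name `satisfies_mission` are assumptions made for illustration; this is not the planning algorithm of the thesis, only a check of one candidate run.

```python
from typing import Sequence


def satisfies_mission(prefix: Sequence[str], cycle: Sequence[str]) -> bool:
    """Check a lasso-shaped run (prefix, then cycle repeated forever).

    Hypothetical encoding of the example mission:
      - visit regions "A" and "B" infinitely often,
      - every visit to "C" is followed later by a visit to "D",
      - never enter a state labelled "obs".
    """
    if not cycle:
        raise ValueError("cycle must be non-empty for an infinite run")
    prefix = list(prefix)
    cycle = list(cycle)

    # Safety: the obstacle label never appears on the run.
    if "obs" in prefix or "obs" in cycle:
        return False

    # Recurrence: a region is visited infinitely often iff it lies on the cycle.
    if "A" not in cycle or "B" not in cycle:
        return False

    # Response: if D lies on the cycle, every C is eventually followed by D.
    if "D" in cycle:
        return True
    # Otherwise D can only occur in the prefix, so C must not recur forever...
    if "C" in cycle:
        return False
    if "C" not in prefix:
        return True
    if "D" not in prefix:
        return False
    # ...and the last C in the prefix must come before the last D.
    last_c = len(prefix) - 1 - prefix[::-1].index("C")
    last_d = len(prefix) - 1 - prefix[::-1].index("D")
    return last_d > last_c
```

For instance, `satisfies_mission(["C", "D"], ["A", "B"])` holds, whereas `satisfies_mission(["C"], ["A", "B"])` does not, because region D is never reached after the visit to C.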

    Motion planning and control: a formal methods approach

    Control of complex systems satisfying rich temporal specifications has become an increasingly important research area in fields such as robotics, control, automotive, and manufacturing. Popular specification languages include temporal logics, such as Linear Temporal Logic (LTL) and Computation Tree Logic (CTL), which extend propositional logic to capture the temporal sequencing of system properties. The focus of this dissertation is on the control of high-dimensional systems and on timed specifications that impose explicit time bounds on the satisfaction of tasks. This work proposes and evaluates methods and algorithms for synthesizing provably correct control policies that address the resulting scalability problems. Ideas and tools from formal verification, graph theory, and incremental computing are used to synthesize satisfying control strategies. Finite abstractions of the systems are generated, and then composed with automata encoding the specifications. The first part of this dissertation introduces a sampling-based motion planning algorithm that combines long-term temporal logic goals with short-term reactive requirements. The specification has two parts: (1) a global specification given as an LTL formula over a set of static service requests that occur at the regions of a known environment, and (2) a local specification that requires servicing a set of dynamic requests that can be sensed locally during the execution. The proposed computational framework consists of two main ingredients: (a) an off-line sampling-based algorithm for the construction of a global transition system that contains a path satisfying the LTL formula, and (b) an on-line sampling-based algorithm to generate paths that service the local requests, while making sure that the satisfaction of the global specification is not affected. The second part of the dissertation focuses on stochastic systems with temporal and uncertainty constraints.
A specification language called Gaussian Distribution Temporal Logic is introduced as an extension of Boolean logic that incorporates temporal evolution and noise mitigation directly into the task specifications. A sampling-based algorithm to synthesize control policies is presented that generates a transition system in the belief space and uses local feedback controllers to break the curse of history associated with belief space planning. Switching control policies are then computed using a product Markov Decision Process between the transition system and the Rabin automaton encoding the specification. The approach is evaluated in experiments using a camera network and ground robot. The third part of this dissertation focuses on control of multi-vehicle systems with timed specifications and charging constraints. An expressive language called Time Window Temporal Logic (TWTL) that describes time-bounded specifications is introduced. The temporal relaxation of TWTL formulae with respect to the deadlines of tasks is also discussed. The key ingredient of the solution is an algorithm to translate a TWTL formula to an annotated finite state automaton that encodes all possible temporal relaxations of the given formula. The annotated automata are composed with transition systems encoding the motion of all vehicles, and with charging models to produce control strategies for all vehicles such that the overall system satisfies the mission specification. The methods are evaluated in simulation and experimental trials with quadrotors and charging stations.
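The composition of a finite abstraction with an automaton encoding the specification can be illustrated with a small sketch. It assumes a deterministic finite automaton over finite words (enough for a task such as "visit A, then B") rather than the Rabin automata used for full specifications in the dissertation, and all names (`product_path`, `TS_EDGES`, `LABELS`, `DFA_DELTA`) and the four-state example are hypothetical.

```python
from collections import deque


def product_path(ts_edges, labels, init_state, dfa_delta, dfa_init, dfa_accept):
    """Breadth-first search over the product of a transition system and a DFA.

    ts_edges:  dict mapping a system state to its list of successor states
    labels:    dict mapping a system state to the symbol the DFA reads there
    dfa_delta: dict mapping (dfa_state, symbol) to the next dfa_state
    Returns the shortest sequence of system states whose label sequence is
    accepted by the DFA, or None if no such sequence exists.
    """
    start = (init_state, dfa_delta[(dfa_init, labels[init_state])])
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        state, q = node
        if q in dfa_accept:
            # Walk parent pointers back to the start and reverse.
            path = []
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        for nxt_state in ts_edges.get(state, []):
            nxt = (nxt_state, dfa_delta[(q, labels[nxt_state])])
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical corridor s0 - s1 - s2 - s3 with region B at s1 and region A at s3.
TS_EDGES = {"s0": ["s1"], "s1": ["s0", "s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
LABELS = {"s0": "none", "s1": "B", "s2": "none", "s3": "A"}

# DFA for "visit A, then B": 0 = waiting for A, 1 = waiting for B, 2 = accept.
DFA_DELTA = {}
for symbol in ("none", "A", "B"):
    DFA_DELTA[(0, symbol)] = 1 if symbol == "A" else 0
    DFA_DELTA[(1, symbol)] = 2 if symbol == "B" else 1
    DFA_DELTA[(2, symbol)] = 2
```

Calling `product_path(TS_EDGES, LABELS, "s0", DFA_DELTA, 0, {2})` returns the path that passes B, reaches A, and then comes back to B, since the first visit to B does not count before A has been seen.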

    Provably-Correct Task Planning for Autonomous Outdoor Robots

    Autonomous outdoor robots should be able to accomplish complex tasks safely and reliably while considering constraints that arise from both the environment and the physical platform. Such tasks extend basic navigation capabilities to specify a sequence of events over time. For example, an autonomous aerial vehicle can be given a surveillance task with contingency plans while complying with rules in regulated airspace, or an autonomous ground robot may need to guarantee a given probability of success while searching for the quickest way to complete the mission. A promising approach for the automatic synthesis of trusted controllers for complex tasks is to employ techniques from formal methods. In formal methods, tasks are formally specified symbolically with temporal logic. The robot then synthesises a controller automatically to execute trusted behaviour that guarantees the satisfaction of specified tasks and regulations. However, a difficulty arises from the lack of expressivity, which means the constraints affecting outdoor robots cannot be specified naturally with temporal logic. The goal of this thesis is to extend the capabilities of formal methods to express the constraints that arise from outdoor applications and synthesise provably-correct controllers with trusted behaviours over time. This thesis focuses on two important types of constraints, resource and safety constraints, and presents three novel algorithms that express tasks with these constraints and synthesise controllers that satisfy the specification. Firstly, this thesis proposes an extension to probabilistic computation tree logic (PCTL) called resource threshold PCTL (RT-PCTL) that naturally defines the mission specification with continuous resource threshold constraints; furthermore, it synthesises an optimal control policy with respect to the probability of success. 
With RT-PCTL, a state with accumulated resource out of the specified bound is considered to be failed or saturated depending on the specification. The requirements on resource bounds are naturally encoded in the symbolic specification, followed by the automatic synthesis of an optimal controller with respect to the probability of success. Secondly, the thesis proposes an online algorithm called greedy Büchi algorithm (GBA) that reduces the synthesis problem size to avoid the scalability problem. A framework is then presented with realistic control dynamics and physical assumptions in the environment such as wind estimation and fuel constraints. The time and space complexity of the framework are polynomial in the size of the system state, which is efficient for online synthesis. Lastly, the thesis proposes a synthesis algorithm for an optimal controller with respect to completion time given minimum safety constraints. The algorithm naturally balances completion time against safety. This work proves an analytical relationship between the probability of success and the conditional completion time given the mission specification. The theoretical contributions in this thesis are validated through realistic simulation examples. This thesis identifies and solves two core problems that contribute to the overall vision of developing a theoretical basis for trusted behaviour in outdoor robots. These contributions serve as a foundation for further research in multi-constrained task planning where a number of different constraints are considered simultaneously within a single framework.

    A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks

    Reward engineering is an important aspect of reinforcement learning. Whether or not the user's intentions can be correctly encapsulated in the reward function can significantly impact the learning outcome. Current methods rely on manually crafted reward functions that often require parameter tuning to obtain the desired behavior. This operation can be expensive when exploration requires systems to interact with the physical world. In this paper, we explore the use of temporal logic (TL) to specify tasks in reinforcement learning. A TL formula can be translated to a real-valued function that measures its level of satisfaction against a trajectory. We take advantage of this function and propose temporal logic policy search (TLPS), a model-free learning technique that finds a policy that satisfies the TL specification. A set of simulated experiments is conducted to evaluate the proposed approach.
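The idea of a real-valued satisfaction measure can be sketched as follows. The abstract does not give the exact definition used by TLPS, so this sketch assumes the max/min quantitative semantics commonly used for robustness over sampled trajectories; the function names and the example formula are hypothetical.

```python
from typing import Callable, Sequence


def rob_eventually(traj: Sequence[float], margin: Callable[[float], float]) -> float:
    """Robustness of 'eventually margin(x) > 0': the best margin on the trajectory."""
    return max(margin(x) for x in traj)


def rob_always(traj: Sequence[float], margin: Callable[[float], float]) -> float:
    """Robustness of 'always margin(x) > 0': the worst margin on the trajectory."""
    return min(margin(x) for x in traj)


def rob_and(*values: float) -> float:
    """Robustness of a conjunction: the weakest conjunct decides."""
    return min(values)


def example_robustness(traj: Sequence[float]) -> float:
    """Hypothetical task: eventually x > 1.0, and always x < 2.0."""
    return rob_and(
        rob_eventually(traj, lambda x: x - 1.0),
        rob_always(traj, lambda x: 2.0 - x),
    )
```

Under these assumed semantics a positive value means the trajectory satisfies the formula and a negative value means it violates it, with the magnitude indicating by how much. Such a value could serve as a return signal for a policy search method in place of a hand-tuned reward.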

    A control architecture and human interface for agile, reconfigurable micro aerial vehicle formations

    This thesis considers the problem of controlling a group of micro aerial vehicles for agile maneuvering, either cooperatively or in a distributed manner. We first introduce the background and motivation for micro aerial vehicles, especially the popular multi-rotor platform. Then, we discuss the dynamics of quadrotor helicopters. A quadrotor is a specific kind of multi-rotor aerial vehicle with a special property called differential flatness, which simplifies trajectory planning: instead of planning a trajectory in a 12-dimensional state space and a 4-dimensional input space, we only need to plan the trajectory in the 4-dimensional, so-called flat output space, while the 12-dimensional state and 4-dimensional input can be recovered through a mapping called the endogenous transformation. We propose a series of approaches to achieve agile maneuvering of a dynamic quadrotor formation: controlling a single quadrotor in an artificial vector field, controlling a group of quadrotors in a Virtual Rigid Body (VRB) framework, balancing human control against autonomy for collision avoidance, and fast on-line distributed collision avoidance with Buffered Voronoi Cells (BVC). In the vector field method, we generate velocity, acceleration, jerk and snap fields, depending on the tasks or the positions of obstacles, such that a single quadrotor can easily find its required state and input from the endogenous transformation in order to track the artificial vector field. Next, with a Virtual Rigid Body framework, we let a group of quadrotors follow a single control command while also keeping a required formation, or even reconfiguring from one formation to another. The Virtual Rigid Body framework decouples the trajectory planning problem into two sub-problems. Then we consider the problem of collision avoidance for the quadrotor formation while it is tele-operated by a single human operator.
The autonomous collision avoidance algorithm, based on the vector field methods for a single quadrotor, is an assistive portion of the quadrotor formation controller, such that the human operator can focus on high-level tasks, leaving the low-level collision avoidance task to be handled automatically. We also consider the full autonomy problem of quadrotor formations when reconfiguring from one formation to another by developing a fast, on-line distributed collision avoidance algorithm using Buffered Voronoi Cells (BVCs). Our BVC-based collision avoidance algorithm only requires sensed relative position, rather than relative position and velocity, while the computational complexity is comparable to other methods like velocity obstacles. Finally, we introduce our experimental quadrotor platform, which is built from a PixHawk flight controller and an Odroid-XU4 single-board computer. The hardware and software architecture of this multiple-quadrotor platform is described in detail so that our platform can easily be adopted and extended for different purposes. Our concluding remarks and a discussion of future work are also given in this thesis.
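The geometric idea behind a buffered Voronoi cell can be sketched with a membership test in the plane. This is a minimal sketch that assumes the cell of an agent is its ordinary Voronoi cell with every boundary shared with a neighbour pulled back by a safety radius; the function name `in_buffered_cell`, the parameter `radius`, and the 2D setting are illustrative assumptions, not the thesis's implementation. Consistent with the abstract, only positions are needed, not velocities.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def in_buffered_cell(p: Point, own: Point, others: Sequence[Point], radius: float) -> bool:
    """Return True if p lies in the Voronoi cell of `own`, retracted by
    `radius` from each bisector shared with another agent."""
    for other in others:
        dx, dy = other[0] - own[0], other[1] - own[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            raise ValueError("agents must occupy distinct positions")
        # Midpoint between the two agents lies on their bisector.
        mx, my = (own[0] + other[0]) / 2.0, (own[1] + other[1]) / 2.0
        # Signed distance of p from the bisector, positive toward `other`.
        signed = ((p[0] - mx) * dx + (p[1] - my) * dy) / dist
        if signed + radius > 0.0:
            return False
    return True
```

With agents at (0, 0) and (4, 0) and a radius of 0.5, the point (1, 0) is inside the first agent's buffered cell while (1.8, 0) is not. If each agent restricts its next waypoint to its own buffered cell, any two waypoints are separated by at least twice the radius.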

    Autonomous Flight, Fault, and Energy Management of the Flying Fish Solar-Powered Seaplane.

    The Flying Fish autonomous unmanned seaplane is designed and built for persistent ocean surveillance. Solar energy harvesting and always-on autonomous control and guidance are required to achieve unattended long-term operation. This thesis describes the Flying Fish avionics and software systems that enable the system to plan, self-initiate, and autonomously execute drift-flight cycles necessary to maintain a designated watch circle subject to environmentally influenced drift. We first present the avionics and flight software architecture developed for the unique challenges of an autonomous energy-harvesting seaplane, which must be waterproof, robust over a variety of sea states, and lightweight for flight. Seaplane kinematics and dynamics are developed based on conventional aircraft and watercraft and upon empirical flight test data. These models serve as the basis for development of flight control and guidance strategies, which take the form of a cyclic multi-mode guidance protocol that smoothly transitions between nested gain-scheduled proportional-derivative feedback control laws tuned for the trim conditions of each flight mode. A fault-tolerant airspeed sensing system is developed in response to elevated failure rates arising from pitot probe water ingestion in the test environment. The fault-tolerance strategy utilizes sensor characteristics and signal energy to combine redundant sensor measurements in a weighted voting strategy, handling repeated failures, sensor recovery, non-homogeneous sensors, and periods of complete sensing failure. Finally, a graph-based mission planner combines models of global solar energy, local ocean currents, and wind with flight-verified/derived aircraft models to provide an energy-aware flight planning tool. An NP-hard asymmetric multi-visit traveling salesman planning problem is posed that integrates vehicle performance and environment models using energy as the primary cost metric.
A novel A* search heuristic is presented to improve search efficiency relative to uniform cost search. A series of case studies is conducted with surface and airborne goals for various times of day and for multi-day scenarios. Energy-optimal solutions are identified except in cases where energy harvesting produces multiple comparable-cost plans via negative-cost cycles. The always-on cyclic guidance/control system, airspeed sensor fault management algorithm, and the nested-TSP heuristic for A* are all critical innovations required to solve the posed research challenges.
Ph.D., Aerospace Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/91453/1/eubankrd_1.pd
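The relationship between A* and uniform cost search mentioned in the abstract can be sketched generically: A* orders the frontier by cost-so-far plus a heuristic estimate, and with a zero heuristic it reduces to uniform cost search. The sketch below is not the thesis's nested-TSP heuristic; the graph, the energy-like edge costs, and the heuristic values are hypothetical, and edge costs are assumed non-negative, so the negative-cost cycles discussed in the abstract are outside its scope.

```python
import heapq
import itertools
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def a_star(
    graph: Graph,
    start: Hashable,
    goal: Hashable,
    heuristic: Callable[[Hashable], float] = lambda node: 0.0,
) -> Tuple[Optional[float], int]:
    """Return (cost of the cheapest path, number of expanded nodes).

    With the default zero heuristic this is uniform cost search.
    Assumes non-negative edge costs and an admissible, consistent heuristic.
    """
    tie = itertools.count()  # tie-breaker so heap entries never compare nodes
    frontier = [(heuristic(start), next(tie), 0.0, start)]
    best = {start: 0.0}
    expanded = 0
    while frontier:
        _, _, g, node = heapq.heappop(frontier)
        if g > best.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        expanded += 1
        if node == goal:
            return g, expanded
        for nxt, cost in graph.get(node, []):
            g_next = g + cost
            if g_next < best.get(nxt, float("inf")):
                best[nxt] = g_next
                heapq.heappush(frontier, (g_next + heuristic(nxt), next(tie), g_next, nxt))
    return None, expanded


# Hypothetical energy-cost graph and an admissible heuristic for goal "G".
GRAPH: Graph = {
    "S": [("A", 2.0), ("B", 5.0)],
    "A": [("G", 6.0), ("B", 1.0)],
    "B": [("G", 2.0)],
    "G": [],
}
H = {"S": 4.0, "A": 3.0, "B": 2.0, "G": 0.0}
```

Both `a_star(GRAPH, "S", "G")` and `a_star(GRAPH, "S", "G", H.get)` find the same cheapest cost; the heuristic only changes the order in which nodes are expanded.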

    Dynamic Coverage Control and Estimation in Collaborative Networks of Human-Aerial/Space Co-Robots

    In this dissertation, the author presents a set of control, estimation, and decision making strategies to enable small unmanned aircraft systems and free-flying space robots to act as intelligent mobile wireless sensor networks. These agents are primarily tasked with gathering information from their environments in order to increase the situational awareness of both the network and its human collaborators. This information is gathered through an abstract sensing model, a forward-facing anisotropic spherical sector, which can be generalized to various sensing models through adjustment of its tuning parameters. First, a hybrid control strategy is derived whereby a team of unmanned aerial vehicles can dynamically cover (i.e., sweep their sensing footprints through all points of a domain over time) a designated airspace. These vehicles are assumed to have finite power resources; therefore, an agent deployment and scheduling protocol is proposed that allows agents to return periodically to a charging station while covering the environment. Rules are also prescribed with respect to energy-aware domain partitioning and agent waypoint selection so as to distribute the coverage load across the network with increased priority on those agents whose remaining power supply is larger. This work is extended to consider the coverage of 2D manifolds embedded in 3D space that are subject to collision by stochastic intruders. Formal guarantees are provided with respect to collision avoidance, timely convergence upon charging stations, and timely interception of intruders by friendly agents. This chapter concludes with a case study in which a human acts as a dynamic coverage supervisor, i.e., uses hand gestures to direct the selection of regions that ought to be surveyed by the robot. Second, the concept of situational awareness is extended to networks consisting of humans working in close proximity with aerial or space robots.
In this work, the robot acts as an assistant to a human attempting to complete a set of interdependent and spatially separated multitasking objectives. The human wears an augmented reality display and the robot must learn the human's task locations online and broadcast camera views of these tasks to the human. The locations of tasks are learned using a parallel implementation of expectation maximization of Gaussian mixture models. The selection of tasks from this learned set is executed by a Markov Decision Process which is trained by the human using Q-learning. This method for robot task selection is compared against a supervised method in IRB-approved (HUM00145810) experimental trials with 24 human subjects. This dissertation concludes by discussing an additional case study, by the author, in Bayesian inferred path planning. In addition, open problems in dynamic coverage and human-robot interaction are discussed so as to present an avenue forward for future work.
Ph.D., Aerospace Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155147/1/wbentz_1.pd

    Time Window Temporal Logic

    This paper introduces time window temporal logic (TWTL), an expressive language for describing various time-bounded specifications. In particular, the syntax and semantics of TWTL enable the compact representation of serial tasks, which are typically seen in robotics and control applications. This paper also discusses the relaxation of TWTL formulae with respect to deadlines of tasks. Efficient automata-based frameworks to solve synthesis, verification and learning problems are also presented. The key ingredient to the presented solution is an algorithm to translate a TWTL formula to an annotated finite state automaton that encodes all possible temporal relaxations of the specification. Case studies illustrating the expressivity of the logic and the proposed algorithms are included.
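The notion of relaxing a deadline can be illustrated on a single time-bounded task of the form "hold a label for d time units within the window [a, b]". The sketch below is a simplified reading of such a task and does not reproduce the paper's formal semantics or its automaton construction; the function names and the convention that holding for d time units means d + 1 consecutive samples are assumptions made for illustration.

```python
from typing import Optional, Sequence


def hold_completion_time(word: Sequence[str], label: str, d: int, a: int) -> Optional[int]:
    """Earliest step t_end such that `label` holds at every step of
    [t_end - d, t_end] with t_end - d >= a, or None if there is none."""
    run = 0  # consecutive steps, counted from step a onward, that carry `label`
    for t in range(a, len(word)):
        run = run + 1 if word[t] == label else 0
        if run >= d + 1:
            return t
    return None


def deadline_relaxation(word: Sequence[str], label: str, d: int, a: int, b: int) -> Optional[int]:
    """Smallest tau such that the hold task finishes within [a, b + tau].

    A positive tau means the deadline b had to be extended; zero or a
    negative tau means the task finished on time or early. Returns None
    if the task is never completed on this word.
    """
    t_end = hold_completion_time(word, label, d, a)
    return None if t_end is None else t_end - b
```

For the word `["x", "x", "A", "A", "A", "x"]` and the task "hold A for 2 time units within [0, 3]", the hold completes at step 4, so the deadline must be relaxed by one step; with the window [0, 5] the same word finishes one step early.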