212 research outputs found

    Multi-agent persistent surveillance under temporal logic constraints

    Full text link
    This thesis proposes algorithms for the deployment of multiple autonomous agents for persistent surveillance missions requiring repeated, periodic visits to regions of interest. Such problems arise in a variety of domains, such as monitoring ocean conditions like temperature and algae content, performing crowd security during public events, tracking wildlife in remote or dangerous areas, or watching traffic patterns and road conditions. Using robots for surveillance is an attractive solution to scenarios in which fixed sensors are not sufficient to maintain situational awareness. Multi-agent solutions are particularly promising, because they allow for improved spatial and temporal resolution of sensor information. In this work, we consider persistent monitoring by teams of agents that are tasked with satisfying missions specified using temporal logic formulas. Such formulas allow rich, complex tasks to be specified, such as "visit regions A and B infinitely often, and if region C is visited then go to region D, and always avoid obstacles." The agents must determine how to satisfy such missions according to fuel, communication, and other constraints. Such problems are inherently difficult due to the typically infinite horizon, state space explosion from planning for multiple agents, communication constraints, and other issues. Therefore, computing an optimal solution to these problems is often infeasible. Instead, a balance must be struck between computational complexity and optimality. This thesis describes solution methods for two main classes of multi-agent persistent surveillance problems. First, it considers the class of problems in which persistent surveillance goals are captured entirely by TL constraints. Such problems require agents to repeatedly visit a set of surveillance regions in order to satisfy their mission. We present results for agents solving such missions with charging constraints, with noisy observations, and in the presence of adversaries. The second class of problems include an additional optimality criterion, such as minimizing uncertainty about the location of a target or maximizing sensor information among the team of agents. We present solution methods and results for such missions with a variety of optimality criteria based on information metrics. For both classes of problems, the proposed algorithms are implemented and evaluated via simulation, experiments with robots in a motion capture environment, or both

    Reinforcement Learning and Planning for Preference Balancing Tasks

    Get PDF
    Robots are often highly non-linear dynamical systems with many degrees of freedom, making solving motion problems computationally challenging. One solution has been reinforcement learning (RL), which learns through experimentation to automatically perform the near-optimal motions that complete a task. However, high-dimensional problems and task formulation often prove challenging for RL. We address these problems with PrEference Appraisal Reinforcement Learning (PEARL), which solves Preference Balancing Tasks (PBTs). PBTs define a problem as a set of preferences that the system must balance to achieve a goal. The method is appropriate for acceleration-controlled systems with continuous state-space and either discrete or continuous action spaces with unknown system dynamics. We show that PEARL learns a sub-optimal policy on a subset of states and actions, and transfers the policy to the expanded domain to produce a more refined plan on a class of robotic problems. We establish convergence to task goal conditions, and even when preconditions are not verifiable, show that this is a valuable method to use before other more expensive approaches. Evaluation is done on several robotic problems, such as Aerial Cargo Delivery, Multi-Agent Pursuit, Rendezvous, and Inverted Flying Pendulum both in simulation and experimentally. Additionally, PEARL is leveraged outside of robotics as an array sorting agent. The results demonstrate high accuracy and fast learning times on a large set of practical applications

    Optimal temporal logic control of autonomous vehicles

    Full text link
    Thesis (Ph.D.)--Boston UniversityTemporal logics, such as Linear Temporal Logic (LTL) and Computation Tree Logic (CTL), are extensions of propositional logic that can capture temporal relations. Even though temporal logics have been used in model checking of finite systems for quite some time, they have gained popularity as a means for specifying complex mission requirements in path planning and control synthesis problems only recently. This dissertation proposes and evaluates methods and algorithms for optimal path planning and control synthesis for autonomous vehicles where a high-level mission specification expressed in LTL (or a fragment of LTL) must be satisfied. In summary, after obtaining a discrete representation of the overall system, ideas and tools from formal verification and graph theory are leveraged to synthesize provably correct and optimal control strategies. The first part of this dissertation focuses on automatic planning of optimal paths for a group of robots that must satisfy a common high level mission specification. The effect of slight deviations in traveling times on the behavior of the team is analyzed and methods that are robust to bounded non-determinism in traveling times are proposed. The second part focuses on the case where a controllable agent is required to satisfy a high-level mission specification in the presence of other probabilistic agents that cannot be controlled. Efficient methods to synthesize control policies that maximize the probability of satisfaction of the mission specification are presented. The focus of the third part is the problem where an autonomous vehicle is required to satisfy a rich mission specification over service requests occurring at the regions of a partitioned environment. A receding horizon control strategy that makes use of the local information provided by the sensors on the vehicle in addition to the a priori information about the environment is presented. For all of the automatic planning and control synthesis problems that are considered, the proposed algorithms are implemented, evaluated, and validated through experiments and/or simulations

    Learning multi-robot coordination from demonstrations

    Full text link
    This paper develops a Distributed Differentiable Dynamic Game (DDDG) framework, which enables learning multi-robot coordination from demonstrations. We represent multi-robot coordination as a dynamic game, where the behavior of a robot is dictated by its own dynamics and objective that also depends on others' behavior. The coordination thus can be adapted by tuning the objective and dynamics of each robot. The proposed DDDG enables each robot to automatically tune its individual dynamics and objectives in a distributed manner by minimizing the mismatch between its trajectory and demonstrations. This process requires a new distributed design of the forward-pass, where all robots collaboratively seek Nash equilibrium behavior, and a backward-pass, where gradients are propagated via the communication graph. We test the DDDG in simulation with a team of quadrotors given different task configurations. The results demonstrate the capability of DDDG for learning multi-robot coordination from demonstrationsComment: 6 figure

    Reachability-based Trajectory Design

    Full text link
    Autonomous mobile robots have the potential to increase the availability and accessibility of goods and services throughout society. However, to enable public trust in such systems, it is critical to certify that they are safe. This requires formally specifying safety, and designing motion planning methods that can guarantee safe operation (note, this work is only concerned with planning, not perception). The typical paradigm to attempt to ensure safety is receding-horizon planning, wherein a robot creates a short plan, then executes it while creating its next short plan in an iterative fashion, allowing a robot to incorporate new sensor information over time. However, this requires a robot to plan in real time. Therefore, the key challenge in making safety guarantees lies in balancing performance (how quickly a robot can plan) and conservatism (how cautiously a robot behaves). Existing methods suffer from a tradeoff between performance and conservatism, which is rooted in the choice of model used describe a robot; accuracy typically comes at the price of computation speed. To address this challenge, this dissertation proposes Reachability-based Trajectory Design (RTD), which performs real-time, receding-horizon planning with a simplified planning model, and ensures safety by describing the model error using a reachable set of the robot. RTD begins with the offline design of a continuum of parameterized trajectories for the plan- ning model; each trajectory ends with a fail-safe maneuver such as braking to a stop. RTD then computes the robot’s Forward Reachable Set (FRS), which contains all points in workspace reach- able by the robot for each parameterized trajectory. Importantly, the FRS also contains the error model, since a robot can typically never track planned trajectories perfectly. Online (at runtime), the robot intersects the FRS with sensed obstacles to provably determine which trajectory plans could cause collisions. Then, the robot performs trajectory optimization over the remaining safe trajectories. If no new safe plan can be found, the robot can execute its previously-found fail-safe maneuver, enabling perpetual safety. This dissertation begins by presenting RTD as a theoretical framework, then presents three representations of a robot’s FRS, using (1) sums-of-squares (SOS) polynomial programming, (2) zonotopes (a special type of convex polytope), and (3) rotatotopes (a generalization of zonotopes that enable representing a robot’s swept volume). To enable real-time planning, this work also de- velops an obstacle representation that enables provable safety while treating obstacles as discrete, finite sets of points. The practicality of RTD is demonstrated on four different wheeled robots (using the SOS FRS), two quadrotor aerial robots (using the zonotope FRS), and one manipulator robot (using the rotatotope FRS). Over thousands of simulations and dozens of hardware trials, RTD performs safe, real-time planning in arbitrary and challenging environments. In summary, this dissertation proposes RTD as a general purpose, practical framework for provably safe, real-time robot motion planning.PHDMechanical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/162884/1/skousik_1.pd

    Autonomous Drone Racing with Deep Reinforcement Learning

    Full text link
    In many robotic tasks, such as drone racing, the goal is to travel through a set of waypoints as fast as possible. A key challenge for this task is planning the minimum-time trajectory, which is typically solved by assuming perfect knowledge of the waypoints to pass in advance. The resulting solutions are either highly specialized for a single-track layout, or suboptimal due to simplifying assumptions about the platform dynamics. In this work, a new approach to minimum-time trajectory generation for quadrotors is presented. Leveraging deep reinforcement learning and relative gate observations, this approach can adaptively compute near-time-optimal trajectories for random track layouts. Our method exhibits a significant computational advantage over approaches based on trajectory optimization for non-trivial track configurations. The proposed approach is evaluated on a set of race tracks in simulation and the real world, achieving speeds of up to 17 m/s with a physical quadrotor

    Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning

    Get PDF
    Finding feasible, collision-free paths for multiagent systems can be challenging, particularly in non-communicating scenarios where each agent's intent (e.g. goal) is unobservable to the others. In particular, finding time efficient paths often requires anticipating interaction with neighboring agents, the process of which can be computationally prohibitive. This work presents a decentralized multiagent collision avoidance algorithm based on a novel application of deep reinforcement learning, which effectively offloads the online computation (for predicting interaction patterns) to an offline learning procedure. Specifically, the proposed approach develops a value network that encodes the estimated time to the goal given an agent's joint configuration (positions and velocities) with its neighbors. Use of the value network not only admits efficient (i.e., real-time implementable) queries for finding a collision-free velocity vector, but also considers the uncertainty in the other agents' motion. Simulation results show more than 26% improvement in paths quality (i.e., time to reach the goal) when compared with optimal reciprocal collision avoidance (ORCA), a state-of-the-art collision avoidance strategy.Ford Motor Compan

    Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning

    Get PDF
    Finding feasible, collision-free paths for multiagent systems can be challenging, particularly in non-communicating scenarios where each agent's intent (e.g. goal) is unobservable to the others. In particular, finding time efficient paths often requires anticipating interaction with neighboring agents, the process of which can be computationally prohibitive. This work presents a decentralized multiagent collision avoidance algorithm based on a novel application of deep reinforcement learning, which effectively offloads the online computation (for predicting interaction patterns) to an offline learning procedure. Specifically, the proposed approach develops a value network that encodes the estimated time to the goal given an agent's joint configuration (positions and velocities) with its neighbors. Use of the value network not only admits efficient (i.e., real-time implementable) queries for finding a collision-free velocity vector, but also considers the uncertainty in the other agents' motion. Simulation results show more than 26% improvement in paths quality (i.e., time to reach the goal) when compared with optimal reciprocal collision avoidance (ORCA), a state-of-the-art collision avoidance strategy.Ford Motor Compan
    • …
    corecore