50 research outputs found
Vehicle control from temporal logic specifications with probabilistic satisfaction guarantees
Thesis (Ph.D.)--Boston UniversityTemporal logics, such as Linear Temporal Logic (LTL) and Computation Tree Logic (CTL), have become increasingly popular for specifying complex mission specifications in motion planning and control synthesis problems. This dissertation proposes and evaluates methods and algorithms for synthesizing control strategies for different vehicle models from temporal logic specifications. Complex vehicle models that involve systems of differential equations evolving over continuous domains are considered. The goal is to synthesize control strategies that maximize the probability that the behavior of the system, in the presence of sensing and actuation noise, satisfies a given temporal logic specification.
The first part of this dissertation proposes an approach for designing a vehicle control strategy that maximizes the probability of accomplishing a motion specification given as a Probabilistic CTL (PCTL) formula. Two scenarios are examined. First, a threat-rich environment is considered when the motion of a vehicle in the environment is given as a finite transition system. Second, a noisy Dubins vehicle is considered. For both scenarios, the motion of the vehicle in the environment is modeled as a Markov Decision Process (MDP) and an approach for generating an optimal MDP control policy that maximizes the probability of satisfying the PCTL formula is introduced.
The second part of this dissertation introduces a human-supervised control synthesis method for a noisy Dubins vehicle such that the expected time to satisfy a PCTL formula is minimized, while maintaining the satisfaction probability above a given probability threshold. A method for abstracting the motion of the vehicle in the environment in the form of an MDP is presented. An algorithm for synthesizing an optimal MDP control policy is proposed. If the probability threshold cannot be satisfied with the initial specification, the presented framework revises the specifica- tion until the supervisor is satisfied with the revised specification and the satisfaction probability is above the threshold.
The third part of this dissertation focuses on the problem of stochastic control of a noisy differential drive mobile robot such that the probability of satisfying a time constrained specification, given as a Bounded LTL (BLTL) formula, is maximized. A method for mapping noisy sensor measurements to an MDP is introduced. Due to the size of the MDP, finding the exact solution is computationally too expensive. Correctness is traded for scalability, and an MDP control synthesis method based on Statistical Model Checking is introduced
Negotiating the Probabilistic Satisfaction of Temporal Logic Motion Specifications
We propose a human-supervised control synthesis method for a stochastic
Dubins vehicle such that the probability of satisfying a specification given as
a formula in a fragment of Probabilistic Computational Tree Logic (PCTL) over a
set of environmental properties is maximized. Under some mild assumptions, we
construct a finite approximation for the motion of the vehicle in the form of a
tree-structured Markov Decision Process (MDP). We introduce an efficient
algorithm, which exploits the tree structure of the MDP, for synthesizing a
control policy that maximizes the probability of satisfaction. For the proposed
PCTL fragment, we define the specification update rules that guarantee the
increase (or decrease) of the satisfaction probability. We introduce an
incremental algorithm for synthesizing an updated MDP control policy that
reuses the initial solution. The initial specification can be updated, using
the rules, until the supervisor is satisfied with both the updated
specification and the corresponding satisfaction probability. We propose an
offline and an online application of this method.Comment: 9 pages, 4 figures; The results in this paper were presented without
proofs in IEEE/RSJ International Conference on Intelligent Robots and Systems
November 3-7, 2013 at Tokyo Big Sight, Japa
Real Time Motion Planning for Path Coverage with Applications in Ocean Surveying
Ocean surveying is the acquisition of acoustic data representing various features of the seafloor and the water above it, including water depth, seafloor composition, the presence of fish, and more. Historically, this was a task performed solely by manned vessels, but with advances in robotics and sensor technology, autonomous surface vehicles (ASVs) with sonar equipment are beginning to supplement and replace their more costly crewed counterparts. The popularity of these vessels calls for advances in software to control them.
In this thesis we define the problem of path coverage to represent and generalize that of ocean surveying, and propose a real-time motion planning algorithm to solve it. We prove theorems of completeness and local asymptotic optimality regarding the proposed algorithm, and evaluate it in a simulated environment. We also discover a lack of robustness in the Dubins vehicle model when applied to real-time motion planning. We implement a model-predictive controller and other components for an autonomous surveying system, and evaluate it in simulation. The system documented in this thesis takes a step towards fully autonomous ocean surveying, and proposes further extensions that get even closer to that goal
Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions
This paper presents a data-driven approach for multi-robot coordination in
partially-observable domains based on Decentralized Partially Observable Markov
Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a
general framework for cooperative sequential decision making under uncertainty
and MAs allow temporally extended and asynchronous action execution. To date,
most methods assume the underlying Dec-POMDP model is known a priori or a full
simulator is available during planning time. Previous methods which aim to
address these issues suffer from local optimality and sensitivity to initial
conditions. Additionally, few hardware demonstrations involving a large team of
heterogeneous robots and with long planning horizons exist. This work addresses
these gaps by proposing an iterative sampling based Expectation-Maximization
algorithm (iSEM) to learn polices using only trajectory data containing
observations, MAs, and rewards. Our experiments show the algorithm is able to
achieve better solution quality than the state-of-the-art learning-based
methods. We implement two variants of multi-robot Search and Rescue (SAR)
domains (with and without obstacles) on hardware to demonstrate the learned
policies can effectively control a team of distributed robots to cooperate in a
partially observable stochastic environment.Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS 2017
Recommended from our members
Approaches to Safety in Inverse Reinforcement Learning
As the capabilities of robotic systems increase, we move closer to the vision of ubiquitous robotic assistance throughout our everyday lives. In transitioning robots and autonomous systems from traditional factory and industrial settings, it is critical that these systems are able to adapt to uncertain environments and the humans who populate them. In order to better understand and predict the behavior of these humans, Inverse Reinforcement Learning (IRL) uses demonstrations to infer the underlying motivations driving human actions. The information gained from IRL can be used to improve a robot’s understanding of the environment as well as to allow the robot to better interact with or assist humans.In this dissertation, we address the challenge of incorporating safety into the application of IRL. We first consider safety in the context of using IRL for assisting humans in shared control tasks. Through a user study, we show how incorporating haptic feedback into human assistance can increase humans’ sense of control while improving safety in the presence of imperfect learning. Further, we present our method for using IRL to automatically create such haptic feedback policies from task demonstrations.We further address safety in IRL by incorporating notions of safety directly into the learning process. Currently, most work on IRL focuses on learning explanatory rewards that humans are modeled as optimizing. However, pure reward optimization can fail to effectively capture hard requirements, such as safety constraints. We draw on the definition of safety from Hamilton-Jacobi reachability analysis to infer human perceptions of safety and to modify robot behavior to respect these learned safety constraints. We also extend this work on learning constraints by adapting the framework of Maximum Entropy IRL in order to learn hard constraints given nominal task rewards, and we show how this technique infers the most likely constraints to align expected behavior with observed demonstrations
Generalized Sampling-Based Feedback Motion Planners
The motion planning problem can be formulated as a Markov decision process (MDP), if the uncertainties in the robot motion and environments can be modeled probabilistically. The complexity of solving these MDPs grow exponentially as the dimension of the problem increases and hence, it is nearly impossible to solve the problem even without constraints. Using hierarchical methods, these MDPs can be transformed into a semi-Markov decision process (SMDP) which only needs to be solved at certain landmark states. In the deterministic robotics motion planning community, sampling based algorithms like probabilistic roadmaps (PRM) and rapidly exploring random trees (RRTs) have been successful in solving very high dimensional deterministic problem. However they are not robust to system with uncertainties in the system dynamics and hence, one of the primary objective of this work is to generalize PRM/RRT to solve motion planning with uncertainty.
We first present generalizations of randomized sampling based algorithms PRM and RRT, to incorporate the process uncertainty, and obstacle location uncertainty, termed as "generalized PRM" (GPRM) and "generalized RRT" (GRRT). The controllers used at the lower level of these planners are feedback controllers which ensure convergence of trajectories while mitigating the effects of process uncertainty. The results indicate that the algorithms solve the motion planning problem for a single agent in continuous state/control spaces in the presence of process uncertainty, and constraints such as obstacles and other state/input constraints.
Secondly, a novel adaptive sampling technique, termed as "adaptive GPRM" (AGPRM), is proposed for these generalized planners to increase the efficiency and overall success probability of these planners. It was implemented on high-dimensional robot n-link manipulators, with up to 8 links, i.e. in a 16-dimensional state-space. The results demonstrate the ability of the proposed algorithm to handle the motion planning problem for highly non-linear systems in very high-dimensional state space.
Finally, a solution methodology, termed the "multi-agent AGPRM" (MAGPRM), is proposed to solve the multi-agent motion planning problem under uncertainty. The technique uses a existing solution technique to the multiple traveling salesman problem (MTSP) in conjunction with GPRM. For real-time implementation, an ?inter-agent collision detection and avoidance? module was designed which ensures that no two agents collide at any time-step. Algorithm was tested on teams of homogeneous and heterogeneous agents in cluttered obstacle space and the algorithm demonstrate the ability to handle such problems in continuous state/control spaces in presence of process uncertainty
Recommended from our members
Game-Theoretic Safety Assurance for Human-Centered Robotic Systems
In order for autonomous systems like robots, drones, and self-driving cars to be reliably introduced into our society, they must have the ability to actively account for safety during their operation. While safety analysis has traditionally been conducted offline for controlled environments like cages on factory floors, the much higher complexity of open, human-populated spaces like our homes, cities, and roads makes it unviable to rely on common design-time assumptions, since these may be violated once the system is deployed. Instead, the next generation of robotic technologies will need to reason about safety online, constructing high-confidence assurances informed by ongoing observations of the environment and other agents, in spite of models of them being necessarily fallible.This dissertation aims to lay down the necessary foundations to enable autonomous systems to ensure their own safety in complex, changing, and uncertain environments, by explicitly reasoning about the gap between their models and the real world. It first introduces a suite of novel robust optimal control formulations and algorithmic tools that permit tractable safety analysis in time-varying, multi-agent systems, as well as safe real-time robotic navigation in partially unknown environments; these approaches are demonstrated on large-scale unmanned air traffic simulation and physical quadrotor platforms. After this, it draws on Bayesian machine learning methods to translate model-based guarantees into high-confidence assurances, monitoring the reliability of predictive models in light of changing evidence about the physical system and surrounding agents. This principle is first applied to a general safety framework allowing the use of learning-based control (e.g. reinforcement learning) for safety-critical robotic systems such as drones, and then combined with insights from cognitive science and dynamic game theory to enable safe human-centered navigation and interaction; these techniques are showcased on physical quadrotors—flying in unmodeled wind and among human pedestrians—and simulated highway driving. The dissertation ends with a discussion of challenges and opportunities ahead, including the bridging of safety analysis and reinforcement learning and the need to ``close the loop'' around learning and adaptation in order to deploy increasingly advanced autonomous systems with confidence
Threat assessment for safe navigation in environments with uncertainty in predictability
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2011.Cataloged from PDF version of thesis.Includes bibliographical references (p. 213-224).This thesis develops threat assessment algorithms to improve the safety of the decision making of autonomous and human-operated vehicles navigating in dynamic and uncertain environments, where the source of uncertainty is in the predictability of the nearby vehicles' future trajectories. The first part of the thesis introduces two classes of algorithms to classify drivers behaviors at roads intersections based on Support Vector Machines (SVM) and Hidden Markov Models (HMM). These algorithms are successfully validated using a large real-world intersection dataset, and can be used as part of future driver assistance systems. They are also compared to three popular traditional methods, and the results show significant and consistent improvements with the developed algorithms. The second part of the thesis presents an efficient trajectory prediction algorithm that has been developed to improve the performance of future collision avoidance and detection systems. The proposed approach, RR-GP, combines the Rapidly-exploring Random Trees (RRT) based algorithm, RRT-Reach, with mixtures of Gaussian Processes (GP) to compute dynamically feasible paths, in real-time, while embedding the flexibility of GP's nonparametric Bayesian model. RR-GP efficiently approximates the reachability sets of surrounding vehicles, and is shown in simulation and on naturalistic data to improve the performance over two standard GP-based algorithms. The third part introduces new path planning algorithms that build upon the tools that have been previously introduced in this thesis. The focus is on safe autonomous navigation in the presence of other vehicles with uncertain motion patterns. First, it presents a new threat assessment module (TAM) that combines the RRT-Reach algorithm with an SVM-based intention predictor, to develop a threat-aware path planner. The strengths of this approach are demonstrated through simulation and experiments performed in the MIT RAVEN testbed. Second, another novel path planning technique is developed by integrating the RR-GP trajectory prediction algorithm with a state-of-the-art chance-constrained RRT planner. This framework provides several theoretical guarantees on the probabilistic satisfaction of collision avoidance constraints. Extensive simulation results show that the resulting approach can be used in real-time to efficiently and accurately execute safe paths. The last part of the thesis considers the decision-making problem for a human-driven vehicle crossing a road intersection in the presence of other, potentially errant, drivers. The proposed approach uses the TAM framework to compute the threat level in real-time, and provides the driver with a warning signal and the best escape maneuver through the intersection. Experimental results with small autonomous and human-driven vehicles in the RAVEN testbed demonstrate that this approach can be successfully used in real-time to minimize the risk of collision in urban-like environments.by Georges Salim Aoudé.Ph.D