27,993 research outputs found

    Automata guided hierarchical reinforcement learning for zero-shot skill composition

    A major obstacle to the wide adoption of (deep) reinforcement learning (RL) in control systems is its need for a large number of interactions with the environment in order to master a skill. The learned skill usually generalizes poorly across domains, and re-training is often necessary when a new task is presented. We present a framework that combines formal methods with hierarchical reinforcement learning (HRL). The set of techniques we provide allows for convenient specification of tasks with complex logic, learns hierarchical policies (a meta-controller and low-level controllers) with well-defined intrinsic rewards using any RL method, and constructs new skills from existing ones without additional learning. We evaluate the proposed methods in a simple grid-world simulation as well as in simulation on a Baxter robot.

    Formal methods paradigms for estimation and machine learning in dynamical systems

    Formal methods are widely used in engineering to determine whether a system exhibits a certain property (verification) or to design controllers that are guaranteed to drive the system to achieve a certain property (synthesis). Most existing techniques require a large amount of accurate information about the system in order to be successful. The methods presented in this work can operate with significantly less prior information. In the domain of formal synthesis for robotics, the assumptions of perfect sensing and perfect knowledge of system dynamics are unrealistic. To address this issue, we present control algorithms that use active estimation and reinforcement learning to mitigate the effects of uncertainty. In the domain of cyber-physical system analysis, we relax the assumption that the system model is known and identify system properties automatically from execution data. First, we address the problem of planning the path of a robot under temporal logic constraints (e.g. "avoid obstacles and periodically visit a recharging station") while simultaneously minimizing the uncertainty about the state of an unknown feature of the environment (e.g. locations of fires after a natural disaster). We present synthesis algorithms and evaluate them via simulation and experiments with aerial robots. Second, we develop a new specification language for tasks that require gathering information about and interacting with a partially observable environment, e.g. "maintain localization error below a certain level while also avoiding obstacles." Third, we consider learning temporal logic properties of a dynamical system from a finite set of system outputs. For example, given maritime surveillance data, we wish to find the specification that corresponds only to those vessels that are deemed law-abiding. Algorithms for performing off-line supervised and unsupervised learning and on-line supervised learning are presented.
    Finally, we consider the case in which we want to steer a system with unknown dynamics to satisfy a given temporal logic specification. We present a novel reinforcement learning paradigm to solve this problem. Our procedure gives "partial credit" for executions that almost satisfy the specification, which can lead to faster convergence rates and produce better solutions when the specification is not satisfiable.
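
    The "partial credit" idea in the abstract above can be illustrated with a quantitative score that is positive when a run satisfies a specification and only slightly negative when it narrowly misses. The sketch below is not taken from the work itself: it assumes a one-dimensional trajectory, a hypothetical specification "eventually come within `goal_tol` of `goal` and always stay at least `safe_dist` from `obstacle`", and the usual min/max convention for such scores. All names and thresholds are illustrative.

```python
from typing import Sequence


def robustness(xs: Sequence[float], goal: float, obstacle: float,
               goal_tol: float = 0.5, safe_dist: float = 1.0) -> float:
    """Score a 1-D trajectory against a hypothetical specification:
    'eventually within goal_tol of goal AND always at least safe_dist
    from obstacle'. Positive means satisfied, negative means violated,
    and the magnitude says by how much."""
    # "Eventually": the best (largest) margin over all time steps.
    reach = max(goal_tol - abs(x - goal) for x in xs)
    # "Always": the worst (smallest) margin over all time steps.
    avoid = min(abs(x - obstacle) - safe_dist for x in xs)
    # Conjunction: the weaker of the two margins.
    return min(reach, avoid)


# Illustrative trajectories (assumed data, not from the source).
success = [0.0, 1.0, 2.0, 3.0, 5.0]    # reaches the goal
near_miss = [0.0, 1.0, 2.0, 3.0, 4.4]  # stops just short of the goal
far_miss = [0.0, 0.0, 0.0]             # never moves

scores = {
    name: robustness(traj, goal=5.0, obstacle=-3.0)
    for name, traj in [("success", success),
                       ("near_miss", near_miss),
                       ("far_miss", far_miss)]
}
```

    Used as a reward signal, such a score ranks the near miss above the run that never moves, even though both violate the specification, which is one plausible reading of how partial credit could speed up learning.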

    Integrating model checking with HiP-HOPS in model-based safety analysis

    The ability to perform an effective and robust safety analysis on the design of modern safety-critical systems is crucial. Model-based safety analysis (MBSA) has been introduced in recent years to support the assessment of complex system design by focusing on the system model as the central artefact, and by automating the synthesis and analysis of failure-extended models. Model checking and failure logic synthesis and analysis (FLSA) are two prominent MBSA paradigms. Extensive research has placed emphasis on the development of these techniques, but discussion of their integration remains limited. In this paper, we propose a technique in which model checking and Hierarchically Performed Hazard Origin and Propagation Studies (HiP-HOPS), an advanced FLSA technique, can be applied synergistically to benefit the MBSA process. The application of the technique is illustrated through an example of a brake-by-wire system.

    Multi-agent persistent surveillance under temporal logic constraints

    This thesis proposes algorithms for the deployment of multiple autonomous agents for persistent surveillance missions requiring repeated, periodic visits to regions of interest. Such problems arise in a variety of domains, such as monitoring ocean conditions like temperature and algae content, performing crowd security during public events, tracking wildlife in remote or dangerous areas, or watching traffic patterns and road conditions. Using robots for surveillance is an attractive solution to scenarios in which fixed sensors are not sufficient to maintain situational awareness. Multi-agent solutions are particularly promising, because they allow for improved spatial and temporal resolution of sensor information. In this work, we consider persistent monitoring by teams of agents that are tasked with satisfying missions specified using temporal logic formulas. Such formulas allow rich, complex tasks to be specified, such as "visit regions A and B infinitely often, and if region C is visited then go to region D, and always avoid obstacles." The agents must determine how to satisfy such missions according to fuel, communication, and other constraints. Such problems are inherently difficult due to the typically infinite horizon, state space explosion from planning for multiple agents, communication constraints, and other issues. Therefore, computing an optimal solution to these problems is often infeasible. Instead, a balance must be struck between computational complexity and optimality. This thesis describes solution methods for two main classes of multi-agent persistent surveillance problems. First, it considers the class of problems in which persistent surveillance goals are captured entirely by TL constraints. Such problems require agents to repeatedly visit a set of surveillance regions in order to satisfy their mission. We present results for agents solving such missions with charging constraints, with noisy observations, and in the presence of adversaries. 
    The second class of problems includes an additional optimality criterion, such as minimizing uncertainty about the location of a target or maximizing sensor information among the team of agents. We present solution methods and results for such missions with a variety of optimality criteria based on information metrics. For both classes of problems, the proposed algorithms are implemented and evaluated via simulation, experiments with robots in a motion capture environment, or both.
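
    The example mission quoted in the abstract above ("visit regions A and B infinitely often, and if region C is visited then go to region D, and always avoid obstacles") can be written as a linear temporal logic formula. The rendering below is one plausible formalization, not the thesis's own notation: it assumes atomic propositions $A$, $B$, $C$, $D$ and $\mathit{obs}$ that hold when the agent is in the corresponding region or in an obstacle, with $\mathbf{G}$ read as "always" and $\mathbf{F}$ as "eventually".

```latex
\varphi \;=\; \mathbf{G}\,\mathbf{F}\,A \;\wedge\; \mathbf{G}\,\mathbf{F}\,B
        \;\wedge\; \mathbf{G}\,\bigl(C \rightarrow \mathbf{F}\,D\bigr)
        \;\wedge\; \mathbf{G}\,\neg\,\mathit{obs}
```

    Here $\mathbf{G}\,\mathbf{F}\,A$ encodes "infinitely often A", and the third conjunct reads "whenever C is visited, D is eventually visited afterwards".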

    Reactive task planning for multi-robot systems in partial known environment

    The thesis investigates the planning and control problem for a group of mobile agents moving in a partially known workspace. A task is assigned to each robot in the form of a linear temporal logic (LTL) formula. First, an automaton-based method is introduced for the motion planning of a single agent, which guarantees the satisfaction of the assigned LTL task. Then, a model-predictive controller accounts for state and input constraints, leading the agent to safe navigation. For a realistic scenario in which the environment is only partially known and agents have only local sensing, two decentralized control strategies are proposed for online re-planning, both of which rely on a sampling-based algorithm. The first approach assumes local communication between agents, while the second addresses the more general communication-free case. Finally, the human-in-the-loop scenario is considered, in which a human may additionally take control of the agents; a mixed-initiative controller is then implemented to prevent dangerous human behaviors while guaranteeing the satisfaction of the LTL specification. Using the developed ROS software package, several experiments were carried out to demonstrate the effectiveness and potential applicability of the proposed strategies.