72 research outputs found

    Reinforcement Learning With High-Level Task Specifications

    Reinforcement learning (RL) has been widely used, for example, in robotics, recommendation systems, and financial services. Existing RL algorithms typically optimize reward-based surrogates rather than the task performance itself. Therefore, they suffer from several shortcomings in providing guarantees for the task performance of the learned policies: An optimal policy for a surrogate objective may not have optimal task performance. A reward function that helps achieve satisfactory task performance in one environment may not transfer well to another environment. RL algorithms tackle nonlinear and nonconvex optimization problems and may, in general, not be able to find globally optimal policies. The goal of this dissertation is to develop RL algorithms that explicitly account for formal high-level task specifications and equip the learned policies with provable guarantees for the satisfaction of these specifications. The resulting RL and inverse RL algorithms utilize multiple representations of task specifications, including conventional reward functions, expert demonstrations, temporal logic formulas, trajectory-based constraint functions, as well as their combinations. These algorithms offer several promising capabilities. First, they automatically generate a memory transition system, which is critical for tasks that cannot be implemented by memoryless policies. Second, the formal specifications can act as reliable performance criteria for the learned policies regardless of the quality of the designed reward functions and variations in the underlying environments. Third, the algorithms enable online RL that never violates critical task and safety requirements, even during exploration.

    Accelerated Risk Assessment And Domain Adaptation For Autonomous Vehicles

    Autonomous vehicles (AVs) are already driving on public roads around the US; however, their rate of deployment far outpaces quality assurance and regulatory efforts. Consequently, even the most elementary tasks, such as automated lane keeping, have not been certified for safety, and operations are constrained to narrow domains. First, due to the limitations of worst-case analysis techniques, we hypothesize that new methods must be developed to quantify and bound the risk of AVs. Counterintuitively, the better the performance of the AV under consideration, the harder it is to accurately estimate its risk as failures become rare and difficult to sample. This thesis presents a new estimation procedure and framework that can efficiently evaluate an AV's risk even in the rare event regime. We demonstrate the approach's performance on a variety of AV software stacks. Second, given a framework for AV evaluation, we turn to a related question: how can AV software be efficiently adapted for new or expanded operating conditions? We hypothesize that stochastic search techniques can improve the naive trial-and-error approach commonly used today. One of the most challenging aspects of this task is that proficient driving requires making tradeoffs between performance and safety. Moreover, for novel scenarios or operational domains there may be little data that can be used to understand the behavior of other drivers. To study these challenges we create a low-cost scale platform, simulator, benchmarks, and baseline solutions. Using this testbed, we develop a new population-based self-play method for creating dynamic actors and detail both offline and online procedures for adapting AV components to these conditions. Taken as a whole, this work represents a rigorous approach to the evaluation and improvement of AV software.

    A Posture Sequence Learning System for an Anthropomorphic Robotic Hand

    The paper presents a cognitive architecture for posture learning of an anthropomorphic robotic hand. Our approach aims to allow the robotic system to perform complex perceptual operations, to interact with a human user, and to integrate its perceptions through a cognitive representation of the scene and the observed actions. The anthropomorphic robotic hand imitates the gestures acquired by the vision system in order to learn meaningful movements, to build its knowledge through different conceptual spaces, and to perform complex interaction with the human operator.

    Assuring Safety under Uncertainty in Learning-Based Control Systems

    Learning-based controllers have recently shown impressive results for different robotic tasks in well-defined environments, successfully solving a Rubik's Cube and sorting objects in a bin. These advancements promise to enable a host of new capabilities for complex robotic systems. However, these learning-based controllers cannot yet be deployed in highly uncertain environments due to significant issues relating to learning reliability, robustness, and safety. To overcome these issues, this thesis proposes new methods for integrating model information (e.g. model-based control priors) into the reinforcement learning framework, which is crucial to ensuring reliability and safety. I show, both empirically and theoretically, that this model information greatly reduces variance in learning and can effectively constrain the policy search space, thus enabling significant improvements in sample complexity for the underlying RL algorithms. Furthermore, by leveraging control barrier functions and Gaussian process uncertainty models, I show how system safety can be maintained under uncertainty without interfering with the learning process (e.g. distorting the policy gradients). The last part of the thesis will discuss fundamental limitations that arise when utilizing machine learning to derive safety guarantees. In particular, I show that widely used uncertainty models can be highly inaccurate when predicting rare events, and examine the implications of this for safe learning. To overcome some of these limitations, a novel framework is developed based on assume-guarantee contracts in order to ensure safety in multi-agent human environments. The proposed approach utilizes contracts to impose loose responsibilities on agents in the environment, which are learned from data. Imposing these responsibilities on agents, rather than treating their uncertainty as a purely random process, allows us to achieve both safety and efficiency in interactions.

    Computer Aided Verification

    This open access two-volume set LNCS 10980 and 10981 constitutes the refereed proceedings of the 30th International Conference on Computer Aided Verification, CAV 2018, held in Oxford, UK, in July 2018. The 52 full and 13 tool papers presented together with 3 invited papers and 2 tutorials were carefully reviewed and selected from 215 submissions. The papers cover a wide range of topics and techniques, from algorithmic and logical foundations of verification to practical applications in distributed, networked, cyber-physical, and autonomous systems. They are organized in topical sections on model checking, program analysis using polyhedra, synthesis, learning, runtime verification, hybrid and timed systems, tools, probabilistic systems, static analysis, theory and security, SAT, SMT and decision procedures, concurrency, and CPS, hardware, industrial applications.

    Tracking the Temporal-Evolution of Supernova Bubbles in Numerical Simulations

    The study of low-dimensional, noisy manifolds embedded in a higher dimensional space has been extremely useful in many applications, from the chemical analysis of multi-phase flows to simulations of galactic mergers. Building a probabilistic model of the manifolds has helped in describing their essential properties and how they vary in space. However, when the manifold is evolving through time, a joint spatio-temporal modelling is needed, in order to fully comprehend its nature. We propose a first-order Markovian process that propagates the spatial probabilistic model of a manifold at fixed time, to its adjacent temporal stages. The proposed methodology is demonstrated using a particle simulation of an interacting dwarf galaxy to describe the evolution of a cavity generated by a Supernov

    BNAIC 2008: Proceedings of BNAIC 2008, the twentieth Belgian-Dutch Artificial Intelligence Conference
