    End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

    Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real-world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process. Comment: Published in AAAI 201
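
    The idea of a CBF acting as a safety layer on top of a learned controller can be illustrated with a minimal sketch. This is not the RL-CBF algorithm from the paper: it assumes known scalar dynamics x' = u, a hypothetical safe set x <= x_max, and a fixed stand-in for the RL policy, and it omits the GP model of the unknown dynamics entirely. All names (`cbf_filter`, `simulate`, `aggressive_policy`, `alpha`, `x_max`) are illustrative assumptions.

```python
def cbf_filter(u_nominal, x, x_max=1.0, alpha=2.0):
    """Return the action closest to u_nominal that satisfies the CBF condition.

    Assumed toy setup: dynamics x' = u and barrier h(x) = x_max - x.
    The condition dh/dt + alpha * h >= 0 reduces to u <= alpha * (x_max - x),
    so the usual quadratic program min (u - u_nominal)^2 has a closed form.
    """
    return min(u_nominal, alpha * (x_max - x))


def simulate(policy, x0=0.0, dt=0.01, steps=500, x_max=1.0, alpha=2.0):
    """Forward-Euler rollout with every policy action passed through the filter."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        u = cbf_filter(policy(x), x, x_max, alpha)
        xs.append(x + dt * u)
    return xs


def aggressive_policy(x):
    """Stand-in for an exploring RL policy that always pushes toward the boundary."""
    return 5.0


trajectory = simulate(aggressive_policy)
```

    With these assumed values (alpha * dt <= 1), the barrier value shrinks by a factor (1 - alpha * dt) per step at worst, so the state approaches x_max without crossing it, while actions that are already safe pass through unchanged.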

    A decentralized motion coordination strategy for dynamic target tracking

    This paper presents a decentralized motion planning algorithm for the distributed sensing of a noisy dynamical process by multiple cooperating mobile sensor agents. This problem is motivated by localization and tracking tasks of dynamic targets. Our gradient-descent method is based on a cost function that measures the overall quality of sensing. We also investigate the role of imperfect communication between sensor agents in this framework, and examine the trade-offs in performance between sensing and communication. Simulations illustrate the basic characteristics of the algorithms.

    Modelling and experimental investigation of carangiform locomotion for control

    We propose a model for planar carangiform swimming based on conservative equations for the interaction of a rigid body and an incompressible fluid. We account for the generation of thrust due to vortex shedding through controlled coupling terms. We investigate the correct form of this coupling experimentally with a robotic propulsor, comparing its observed behavior to that predicted by unsteady hydrodynamics. Our analysis of thrust generation by an oscillating hydrofoil allows us to characterize and evaluate certain families of gaits. Our final swimming model takes the form of a control-affine nonlinear system.

    High-temperature scaling limit for directed polymers on a hierarchical lattice with bond disorder

    Diamond "lattices" are sequences of recursively-defined graphs that provide a network of directed pathways between two fixed root nodes, $A$ and $B$. The construction recipe for diamond graphs depends on a branching number $b\in \mathbb{N}$ and a segmenting number $s\in \mathbb{N}$, for which a larger value of the ratio $s/b$ intuitively corresponds to more opportunities for intersections between two randomly chosen paths. By attaching i.i.d. random variables to the bonds of the graphs, I construct a random Gibbs measure on the set of directed paths by assigning each path an "energy" given by summing the random variables along the path. For the case $b=s$, I propose a scaling regime in which the temperature grows along with the number of hierarchical layers of the graphs, and the partition function (the normalization factor of the Gibbs measure) appears to converge in law. I prove that all of the positive integer moments of the partition function converge in this limiting regime. The motivation of this work is to prove a functional limit theorem that is analogous to a previous result obtained in the $b<s$ case. Comment: 28 pages, 1 figure
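
    The Gibbs measure and partition function described in this abstract can be written out as follows. The notation is an assumption made here for illustration and is not taken from the paper: $\Gamma_n$ denotes the set of directed paths in the $n$-th diamond graph, $\omega_a$ the random variable attached to bond $a$, and $\beta$ the inverse temperature. The sign convention in the exponent is likewise assumed.

```latex
% Energy of a path p: the sum of the bond variables along it
E_n(p) = \sum_{a \in p} \omega_a ,
\qquad
% Partition function: the normalization factor of the Gibbs measure
Z_n(\beta) = \sum_{p \in \Gamma_n} e^{\beta E_n(p)} ,
\qquad
% Random Gibbs measure on directed paths
\mu_n(p) = \frac{e^{\beta E_n(p)}}{Z_n(\beta)} .
```

    In this notation, a temperature that grows with the number of layers corresponds to an inverse temperature $\beta = \beta_n$ that tends to zero as $n \to \infty$.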

    Situational reasoning for road driving in an urban environment

    Robot navigation in urban environments requires situational reasoning. Given the complexity of the environment and the behavior specified by traffic rules, it is necessary to recognize the current situation to impose the correct traffic rules. In an attempt to manage the complexity of the situational reasoning subsystem, this paper describes a finite state machine model to govern the situational reasoning process. The logic state machine and its interaction with the planning system are discussed. The approach was implemented on Alice, Team Caltech’s entry into the 2007 DARPA Urban Challenge. Results from the qualifying rounds are discussed. The approach is validated and the shortcomings of the implementation are identified.

    Linear models for control of cavity flow oscillations

    Models for understanding and controlling oscillations in the flow past a rectangular cavity are developed. These models may be used to guide control designs, to understand performance limits of feedback, and to interpret experimental results. Traditionally, cavity oscillations are assumed to be self-sustained: no external disturbances are necessary to maintain the oscillations, and amplitudes are limited by nonlinearities. We present experimental data suggesting that in some regimes, the oscillations may not be self-sustained, but lightly damped: oscillations are sustained by external forcing, such as boundary-layer turbulence. In these regimes, linear models suffice to describe the behaviour, and the final amplitude of oscillations depends on the characteristics of the external disturbances. These linear models are particularly appropriate for describing cavities in which feedback has been used for noise suppression, as the oscillations are small and nonlinearities are less likely to be important. It is shown that increasing the gain too much in such feedback control experiments can lead to a peak-splitting phenomenon, which is explained by the linear models. Fundamental performance limits indicate that peak splitting is likely to occur for narrow-bandwidth actuators and controllers.

    POD Based Models of Self-Sustained Oscillations in the Flow Past an Open Cavity

    The goal of this work is to provide accurate dynamical models of oscillations in the flow past a rectangular cavity, for the purpose of bifurcation analysis and control. We have performed an extensive set of direct numerical simulations which provide the data used to derive and evaluate the models. Based on the method of Proper Orthogonal Decomposition (POD) and Galerkin projection, we obtain low-order models (from 6 to 60 states) which capture the dynamics very accurately over a few periods of oscillation, but deviate over long times.
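
    As a rough illustration of the POD step mentioned in this abstract, the sketch below extracts modes from a snapshot matrix with a singular value decomposition. It is a generic snapshot-POD sketch, not the authors' procedure: the data are a synthetic traveling wave rather than cavity-flow simulations, the function name `pod_modes` is hypothetical, and the Galerkin projection that turns modes into a dynamical model is not included.

```python
import numpy as np


def pod_modes(snapshots, n_modes):
    """Compute POD modes of a snapshot matrix of shape (n_points, n_snapshots).

    Returns the temporal mean, the leading n_modes spatial modes as columns,
    and the fraction of fluctuation energy captured by each of those modes.
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    fluctuations = snapshots - mean
    # Left singular vectors are the POD modes; squared singular values
    # are proportional to the energy each mode captures.
    U, s, _ = np.linalg.svd(fluctuations, full_matrices=False)
    energy = s**2 / np.sum(s**2)
    return mean, U[:, :n_modes], energy[:n_modes]


# Synthetic data (an assumption for illustration): a traveling wave sin(x - t),
# which is exactly a combination of two spatial structures, sin(x) and cos(x).
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
t = np.linspace(0.0, 4.0 * np.pi, 200, endpoint=False)
snapshots = np.sin(x[:, None] - t[None, :])

mean, modes, energy = pod_modes(snapshots, n_modes=2)
```

    For this synthetic wave, two modes capture essentially all of the fluctuation energy. Real flow data would typically need more modes, consistent with the 6 to 60 states quoted in the abstract.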

    Comparison of Rhizon Sampling and Whole Round Squeezing for Marine Sediment Porewater

    The collection and chemical analysis of sedimentary porewater is central to many marine studies. Porewater alkalinity, dissolved inorganic carbon (DIC), sulfate, nitrate, and other dissolved ions are used to identify and determine rates of geochemical reactions and microbial respiration pathways, such as sulfate reduction and denitrification (Froelich et al., 1979; Berner, 1980; Gieskes et al., 1986; D’Hondt et al., 2004; Schulz, 2006; Martin and Sayles, 2007). Ammonium is critical for understanding microbial respiration and the nitrogen cycle (Blackburn, 1988). Chloride is used to reconstruct ocean salinity variations, constrain flow rates, and estimate gas hydrate concentrations (Paull et al., 1996; Adkins et al., 2002; Spivack et al., 2002). Each of these studies requires the recovery of porewater that is not compromised by sampling artifacts.