End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks
Reinforcement Learning (RL) algorithms have found limited success beyond
simulated applications, and one main reason is the absence of safety guarantees
during the learning process. Real world systems would realistically fail or
break before an optimal controller can be learned. To address this issue, we
propose a controller architecture that combines (1) a model-free RL-based
controller with (2) model-based controllers utilizing control barrier functions
(CBFs) and (3) on-line learning of the unknown system dynamics, in order to
ensure safety during learning. Our general framework leverages the success of
RL algorithms to learn high-performance controllers, while the CBF-based
controllers both guarantee safety and guide the learning process by
constraining the set of explorable policies. We utilize Gaussian Processes (GPs)
to model the system dynamics and its uncertainties.
Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high
probability during the learning process, regardless of the RL algorithm used,
and demonstrates greater policy exploration efficiency. We test our algorithm
on (1) control of an inverted pendulum and (2) autonomous car-following with
wireless vehicle-to-vehicle communication, and show that our algorithm attains
much greater sample efficiency in learning than other state-of-the-art
algorithms and maintains safety during the entire learning process. Comment: Published in AAAI 201
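The safety-filtering idea in the abstract above can be illustrated with a minimal sketch. It assumes known control-affine dynamics and a single barrier function h(x) >= 0, and it minimally corrects a proposed RL action so that the standard CBF condition L_f h + L_g h . u + alpha * h >= 0 holds. The function name, arguments, and the closed-form projection are illustrative assumptions; the sketch does not reproduce the paper's RL-CBF algorithm and leaves out the GP model of the uncertain dynamics.

```python
def cbf_safety_filter(u_rl, lf_h, lg_h, h, alpha=1.0):
    """Return the control closest to u_rl (in Euclidean norm) that satisfies
    lf_h + lg_h . u + alpha * h >= 0.

    Hypothetical helper: lf_h and lg_h are the Lie derivatives of the barrier
    function h along the drift and control directions, evaluated at the
    current state and assumed to be known exactly.
    """
    # Constraint value at the proposed RL action; non-negative means safe.
    slack = lf_h + sum(a * u for a, u in zip(lg_h, u_rl)) + alpha * h
    if slack >= 0.0:
        return list(u_rl)

    norm_sq = sum(a * a for a in lg_h)
    if norm_sq == 0.0:
        raise ValueError("control cannot affect the barrier at this state")

    # Project u_rl onto the boundary of the half-space defined by the constraint.
    scale = -slack / norm_sq
    return [u + scale * a for u, a in zip(u_rl, lg_h)]


# Example: single integrator x' = u with barrier h(x) = 1 - x (keep x <= 1).
# At x = 0.5: lf_h = 0, lg_h = [-1], h = 0.5.
safe_u = cbf_safety_filter([2.0], 0.0, [-1.0], 0.5)
```

With a single linear constraint the underlying quadratic program has this closed-form solution; several barrier functions or actuator limits would call for a QP solver instead.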
A decentralized motion coordination strategy for dynamic target tracking
This paper presents a decentralized motion planning
algorithm for the distributed sensing of a noisy dynamical
process by multiple cooperating mobile sensor agents. This
problem is motivated by localization and tracking tasks of
dynamic targets. Our gradient-descent method is based on a
cost function that measures the overall quality of sensing. We
also investigate the role of imperfect communication between
sensor agents in this framework, and examine the trade-offs in
performance between sensing and communication. Simulations
illustrate the basic characteristics of the algorithms.
Modelling and experimental investigation of carangiform locomotion for control
We propose a model for planar carangiform swimming based on conservative equations for the interaction of a rigid body and an incompressible fluid. We account for the generation of thrust due to vortex shedding through controlled coupling terms. We investigate the correct form of this coupling experimentally with a robotic propulsor, comparing its observed behavior to that predicted by unsteady hydrodynamics. Our analysis of thrust generation by an oscillating hydrofoil allows us to characterize and evaluate certain families of gaits. Our final swimming model takes the form of a control-affine nonlinear system.
High-temperature scaling limit for directed polymers on a hierarchical lattice with bond disorder
Diamond "lattices" are sequences of recursively-defined graphs that provide a
network of directed pathways between two fixed root nodes, and . The
construction recipe for diamond graphs depends on a branching number and a segmenting number , for which a larger value
of the ratio intuitively corresponds to more opportunities for
intersections between two randomly chosen paths. By attaching i.i.d. random
variables to the bonds of the graphs, I construct a random Gibbs measure on the
set of directed paths by assigning each path an "energy" given by summing the
random variables along the path. For the case , I propose a scaling regime
in which the temperature grows along with the number of hierarchical layers of
the graphs, and the partition function (the normalization factor of the Gibbs
measure) appears to converge in law. I prove that all of the positive integer
moments of the partition function converge in this limiting regime. The
motivation of this work is to prove a functional limit theorem that is
analogous to a previous result obtained in the case. Comment: 28 pages, 1 figure
Situational reasoning for road driving in an urban environment
Robot navigation in urban environments requires situational reasoning.
Given the complexity of the environment and the behavior specified by traffic
rules, it is necessary to recognize the current situation to impose the correct
traffic rules. In an attempt to manage the complexity of the situational reasoning
subsystem, this paper describes a finite state machine model to govern the situational
reasoning process. The logic state machine and its interaction with the
planning system are discussed. The approach was implemented on Alice, Team
Caltech’s entry into the 2007 DARPA Urban Challenge. Results from the qualifying
rounds are discussed. The approach is validated and the shortcomings of
the implementation are identified.
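A finite state machine of the kind described above might be sketched as a transition table. The state and event names below are hypothetical and chosen only for illustration; they are not taken from the Alice implementation.

```python
# Hypothetical situational states and events; not the states used on Alice.
TRANSITIONS = {
    ("ROAD_FOLLOWING", "approach_intersection"): "INTERSECTION_WAIT",
    ("INTERSECTION_WAIT", "right_of_way_clear"): "INTERSECTION_CROSS",
    ("INTERSECTION_CROSS", "intersection_exited"): "ROAD_FOLLOWING",
    ("ROAD_FOLLOWING", "lane_blocked"): "OBSTACLE_STOP",
    ("OBSTACLE_STOP", "lane_clear"): "ROAD_FOLLOWING",
}


def step(state, event):
    """Return the next state; events with no matching rule leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="ROAD_FOLLOWING"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state


final_state = run(["approach_intersection", "right_of_way_clear", "intersection_exited"])
```

Keeping the rules in one table makes it easy to check which traffic situation is active at any time; a planner could then select the applicable traffic rules from the current state.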
Linear models for control of cavity flow oscillations
Models for understanding and controlling oscillations in the flow past a rectangular cavity are developed. These models may be used to guide control designs, to understand performance limits of feedback, and to interpret experimental results. Traditionally, cavity oscillations are assumed to be self-sustained: no external disturbances are necessary to maintain the oscillations, and amplitudes are limited by nonlinearities. We present experimental data which suggests that in some regimes, the oscillations may not be self-sustained, but lightly damped: oscillations are sustained by external forcing, such as boundary-layer turbulence. In these regimes, linear models suffice to describe the behaviour, and the final amplitude of oscillations depends on the characteristics of the external disturbances. These linear models are particularly appropriate for describing cavities in which feedback has been used for noise suppression, as the oscillations are small and nonlinearities are less likely to be important. It is shown that increasing the gain too much in such feedback control experiments can lead to a peak-splitting phenomenon, which is explained by the linear models. Fundamental performance limits indicate that peak splitting is likely to occur for narrow-bandwidth actuators and controllers.
POD Based Models of Self-Sustained Oscillations in the Flow Past an Open Cavity
The goal of this work is to provide accurate dynamical models of oscillations in the flow past a rectangular cavity, for the purpose of bifurcation analysis and control. We have performed an extensive set of direct numerical simulations which provide the data used to derive and evaluate the models. Based on the method of Proper Orthogonal Decomposition (POD) and Galerkin projection, we obtain low-order models (from 6 to 60 states) which capture the dynamics very accurately over a few periods of oscillation, but deviate for long time.
Comparison of Rhizon Sampling and Whole Round Squeezing for Marine Sediment Porewater
The collection and chemical analysis of sedimentary porewater is central to many marine studies. Porewater alkalinity, dissolved inorganic carbon (DIC), sulfate, nitrate, and other dissolved ions are used to identify and determine rates of geochemical reactions and microbial respiration pathways, such as sulfate reduction and denitrification (Froelich et al., 1979; Berner, 1980; Gieskes et al., 1986; D’Hondt et al., 2004; Schulz, 2006; Martin and Sayles, 2007). Ammonium is critical for understanding microbial respiration and the nitrogen cycle (Blackburn, 1988). Chloride is used to reconstruct ocean salinity variations, constrain flow rates, and estimate gas hydrate concentrations (Paull et al., 1996; Adkins et al., 2002; Spivack et al., 2002). Each of these studies requires the recovery of porewater that is not compromised by sampling artifacts.