Verification for Machine Learning, Autonomy, and Neural Networks Survey
This survey presents an overview of verification techniques for autonomous
systems, with a focus on safety-critical autonomous cyber-physical systems
(CPS) and subcomponents thereof. Autonomy in CPS is enabled by recent advances
in artificial intelligence (AI) and machine learning (ML) through approaches
such as deep neural networks (DNNs), embedded in so-called learning-enabled
components (LECs) that accomplish tasks from classification to control.
Recently, the formal methods and formal verification community has developed
methods to characterize behaviors in these LECs with eventual goals of formally
verifying specifications for LECs, and this article presents a survey of many
of these recent approaches.
Safe exploration of nonlinear dynamical systems: A predictive safety filter for reinforcement learning
The transfer of reinforcement learning (RL) techniques into real-world
applications is challenged by safety requirements in the presence of physical
limitations. Most RL methods, in particular the most popular algorithms, do not
support explicit consideration of state and input constraints. In this paper,
we address this problem for nonlinear systems with continuous state and input
spaces by introducing a predictive safety filter, which is able to turn a
constrained dynamical system into an unconstrained safe system, to which any RL
algorithm can be applied 'out-of-the-box'. The predictive safety filter
receives the proposed learning input and decides, based on the current system
state, whether it can be safely applied to the real system or whether it must
be modified. Safety is thereby established by a continuously updated safety
policy, based on a model predictive control formulation that uses a
data-driven system model and accounts for state- and input-dependent
uncertainties.
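As a toy illustration of the filtering idea (not the paper's MPC formulation), the sketch below uses a hypothetical 1-D double integrator and a simple braking backup policy: the learning input is passed through only if the nominal model predicts the successor state remains recoverable.

```python
import numpy as np

# Hypothetical 1-D double-integrator model: state x = [position, velocity]
DT = 0.1
X_MAX, U_MAX = 1.0, 1.0   # position and input constraints

def step(x, u):
    pos, vel = x
    return np.array([pos + DT * vel, vel + DT * u])

def is_safe(x, horizon=20):
    """Check that full braking keeps |position| <= X_MAX over the horizon."""
    for _ in range(horizon):
        u_brake = -U_MAX * np.sign(x[1])   # terminal 'safety policy': brake
        x = step(x, u_brake)
        if abs(x[0]) > X_MAX:
            return False
    return True

def safety_filter(x, u_learn):
    """Apply the RL input if the successor state is still recoverable;
    otherwise fall back to the braking safety policy."""
    u_learn = np.clip(u_learn, -U_MAX, U_MAX)
    if is_safe(step(x, u_learn)):
        return u_learn
    return -U_MAX * np.sign(x[1])
```

Near the constraint boundary with velocity toward it, the filter overrides the learning input with the backup action; elsewhere the RL algorithm acts unmodified.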
Stability-certified reinforcement learning: A control-theoretic perspective
We investigate the important problem of certifying stability of reinforcement
learning policies when interconnected with nonlinear dynamical systems. We show
that by regulating the input-output gradients of policies, strong guarantees of
robust stability can be obtained based on a proposed semidefinite programming
feasibility problem. The method is able to certify a large set of stabilizing
controllers by exploiting problem-specific structures; furthermore, we analyze
and establish its (non)conservatism. Empirical evaluations on two decentralized
control tasks, namely multi-flight formation and power system frequency
regulation, demonstrate that the reinforcement learning agents can have high
performance within the stability-certified parameter space, and also exhibit
stable learning behaviors in the long run.
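The paper's actual certificate is a semidefinite program; as a much-simplified stand-in for the idea of regulating input-output gradients, the sketch below bounds a hypothetical tanh policy network's gradient by the product of its layers' spectral norms and checks a small-gain condition against an assumed plant gain (all numbers are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-hidden-layer tanh policy: u = W2 @ tanh(W1 @ x)
W1 = 0.05 * rng.standard_normal((16, 2))
W2 = 0.05 * rng.standard_normal((1, 16))

def policy_gradient_bound():
    """Upper bound on the policy's input-output gradient (Lipschitz
    constant): |tanh'| <= 1, so L <= ||W2||_2 * ||W1||_2 (spectral norms)."""
    return np.linalg.norm(W2, 2) * np.linalg.norm(W1, 2)

PLANT_GAIN = 2.0  # assumed L2 gain of the stable open-loop plant

def small_gain_certified():
    """Simplified small-gain test: the interconnection is robustly stable
    whenever (policy gain) * (plant gain) < 1."""
    return policy_gradient_bound() * PLANT_GAIN < 1.0
```

The SDP in the paper certifies a far larger set of policies than this crude norm product, but the sketch shows the mechanism: a verified gradient bound on the policy yields a stability certificate for the closed loop.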
Global Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems
This paper presents a novel method of global adaptive dynamic programming
(ADP) for the adaptive optimal control of nonlinear polynomial systems. The
strategy consists of relaxing the problem of solving the
Hamilton-Jacobi-Bellman (HJB) equation to an optimization problem, which is
solved via a new policy iteration method. The proposed method is distinguished
from previously known nonlinear ADP methods in that neural network
approximation is avoided, yielding a significant computational improvement.
Moreover, rather than being only semiglobally or locally stabilizing, the
resulting control policy is globally stabilizing for a general class of
nonlinear polynomial systems.
Furthermore, in the absence of the a priori knowledge of the system dynamics,
an online learning method is devised to implement the proposed policy iteration
technique by generalizing the current ADP theory. Finally, three numerical
examples are provided to validate the effectiveness of the proposed method.

Comment: This is an updated version of the publication "Global Adaptive
Dynamic Programming for Continuous-Time Nonlinear Systems," in IEEE
Transactions on Automatic Control, vol. 60, no. 11, pp. 2917-2929, Nov. 2015.
A few typos have been fixed in this version.
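The paper treats nonlinear polynomial systems; as a minimal illustration of the underlying policy-iteration loop, the sketch below runs Kleinman-style iteration on a scalar LQR problem (all constants hypothetical): evaluate the current gain via a Lyapunov equation, then improve it, converging to the Riccati solution.

```python
import math

# Scalar linear system dx/dt = a*x + b*u with cost J = integral(q*x^2 + r*u^2).
a, b, q, r = 1.0, 1.0, 1.0, 1.0

def policy_iteration(k, iters=20):
    """Policy iteration for scalar LQR starting from a stabilizing gain k."""
    for _ in range(iters):
        a_cl = a - b * k                     # closed-loop dynamics under u = -k*x
        assert a_cl < 0, "initial policy must be stabilizing"
        P = -(q + r * k * k) / (2.0 * a_cl)  # Lyapunov: 2*a_cl*P + q + r*k^2 = 0
        k = b * P / r                        # policy improvement
    return k

k_star = policy_iteration(k=2.0)
# For these constants the Riccati equation gives the optimal gain k* = 1 + sqrt(2)
```

The iteration is a Newton method on the Riccati equation, which is why convergence is fast; the paper's contribution is extending such iterations globally to polynomial systems via optimization rather than neural approximation.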
Convexity and monotonicity in nonlinear optimal control under uncertainty
We consider the problem of finite-horizon optimal control design under
uncertainty for imperfectly observed discrete-time systems with convex costs
and constraints. It is known that this problem can be cast as an
infinite-dimensional convex program when the dynamics and measurements are
linear, uncertainty is additive, and the risks associated with constraint
violations and excessive costs are measured in expectation or in the worst
case. In this paper, we extend this result to systems with convex or concave
dynamics, nonlinear measurements, more general uncertainty structures and other
coherent risk measures. In this setting, the optimal control problem can be
cast as an infinite-dimensional convex program if (1) the costs, constraints
and dynamics satisfy certain monotonicity properties, and (2) the measured
outputs can be reversibly 'purified' of the influence of the control inputs
through Q- or Youla-parameterization. The practical value of this result is
that the finite-dimensional subproblems arising in a variety of suboptimal
control methods, notably including model predictive control and the Q-design
procedure, are also convex for this class of nonlinear systems. Subproblems can
therefore be solved to global optimality using convenient modeling software and
efficient, reliable solvers. We illustrate these ideas in a numerical example.
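The monotonicity condition mirrors the classical composition rule for convex functions, which is what keeps the multi-step dynamics convex in the inputs:

```latex
\[
  h(x) \;=\; g\bigl(f_1(x),\dots,f_m(x)\bigr)
\]
% h is convex whenever g is convex and nondecreasing in each argument
% and each f_i is convex. Applied recursively to the dynamics
% x_{t+1} = f(x_t, u_t) with f convex and monotone in x_t, every state
% x_T is a convex function of the input sequence, so convex stage costs
% and constraints yield a convex program.
```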
Cautious Model Predictive Control using Gaussian Process Regression
Gaussian process (GP) regression has been widely used in supervised machine
learning due to its flexibility and inherent ability to describe uncertainty in
function estimation. In the context of control, it is seeing increasing use for
modeling of nonlinear dynamical systems from data, as it allows the direct
assessment of residual model uncertainty. We present a model predictive control
(MPC) approach that integrates a nominal system with an additive nonlinear part
of the dynamics modeled as a GP. Approximation techniques for propagating the
state distribution are reviewed and we describe a principled way of formulating
the chance constrained MPC problem, which takes into account residual
uncertainties provided by the GP model to enable cautious control. Using
additional approximations for efficient computation, we finally demonstrate the
approach in a simulation example, as well as in a hardware implementation for
autonomous racing of remote-controlled race cars, highlighting improvements
with regard to both performance and safety over a nominal controller.

Comment: Published in IEEE Transactions on Control Systems Technology.
ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions
Sample-based learning model predictive control (LMPC) strategies have
recently attracted attention due to their desirable theoretical properties and
their good empirical performance on robotic tasks. However, prior analysis of
LMPC controllers for stochastic systems has mainly focused on linear systems in
the iterative learning control setting. We present a novel LMPC algorithm,
Adjustable Boundary Condition LMPC (ABC-LMPC), which enables rapid adaptation
to novel start and goal configurations and theoretically show that the
resulting controller guarantees iterative improvement in expectation for
stochastic nonlinear systems. We present results with a practical instantiation
of this algorithm and experimentally demonstrate that the resulting controller
adapts to a variety of initial and terminal conditions on 3 stochastic
continuous control tasks.

Comment: Workshop on the Algorithmic Foundations of Robotics (WAFR) 2020.
First two authors contributed equally.
On the Sample Complexity of the Linear Quadratic Regulator
This paper addresses the optimal control problem known as the Linear
Quadratic Regulator in the case when the dynamics are unknown. We propose a
multi-stage procedure, called Coarse-ID control, that estimates a model from a
few experimental trials, estimates the error in that model with respect to the
truth, and then designs a controller using both the model and uncertainty
estimate. Our technique uses contemporary tools from random matrix theory to
bound the error in the estimation procedure. We also employ a recently
developed approach to control synthesis called System Level Synthesis that
enables robust control design by solving a convex optimization problem. We
provide end-to-end bounds on the relative error in control cost that are nearly
optimal in the number of parameters and that highlight salient properties of
the system to be controlled such as closed-loop sensitivity and optimal control
magnitude. We show experimentally that the Coarse-ID approach enables efficient
computation of a stabilizing controller in regimes where simple control schemes
that do not take the model uncertainty into account fail to stabilize the true
system.

Comment: Contains a new analysis of finite-dimensional truncation, a new
data-dependent estimation bound, and an expanded exposition on necessary
background in control theory and System Level Synthesis.
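The first stage of the procedure, estimating a model from a few trials, can be sketched with ordinary least squares on a hypothetical two-state system (the paper additionally bounds the estimation error and feeds it into a robust synthesis step, which is omitted here).

```python
import numpy as np

rng = np.random.default_rng(1)

# True (unknown) system: x_{t+1} = A x_t + B u_t + w_t
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])

def estimate_from_trials(n_trials=200, noise=0.01):
    """Least-squares estimate of [A | B] from independent one-step
    excitation trials, a simplified stand-in for the Coarse-ID stage."""
    Z, Xn = [], []
    for _ in range(n_trials):
        x = rng.standard_normal(2)
        u = rng.standard_normal(1)
        x_next = A @ x + B @ u + noise * rng.standard_normal(2)
        Z.append(np.concatenate([x, u]))
        Xn.append(x_next)
    Theta, *_ = np.linalg.lstsq(np.array(Z), np.array(Xn), rcond=None)
    return Theta.T  # shape (2, 3): [A_hat | B_hat]

Theta = estimate_from_trials()
A_hat, B_hat = Theta[:, :2], Theta[:, 2:]
```

With excitation noise this small, the estimate is already close to the truth; the paper's contribution is quantifying exactly how close, so the controller can be synthesized robustly against the residual error.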
Bayesian Optimization with Safety Constraints: Safe and Automatic Parameter Tuning in Robotics
Robotic algorithms typically depend on various parameters, the choice of
which significantly affects the robot's performance. While an initial guess for
the parameters may be obtained from dynamic models of the robot, parameters are
usually tuned manually on the real system to achieve the best performance.
Optimization algorithms, such as Bayesian optimization, have been used to
automate this process. However, these methods may evaluate unsafe parameters
during the optimization process that lead to safety-critical system failures.
Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been
developed, which guarantees that the performance of the system never falls
below a critical value; that is, safety is defined based on the performance
function. However, coupling performance and safety is often not desirable in
robotics. For example, high-gain controllers might achieve low average tracking
error (performance), but can overshoot and violate input constraints. In this
paper, we present a generalized algorithm that allows for multiple safety
constraints separate from the objective. Given an initial set of safe
parameters, the algorithm maximizes performance but only evaluates parameters
that satisfy safety for all constraints with high probability. To this end, it
carefully explores the parameter space by exploiting regularity assumptions in
terms of a Gaussian process prior. Moreover, we show how context variables can
be used to safely transfer knowledge to new situations and tasks. We provide a
theoretical analysis and demonstrate that the proposed algorithm enables fast,
automatic, and safe optimization of tuning parameters in experiments on a
quadrotor vehicle.
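A discretized toy version of the safe-expansion loop (the paper uses GP confidence bounds; this sketch substitutes the simpler Lipschitz argument, with a made-up performance function and safety constraint): a parameter is evaluated only if some already-evaluated safe parameter certifies it.

```python
import numpy as np

# Hypothetical tuning problem: maximize performance f over gain k while
# keeping a separate safety constraint g(k) >= 0, never evaluating unsafe k.
f = lambda k: -(k - 0.6) ** 2        # performance (peak at k = 0.6)
g = lambda k: 1.0 - k                # safety: k <= 1 is safe
LIPSCHITZ = 2.0                      # assumed bound on |g'|

grid = np.linspace(0.0, 2.0, 41)
safe = {0.0}                         # initial set of known-safe parameters

def certified_safe(k, observed):
    """k is provably safe if some evaluated k0 satisfies
    g(k0) - L * |k - k0| >= 0 (Lipschitz continuity argument)."""
    return any(g(k0) - LIPSCHITZ * abs(k - k0) >= 0 for k0 in observed)

for _ in range(30):
    cand = [k for k in grid if certified_safe(k, safe)]
    k_next = max(cand, key=f)        # greedy stand-in for the GP acquisition
    safe.add(k_next)                 # evaluating k_next is guaranteed safe

best = max(safe, key=f)
```

Every evaluated parameter satisfies the constraint by construction, and the certified region expands as new safe evaluations accumulate, which is the core mechanism behind the multi-constraint SafeOpt generalization.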
A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems
The proven efficacy of learning-based control schemes strongly motivates
their application to robotic systems operating in the physical world. However,
guaranteeing correct operation during the learning process is currently an
unresolved issue, which is of vital importance in safety-critical systems. We
propose a general safety framework based on Hamilton-Jacobi reachability
methods that can work in conjunction with an arbitrary learning algorithm. The
method exploits approximate knowledge of the system dynamics to guarantee
constraint satisfaction while minimally interfering with the learning process.
We further introduce a Bayesian mechanism that refines the safety analysis as
the system acquires new evidence, reducing initial conservativeness when
appropriate while strengthening guarantees through real-time validation. The
result is a least-restrictive, safety-preserving control law that intervenes
only when (a) the computed safety guarantees require it, or (b) confidence in
the computed guarantees decays in light of new observations. We prove
theoretical safety guarantees combining probabilistic and worst-case analysis
and demonstrate the proposed framework experimentally on a quadrotor vehicle.
Even though safety analysis is based on a simple point-mass model, the
quadrotor successfully arrives at a suitable controller by policy-gradient
reinforcement learning without ever crashing, and safely retracts away from a
strong external disturbance introduced during flight.

Comment: Accepted for publication in IEEE Transactions on Automatic Control.
Video with experiments: https://youtu.be/WAAxyeSk2b
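For a 1-D kinematic point mass the Hamilton-Jacobi safety value function is available in closed form, so the least-restrictive structure described above can be sketched directly (the state, constraint, and margin below are all hypothetical): the learning controller acts freely except where the safety margin is nearly exhausted.

```python
# Point mass with position p, velocity v, input |u| <= 1, keep-out wall at p = 1.
# Analytic HJ value: the state is recoverable iff full braking can stop
# the mass before the wall, i.e. p + max(v, 0)^2 / 2 <= 1.

def value(p, v):
    """Signed safety margin; >= 0 means the state is recoverable."""
    return 1.0 - p - max(v, 0.0) ** 2 / 2.0

def filtered_control(p, v, u_learn, margin=0.05):
    """Least-restrictive filter: pass the learning input through unless
    the margin is nearly exhausted, then apply the optimal safe action."""
    if value(p, v) > margin:
        return max(-1.0, min(1.0, u_learn))
    return -1.0  # brake: the optimal control of the HJ safety problem
```

In the paper the value function is computed numerically for the quadrotor's model and the margin is adapted online from Bayesian evidence; the interface, however, is exactly this: intervene only on the boundary of the computed safe set.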