Safe Exploration for Optimization with Gaussian Processes
We consider sequential decision problems under uncertainty, where we seek to optimize an unknown function from noisy samples. This requires balancing exploration (learning about the objective) and exploitation (localizing the maximum), a problem well-studied in the multi-armed bandit literature. In many applications, however, we require that the sampled function values exceed some prespecified "safety" threshold, a requirement that existing algorithms fail to meet. Examples include medical applications where patient comfort must be guaranteed, recommender systems aiming to avoid user dissatisfaction, and robotic control, where one seeks to avoid controls causing physical harm to the platform. We tackle this novel, yet rich, set of problems under the assumption that the unknown function satisfies regularity conditions expressed via a Gaussian process prior. We develop an efficient algorithm called SafeOpt, and theoretically guarantee its convergence to a natural notion of optimum reachable under safety constraints. We evaluate SafeOpt on synthetic data, as well as two real applications: movie recommendation and therapeutic spinal cord stimulation.
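The safety requirement described above can be illustrated with a minimal sketch. This is not the SafeOpt algorithm itself: it is a simplified safe upper-confidence-bound step, assuming a zero-mean Gaussian process with an RBF kernel. The function names (`rbf_kernel`, `gp_posterior`, `safe_ucb_step`) and all parameter values (`lengthscale`, `noise`, `beta`) are hypothetical choices made for illustration. A candidate counts as safe only when its lower confidence bound clears the threshold, and the next sample is the safe candidate with the highest upper confidence bound.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    # The prior variance is 1.0 for the default kernel settings above.
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, np.sqrt(var)

def safe_ucb_step(x_train, y_train, candidates, threshold, beta=2.0):
    """Pick the next sample among candidates whose lower bound is safe."""
    mean, std = gp_posterior(x_train, y_train, candidates)
    lower = mean - beta * std
    upper = mean + beta * std
    safe = lower >= threshold          # certified safe under the GP model
    if not safe.any():
        return None, safe
    idx = np.flatnonzero(safe)
    best = idx[np.argmax(upper[idx])]  # most promising safe candidate
    return candidates[best], safe

# Usage: one known-safe seed observation at x = 0 with value 1.
x_train = np.array([0.0])
y_train = np.array([1.0])
candidates = np.linspace(-3.0, 3.0, 61)
x_next, safe = safe_ucb_step(x_train, y_train, candidates, threshold=0.0)
```

Starting from a single safe seed, the safe set is initially a small neighbourhood of that seed and can only grow as new observations shrink the confidence bounds nearby. The full algorithm in the paper additionally reasons about which points can expand the safe set, which this sketch omits.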
Safe Learning of Quadrotor Dynamics Using Barrier Certificates
To effectively control complex dynamical systems, accurate nonlinear models are typically needed. However, these models are not always known. In this paper, we present a data-driven approach based on Gaussian processes that learns models of quadrotors operating in partially unknown environments. What makes this challenging is that if the learning process is not carefully controlled, the system can become unstable, i.e., the quadrotor will crash. To prevent this, barrier certificates are employed for safe learning. The barrier certificates establish a non-conservative forward invariant safe region, in which high-probability safety guarantees are provided based on the statistics of the Gaussian process. A learning controller is designed to efficiently explore uncertain states and expand the barrier-certified safe region based on an adaptive sampling scheme. In addition, a recursive Gaussian process prediction method is developed to learn the complex quadrotor dynamics in real time. Simulation results are provided to demonstrate the effectiveness of the proposed approach.
Comment: Submitted to ICRA 2018, 8 pages
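As a rough illustration of how a barrier condition can be combined with Gaussian process statistics, the sketch below checks a robust barrier inequality for a toy scalar system. Everything here is an assumption made for illustration and is not taken from the paper: the dynamics x' = f(x) + u, the barrier h(x) = 1 - x^2, the confidence multiplier `k`, the rate `gamma`, and the function name `barrier_condition_holds`. The unknown term f(x) is assumed to lie in the interval [mu - k*sigma, mu + k*sigma], where mu and sigma stand in for a GP posterior mean and standard deviation.

```python
def barrier_condition_holds(x, u, mu, sigma, k=2.0, gamma=1.0):
    """Check h'(x) >= -gamma * h(x) for the worst case of f in its interval.

    Toy setting (hypothetical): x' = f(x) + u with barrier h(x) = 1 - x**2,
    so h' = -2 * x * (f(x) + u). The worst case over
    f in [mu - k*sigma, mu + k*sigma] subtracts 2 * |x| * k * sigma.
    """
    h = 1.0 - x ** 2
    nominal = -2.0 * x * (mu + u)            # h' using the mean model
    uncertainty = 2.0 * abs(x) * k * sigma   # worst-case model error
    return nominal - uncertainty + gamma * h >= 0.0

# Usage: a control pushing back toward the origin passes the check,
# while a control pushing outward near the boundary does not.
ok_inward = barrier_condition_holds(x=0.5, u=-1.0, mu=0.0, sigma=0.1)
ok_outward = barrier_condition_holds(x=0.9, u=2.0, mu=0.0, sigma=0.1)
```

Under these assumptions, a larger sigma makes the check harder to pass, so collecting data that reduces the model uncertainty enlarges the set of states and controls that can be certified.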