Safe Learning of Quadrotor Dynamics Using Barrier Certificates
To effectively control complex dynamical systems, accurate nonlinear models
are typically needed. However, these models are not always known. In this
paper, we present a data-driven approach based on Gaussian processes that
learns models of quadrotors operating in partially unknown environments. What
makes this challenging is that, if the learning process is not carefully
controlled, the system will become unstable, i.e., the quadrotor will crash. To
this end, barrier certificates are employed for safe learning. The barrier
certificates establish a non-conservative, forward-invariant safe region in
which high-probability safety guarantees are provided based on the statistics
of the Gaussian process. A learning controller is designed to efficiently
explore those uncertain states and expand the barrier certified safe region
based on an adaptive sampling scheme. In addition, a recursive Gaussian
process prediction method is developed to learn the complex quadrotor dynamics
in real time. Simulation results are provided to demonstrate the effectiveness of
the proposed approach.
Comment: Submitted to ICRA 2018, 8 pages
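The safe-learning idea above can be sketched numerically: a Gaussian process supplies a mean and confidence bound on the unknown dynamics error, and a state is only certified safe if the barrier value stays nonnegative under the worst-case error at that confidence level. This is a minimal illustrative sketch, not the paper's implementation: the RBF kernel, the norm-ball barrier h(x) = r² − |x|², and all function names and parameters are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length=0.5, var=1.0):
    """Squared-exponential kernel between the rows of A and B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_predict(X, y, Xs, noise=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at query points Xs."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xs, X)
    Kinv = np.linalg.inv(K)
    mu = Ks @ Kinv @ y
    var = rbf_kernel(Xs, Xs) - Ks @ Kinv @ Ks.T
    return mu, np.sqrt(np.clip(np.diag(var), 0.0, None))

def barrier_safe(x, mu_err, sigma_err, kappa=2.0, radius=1.0):
    """Certify x safe if the (hypothetical) barrier h(x) = radius^2 - |x|^2
    remains nonnegative under the kappa-sigma worst-case GP model error."""
    h = radius ** 2 - float(x @ x)
    return h - (abs(mu_err) + kappa * sigma_err) >= 0.0
```

As more data is collected, the GP's posterior standard deviation shrinks, the worst-case error margin tightens, and the certified safe region expands, which mirrors the exploration-and-expansion loop the abstract describes.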
Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from a trade-off
between model complexity and the computational burden of solving expensive
optimization or search problems online at every short sampling time. To
circumvent this trade-off, a two-step procedure is motivated: first, a
controller is learned during offline training from an arbitrarily complicated
mathematical system model; then, the trained controller is evaluated online in
a fast feedforward pass. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.
Comment: 10 pages, 6 figures, 1 table
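The core loop of gradient-free hill climbing with task separation can be sketched as follows: perturb the controller parameters, evaluate the sparse reward on each task separately, and keep the perturbation only if the summed reward improves. This is a toy sketch under strong assumptions, not the paper's TSHC algorithm: the 1-D integrator plant, the single-parameter linear policy, and the function names are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(theta, setpoint, steps=30):
    """Linear policy u = -theta * (x - setpoint) on a toy 1-D integrator;
    the reward is maximally sparse: 1 only if the setpoint is reached."""
    x = 0.0
    for _ in range(steps):
        u = -theta * (x - setpoint)
        x += 0.1 * u                 # Euler step of x_dot = u
        if abs(x - setpoint) < 0.05:
            return 1.0               # sparse success signal
    return 0.0

def tshc(tasks, iters=200, sigma=1.0):
    """Hill climbing with task separation: perturb the parameter and keep
    the perturbation only if the reward summed over all tasks improves."""
    theta = 0.0
    best = sum(rollout(theta, s) for s in tasks)
    for _ in range(iters):
        cand = theta + sigma * rng.standard_normal()
        score = sum(rollout(cand, s) for s in tasks)
        if score > best:
            theta, best = cand, score
    return theta, best
```

For example, `tshc([1.0, -1.0])` trains a single parameter that succeeds on two separate setpoint-reaching tasks at once, a miniature analogue of encoding many motion primitives in one network.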
Tools for Nonlinear Control Systems Design
This is a brief statement of the research progress made on Grant NAG2-243, titled "Tools for Nonlinear Control Systems Design", which ran from 1983 until December 1996. The initial set of PIs on the grant were C. A. Desoer, E. L. Polak and myself (for 1983). From 1984 until 1991 Desoer and I were the PIs, and finally I was the sole PI from 1991 until the end of 1996. The project has been an unusually long-standing and extremely fruitful partnership, with many technical exchanges, visits, workshops and new avenues of investigation begun on this grant. There were student visits, long-term visitors on the grant and many interesting joint projects. In this final report I give only a cursory description of the technical work done on the grant, since there was a tradition of annual progress reports and a proposal for the succeeding year. These progress reports cum proposals are attached as Appendix A to this report. Appendix B consists of papers with me and my students as co-authors, sorted chronologically; when there are multiple related versions of a paper, such as a conference version and a journal version, they are listed together. Appendix C consists of papers by Desoer and his students, as well as 'solo' publications by other researchers supported on this grant, similarly sorted chronologically.
- …