Search CORE

37,044 research outputs found

Difference of Convex Functions Programming Applied to Control with Expert Data

Author: Geist Matthieu
Pietquin Olivier
Piot Bilal
Publication venue
Publication date: 05/09/2016
Field of study

This paper reports applications of Difference of Convex functions (DC) programming to Learning from Demonstrations (LfD) and Reinforcement Learning (RL) with expert data. This is made possible because the norm of the Optimal Bellman Residual (OBR), which is at the heart of many RL and LfD algorithms, is DC. Improvement in performance is demonstrated on two specific algorithms, namely Reward-regularized Classification for Apprenticeship Learning (RCAL) and Reinforcement Learning with Expert Demonstrations (RLED), through experiments on generic Markov Decision Processes (MDP), called Garnets

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Inverse Optimal Planning for Air Traffic Control

Author: Kapoor Ashish
Kumar Vijay
Ribeiro Alejandro
Tolstaya Ekaterina
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/03/2019
Field of study

We envision a system that concisely describes the rules of air traffic control, assists human operators and supports dense autonomous air traffic around commercial airports. We develop a method to learn the rules of air traffic control from real data as a cost function via maximum entropy inverse reinforcement learning. This cost function is used as a penalty for a search-based motion planning method that discretizes both the control and the state space. We illustrate the methodology by showing that our approach can learn to imitate the airport arrival routes and separation rules of dense commercial air traffic. The resulting trajectories are shown to be safe, feasible, and efficient

arXiv.org e-Print Archive

Crossref