Difference of Convex Functions Programming Applied to Control with Expert Data
This paper reports applications of Difference of Convex functions (DC)
programming to Learning from Demonstrations (LfD) and Reinforcement Learning
(RL) with expert data. This is made possible because the norm of the Optimal
Bellman Residual (OBR), which is at the heart of many RL and LfD algorithms, is
DC. Improvement in performance is demonstrated on two specific algorithms,
namely Reward-regularized Classification for Apprenticeship Learning (RCAL) and
Reinforcement Learning with Expert Demonstrations (RLED), through experiments
on generic Markov Decision Processes (MDPs), called Garnets.
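
To make the quantity named in this abstract concrete, the following is a minimal sketch of the optimal Bellman residual T*Q - Q on a small random MDP. It is not the paper's DC algorithm, and the generator is only a simplified Garnet-style construction: the names `make_garnet`, `bellman_optimal`, and `obr_norm`, the uniform rewards, and the choice of the L1 norm are all assumptions made for illustration. The max over next actions is what makes T*Q convex in Q.

```python
import random

def make_garnet(n_states, n_actions, branching, seed=0):
    """Simplified Garnet-style random MDP (hypothetical helper).

    Each (state, action) pair reaches `branching` distinct next states.
    """
    rng = random.Random(seed)
    P = {}  # (s, a) -> list of (next_state, probability)
    R = {}  # (s, a) -> reward in [0, 1)
    for s in range(n_states):
        for a in range(n_actions):
            nxt = rng.sample(range(n_states), branching)
            w = [rng.random() + 1e-9 for _ in nxt]
            total = sum(w)
            P[s, a] = [(s2, wi / total) for s2, wi in zip(nxt, w)]
            R[s, a] = rng.random()
    return P, R

def bellman_optimal(Q, P, R, gamma, n_actions):
    """(T*Q)(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * max_a' Q(s',a')."""
    return {
        (s, a): R[s, a] + gamma * sum(
            p * max(Q[s2, b] for b in range(n_actions)) for s2, p in P[s, a]
        )
        for (s, a) in Q
    }

def obr_norm(Q, P, R, gamma, n_actions):
    """L1 norm of the optimal Bellman residual T*Q - Q."""
    TQ = bellman_optimal(Q, P, R, gamma, n_actions)
    return sum(abs(TQ[k] - Q[k]) for k in Q)

if __name__ == "__main__":
    P, R = make_garnet(n_states=5, n_actions=2, branching=2, seed=1)
    Q = {k: 0.0 for k in P}
    print("residual at Q = 0:", obr_norm(Q, P, R, 0.9, 2))
    for _ in range(500):  # value iteration, used here only as a sanity check
        Q = bellman_optimal(Q, P, R, 0.9, 2)
    print("residual after value iteration:", obr_norm(Q, P, R, 0.9, 2))
```

The residual vanishes exactly at the optimal Q-function, which is why its norm can serve as an objective to minimize. Plain value iteration is used here only to confirm that property.
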
Inverse Optimal Planning for Air Traffic Control
We envision a system that concisely describes the rules of air traffic
control, assists human operators and supports dense autonomous air traffic
around commercial airports. We develop a method to learn the rules of air
traffic control from real data as a cost function via maximum entropy inverse
reinforcement learning. This cost function is used as a penalty for a
search-based motion planning method that discretizes both the control and the
state space. We illustrate the methodology by showing that our approach can
learn to imitate the airport arrival routes and separation rules of dense
commercial air traffic. The resulting trajectories are shown to be safe,
feasible, and efficient.
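
As a rough illustration of using a cost function as a penalty inside a search-based planner over a discretized state and control space, the sketch below runs Dijkstra's algorithm on a small grid. It is a stand-in, not the paper's planner: the grid, the four unit moves, the `plan` function, and the hand-written penalty table (which takes the place of a cost learned by maximum entropy inverse reinforcement learning) are all assumptions made for illustration.

```python
import heapq

def plan(grid_penalty, start, goal, step_cost=1.0):
    """Dijkstra over a 4-connected grid (hypothetical helper).

    Entering a cell costs `step_cost` plus that cell's penalty.
    Returns (path, total_cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(grid_penalty), len(grid_penalty[0])
    dist = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        d, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale queue entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # discretized controls
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + step_cost + grid_penalty[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (nd, (nr, nc)))
    if goal not in dist:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], dist[goal]

if __name__ == "__main__":
    # Illustrative penalty table: the center cell is expensive to enter.
    penalty = [[0, 0, 0],
               [0, 10, 0],
               [0, 0, 0]]
    path, cost = plan(penalty, start=(1, 0), goal=(1, 2))
    print(path, cost)
```

With the penalty in place, the planner routes around the center cell instead of taking the shorter path through it. This is the sense in which a penalty term shapes the resulting trajectories.
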