Driving with Style: Inverse Reinforcement Learning in General-Purpose Planning for Automated Driving
Behavior and motion planning play an important role in automated driving.
Traditionally, behavior planners instruct local motion planners with predefined
behaviors. Due to the high scene complexity in urban environments,
unpredictable situations may occur in which behavior planners fail to match
predefined behavior templates. Recently, general-purpose planners have been
introduced, combining behavior and local motion planning. These general-purpose
planners allow behavior-aware motion planning given a single reward function.
However, two challenges arise: First, this function has to map a complex
feature space into rewards. Second, the reward function has to be manually
tuned by an expert. Manually tuning this reward function becomes a tedious
task. In this paper, we propose an approach that relies on human driving
demonstrations to automatically tune reward functions. This study offers
important insights into the driving style optimization of general-purpose
planners with maximum entropy inverse reinforcement learning. We evaluate our
approach based on the expected value difference between learned and
demonstrated policies. Furthermore, we compare the similarity of human-driven
trajectories with optimal policies of our planner under learned and
expert-tuned reward functions. Our experiments show that we are able to learn
reward functions exceeding the level of manual expert tuning without prior
domain knowledge.
Comment: Appeared at IROS 2019. Accepted version. Added/updated footnote,
minor correction in preliminaries.
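As a rough illustration of the maximum entropy inverse reinforcement learning
idea named in the abstract above, the sketch below fits the weights of a
linear reward over a small, fixed set of candidate trajectories. It is not the
paper's implementation: the linear reward form, the finite candidate set, the
function names, and the learning-rate and step-count defaults are all
assumptions made for this example.

```python
import numpy as np

def expected_features(candidate_features, theta):
    """Feature expectation under P(trajectory) proportional to exp(reward)."""
    feats = np.asarray(candidate_features, dtype=float)   # shape (N, K)
    rewards = feats @ theta
    probs = np.exp(rewards - rewards.max())               # subtract max for stability
    probs /= probs.sum()
    return probs @ feats

def maxent_irl(candidate_features, demo_features, lr=0.5, steps=1000):
    """Gradient ascent on the demonstration log-likelihood.

    Assumes a linear reward r(trajectory) = theta . features(trajectory)
    and a finite set of candidate trajectories (both hypothetical choices).
    """
    feats = np.asarray(candidate_features, dtype=float)
    demo_mean = np.asarray(demo_features, dtype=float).mean(axis=0)
    theta = np.zeros(feats.shape[1])
    for _ in range(steps):
        # Gradient: demonstrated feature mean minus model feature expectation
        theta += lr * (demo_mean - expected_features(feats, theta))
    return theta
```

With candidates `[[1, 0], [0, 1], [0.5, 0.5]]` and a single demonstration
`[0.8, 0.2]`, the learned weights favor the first feature, and the model's
feature expectation moves toward the demonstrated one.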
Inverse Optimal Planning for Air Traffic Control
We envision a system that concisely describes the rules of air traffic
control, assists human operators and supports dense autonomous air traffic
around commercial airports. We develop a method to learn the rules of air
traffic control from real data as a cost function via maximum entropy inverse
reinforcement learning. This cost function is used as a penalty for a
search-based motion planning method that discretizes both the control and the
state space. We illustrate the methodology by showing that our approach can
learn to imitate the airport arrival routes and separation rules of dense
commercial air traffic. The resulting trajectories are shown to be safe,
feasible, and efficient.
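To make concrete how a cost function can act as a penalty inside a
search-based planner over a discretized state and control space, here is a
minimal sketch. It assumes a 2D grid as the state space, four unit moves as
the controls, and Dijkstra's algorithm as the search; the grid, the unit step
cost, and the function name `plan` are hypothetical and not taken from the
paper.

```python
import heapq

def plan(cost_map, start, goal):
    """Lowest-cost path on a grid; each step costs 1 plus the cell penalty.

    cost_map[r][c] stands in for a learned penalty (assumed non-negative).
    Returns (path, total_cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(cost_map), len(cost_map[0])
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # discretized controls
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:                # walk back to the start
                cell = parent[cell]
                path.append(cell)
            return path[::-1], cost
        if cost > best.get(cell, float("inf")):
            continue                             # stale queue entry
        for dr, dc in moves:
            nxt = (cell[0] + dr, cell[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols:
                new_cost = cost + 1.0 + cost_map[nxt[0]][nxt[1]]
                if new_cost < best.get(nxt, float("inf")):
                    best[nxt] = new_cost
                    parent[nxt] = cell
                    heapq.heappush(frontier, (new_cost, nxt))
    return None, float("inf")
```

On a map where one column carries a high penalty, the planner routes around
that column even though the direct path has fewer steps.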