Bayesian Nonparametric Feature and Policy Learning for Decision-Making
Learning from demonstrations has gained increasing interest in the recent
past, enabling an agent to learn how to make decisions by observing an
experienced teacher. While many approaches have been proposed to solve this
problem, little work focuses on reasoning about the observed
behavior. We assume that, in many practical problems, an agent makes its
decision based on latent features, indicating a certain action. Therefore, we
propose a generative model for the states and actions. Inference reveals the
number of features, the features, and the policies, allowing us to learn and to
analyze the underlying structure of the observed behavior. Further, our
approach enables prediction of actions for new states. Simulations are used to
assess the performance of the algorithm based upon this model. Moreover, the
problem of learning a driver's behavior is investigated, demonstrating the
performance of the proposed model in a real-world scenario.
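The generative view described in this abstract can be illustrated with a minimal sketch. The abstract does not state which priors are used, so everything here is an assumption made for illustration: a Chinese restaurant process over latent features (one common Bayesian nonparametric prior), a one-dimensional Gaussian state distribution per feature, and a Dirichlet-distributed policy per feature. The function name sample_demonstrations and all parameters are hypothetical.

```python
import random

def sample_demonstrations(n, alpha=1.0, n_actions=3, seed=0):
    """Sample (state, action, feature) triples from an assumed generative model."""
    rng = random.Random(seed)
    counts = []    # number of observations assigned to each latent feature
    centers = []   # per-feature state mean (1-D for brevity)
    policies = []  # per-feature action probabilities
    data = []
    for _ in range(n):
        # Chinese restaurant process: pick an existing feature with weight
        # proportional to its count, or open a new one with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)
            centers.append(rng.gauss(0.0, 5.0))
            # Normalised Gamma(1, 1) draws give a Dirichlet(1, ..., 1) sample.
            raw = [rng.gammavariate(1.0, 1.0) for _ in range(n_actions)]
            total = sum(raw)
            policies.append([r / total for r in raw])
        counts[k] += 1
        state = rng.gauss(centers[k], 1.0)  # state is generated near its feature
        action = rng.choices(range(n_actions), weights=policies[k])[0]
        data.append((state, action, k))
    return data, policies
```

Inference would run in the opposite direction: given only the (state, action) pairs, recover the number of features, the features themselves, and the per-feature policies. That step is not sketched here.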
Integration of Reinforcement Learning Based Behavior Planning With Sampling Based Motion Planning for Automated Driving
Reinforcement learning has received considerable research interest for developing
planning approaches in automated driving. Most prior works consider the
end-to-end planning task that yields direct control commands and rarely deploy
their algorithms to real vehicles. In this work, we propose a method to employ a
trained deep reinforcement learning policy for dedicated high-level behavior
planning. By populating an abstract objective interface, established motion
planning algorithms can be leveraged, which derive smooth and drivable
trajectories. Given the current environment model, we propose to use a built-in
simulator to predict the traffic scene for a given horizon into the future. The
behavior of automated vehicles in mixed traffic is determined by querying the
learned policy. To the best of our knowledge, this work is the first to apply
deep reinforcement learning in this manner, and as such lacks a
state-of-the-art benchmark. Thus, we validate the proposed approach by
comparing an idealistic single-shot plan with cyclic replanning through the
learned policy. Experiments with a real testing vehicle on proving grounds
demonstrate the potential of our approach to shrink the simulation-to-real-world
gap of deep reinforcement learning based planning approaches. Additional
simulative analyses reveal that more complex multi-agent maneuvers can be
managed by employing the cyclic replanning approach.
Comment: 8 pages, 10 figures, to be published in 34th IEEE Intelligent
Vehicles Symposium (IV
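The difference between a single-shot plan and cyclic replanning, as contrasted in this abstract, can be sketched with a toy car-following example. This is not the authors' implementation: the learned policy is replaced by a hand-written rule, the built-in simulator by a constant-velocity rollout, and the motion planner by a target-speed lookup. All names (Scene, predict_gap, policy, motion_plan, drive) and all numbers are hypothetical.

```python
from dataclasses import dataclass

CRUISE_SPEED = 15.0  # m/s, assumed desired speed of the ego vehicle

@dataclass
class Scene:
    gap: float         # distance to the lead vehicle in metres
    lead_speed: float  # speed of the lead vehicle in m/s

def predict_gap(scene, horizon_s):
    """Stand-in simulator: constant-velocity rollout, ego assumed to cruise."""
    return scene.gap + (scene.lead_speed - CRUISE_SPEED) * horizon_s

def policy(predicted_gap, min_gap=10.0):
    """Stand-in for the learned policy: a fixed rule, not a trained network."""
    return "follow" if predicted_gap < min_gap else "cruise"

def motion_plan(scene, behavior):
    """Stand-in motion planner: turn the behavior into a target speed."""
    if behavior == "follow":
        return min(scene.lead_speed, CRUISE_SPEED)
    return CRUISE_SPEED

def drive(scene, steps, horizon_s=3.0, dt=1.0, replan=True):
    # The single-shot variant queries the policy once and never again.
    behavior = policy(predict_gap(scene, horizon_s))
    log = []
    for _ in range(steps):
        if replan:
            # Cyclic replanning: predict and query the policy every cycle.
            behavior = policy(predict_gap(scene, horizon_s))
        ego_speed = motion_plan(scene, behavior)
        scene = Scene(scene.gap + (scene.lead_speed - ego_speed) * dt,
                      scene.lead_speed)
        log.append(behavior)
    return scene, log
```

With a slower lead vehicle 40 m ahead, drive(Scene(40.0, 10.0), 10) switches to "follow" once the predicted gap drops below the threshold, whereas replan=False keeps the initial "cruise" decision and lets the gap go negative in this toy setting.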