Imitating Driver Behavior with Generative Adversarial Networks
The ability to accurately predict and simulate human driving behavior is
critical for the development of intelligent transportation systems. Traditional
modeling methods have employed simple parametric models and behavioral cloning.
This paper adopts a method for overcoming the problem of cascading errors
inherent in prior approaches, resulting in realistic behavior that is robust to
trajectory perturbations. We extend Generative Adversarial Imitation Learning
to the training of recurrent policies, and we demonstrate that our model
outperforms rule-based controllers and maximum likelihood models in realistic
highway simulations. Our model reproduces emergent behavior of human
drivers, such as lane-change rate, while maintaining realistic control over
long time horizons.
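
To make the idea concrete, the following is a minimal PyTorch sketch of GAIL with a recurrent (GRU) policy, in the spirit of the abstract above. The module names, dimensions, and surrogate-reward form are illustrative assumptions, not the authors' code.

    import torch
    import torch.nn as nn

    class RecurrentPolicy(nn.Module):
        # GRU policy: the hidden state carries driving history across timesteps.
        def __init__(self, obs_dim, act_dim, hidden=64):
            super().__init__()
            self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
            self.mean = nn.Linear(hidden, act_dim)

        def forward(self, obs_seq, h=None):
            out, h = self.gru(obs_seq, h)      # obs_seq: (batch, time, obs_dim)
            return self.mean(out), h           # per-step action means

    class Discriminator(nn.Module):
        # Scores (state, action) pairs: high logits for expert-like pairs.
        def __init__(self, obs_dim, act_dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, 1))

        def forward(self, obs, act):
            return self.net(torch.cat([obs, act], dim=-1))

    def surrogate_reward(disc, obs, act):
        # GAIL-style surrogate reward -log(1 - D(s, a)); the recurrent policy
        # is then trained on this reward with an on-policy RL method.
        with torch.no_grad():
            return -torch.log(1.0 - torch.sigmoid(disc(obs, act)) + 1e-8)

Because the reward comes from the discriminator rather than from step-by-step action matching, small trajectory perturbations do not compound the way they do under behavioral cloning.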
Visual Imitation Learning with Recurrent Siamese Networks
It would be desirable for a reinforcement learning (RL) based agent to learn
behaviour by merely watching a demonstration. However, defining rewards that
facilitate this goal within the RL paradigm remains a challenge. Here we
address this problem with Siamese networks, trained to compute distances
between observed behaviours and the agent's behaviours. Given a desired motion,
such Siamese networks can be used to provide a reward signal to an RL agent via
the distance between the desired motion and the agent's motion. We experiment
with an RNN-based comparator model that can compute distances in space and time
between motion clips while training an RL policy to minimize this distance.
Through experimentation, we also found that including multi-task data and an
additional image-encoding loss helps enforce temporal consistency. These two
components appear to balance reward for
matching a specific instance of behaviour versus that behaviour in general.
Furthermore, we focus here on a particularly challenging form of this problem
where only a single demonstration is provided for a given task -- the one-shot
learning setting. We demonstrate our approach on humanoid agents in both 2D
and 3D, each with many degrees of freedom (DoF).
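
As a rough illustration of the reward described above, here is a hypothetical PyTorch sketch of an RNN-based Siamese comparator: the same recurrent encoder embeds the demonstration clip and the agent's clip, and the negative embedding distance serves as the RL reward. All names and dimensions are assumptions; the paper's actual architecture (including the image-encoding loss) is not reproduced here.

    import torch
    import torch.nn as nn

    class SiameseClipEncoder(nn.Module):
        # Per-frame encoder followed by a GRU that summarizes a motion clip.
        def __init__(self, obs_dim, embed_dim=128):
            super().__init__()
            self.frame_enc = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
            self.rnn = nn.GRU(256, embed_dim, batch_first=True)

        def forward(self, clip):               # clip: (batch, time, obs_dim)
            _, h = self.rnn(self.frame_enc(clip))
            return h[-1]                       # final hidden state as embedding

    def imitation_reward(encoder, agent_clip, demo_clip):
        # Reward is the negative distance between agent and demo embeddings,
        # so the agent is rewarded for reproducing the demonstrated motion.
        with torch.no_grad():
            z_agent = encoder(agent_clip)
            z_demo = encoder(demo_clip)
            return -torch.norm(z_agent - z_demo, dim=-1)

Sharing one encoder between the two clips is what makes the comparator Siamese: distances are computed in a single learned embedding space, so a single demonstration suffices to define the reward.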