DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving
Today, there are two major paradigms for vision-based autonomous driving
systems: mediated perception approaches that parse an entire scene to make a
driving decision, and behavior reflex approaches that directly map an input
image to a driving action by a regressor. In this paper, we propose a third
paradigm: a direct perception approach to estimate the affordance for driving.
We propose to map an input image to a small number of key perception indicators
that directly relate to the affordance of a road/traffic state for driving. Our
representation provides a set of compact yet complete descriptions of the scene
to enable a simple controller to drive autonomously. Falling in between the two
extremes of mediated perception and behavior reflex, we argue that our direct
perception representation provides the right level of abstraction. To
demonstrate this, we train a deep Convolutional Neural Network using recordings
from 12 hours of human driving in a video game and show that our model can work
well to drive a car in a very diverse set of virtual environments. We also
train a model for car distance estimation on the KITTI dataset. Results show
that our direct perception approach can generalize well to real driving images.
Source code and data are available on our project website.
Static and Dynamic Affordance Learning in Vision-based Direct Perception for Autonomous Driving
Recent developments in autonomous driving involve high-level computer vision and detailed road scene understanding. Today, most autonomous vehicles use the mediated perception approach for path planning and control, which relies heavily on high-definition 3D maps and real-time sensors. Recent research efforts aim to substitute the massive HD maps with coarse road attributes. In this thesis, we follow the direct perception-based method to train a deep neural network for affordance learning in autonomous driving. The goal and the main contributions of this thesis are twofold. First, we develop an affordance learning model based on freely available Google Street View panoramas and Open Street Map road vector attributes. Driving scene understanding can be achieved by learning affordances from the images captured by car-mounted cameras. Such scene understanding may be useful for corroborating base maps such as HD maps, so that the required data storage space is minimized and the data is available for real-time processing. We compare the road attribute identification capability of human volunteers and the trained model through experimental evaluation. The results indicate that this method could serve as a cheaper way to collect training data for autonomous driving. The cross-validation results also indicate the effectiveness of the trained model. Second, we propose a scalable and affordable data collection framework named I2MAP (image-to-map annotation proximity algorithm) for autonomous driving systems. We built an automated labeling pipeline that covers both vehicle dynamics and static road attributes. The data collected and annotated under our framework is suitable for direct perception and end-to-end imitation learning. Our benchmark consists of 40,000 images with more than 40 affordance labels, covering various times of day and weather conditions, including very challenging heavy snow.
We train and evaluate a ConvNet-based traffic flow prediction model for driver warning and suggestion under low-visibility conditions.
Virtual to Real Reinforcement Learning for Autonomous Driving
Reinforcement learning is considered a promising direction for driving
policy learning. However, training an autonomous driving vehicle with
reinforcement learning in a real environment involves unaffordable
trial and error. It is more desirable to first train in a virtual environment
and then transfer to the real environment. In this paper, we propose a novel
realistic translation network that makes a model trained in a virtual
environment workable in the real world. The proposed network converts
non-realistic virtual image input into a realistic one with similar scene
structure. Given realistic frames as input, a driving policy trained by
reinforcement learning can adapt well to real-world driving. Experiments show
that our proposed virtual-to-real (VR) reinforcement learning (RL) works well.
To our knowledge, this is the first successful case of a driving policy trained
by reinforcement learning that can adapt to real-world driving data.