Event cameras, such as dynamic vision sensors (DVS), and dynamic and
active-pixel vision sensors (DAVIS) can supplement other autonomous driving
sensors by providing a concurrent stream of standard active pixel sensor (APS)
images and DVS temporal contrast events. The APS stream is a sequence of
standard grayscale global-shutter image sensor frames. The DVS events represent
brightness changes occurring at a particular moment, with a jitter of about a
millisecond under most lighting conditions. They have a dynamic range of >120
dB and effective frame rates >1 kHz at data rates comparable to 30 fps
(frames/second) image sensors. To overcome some of the limitations of current
image acquisition technology, we investigate in this work the use of the
combined DVS and APS streams in end-to-end driving applications. The dataset
DDD17 accompanying this paper is the first open dataset of annotated DAVIS
driving recordings. DDD17 has over 12 h of a 346x260 pixel DAVIS sensor
recording highway and city driving in daytime, evening, night, dry and wet
weather conditions, along with vehicle speed, GPS position, driver steering,
throttle, and brake captured from the car's on-board diagnostics interface. As
an example application, we performed a preliminary end-to-end learning study of
using a convolutional neural network that is trained to predict the
instantaneous steering angle from DVS and APS visual data.Comment: Presented at the ICML 2017 Workshop on Machine Learning for
Autonomous Vehicle