Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars
Event cameras are bio-inspired vision sensors that naturally capture the
dynamics of a scene, filtering out redundant information. This paper presents a
deep neural network approach that unlocks the potential of event cameras on a
challenging motion-estimation task: prediction of a vehicle's steering angle.
To make the best of this sensor-algorithm combination, we adapt
state-of-the-art convolutional architectures to the output of event sensors and
extensively evaluate the performance of our approach on a publicly available
large-scale event-camera dataset (~1000 km). We present qualitative and
quantitative explanations of why event cameras allow robust steering prediction
even in cases where traditional cameras fail, e.g. challenging illumination
conditions and fast motion. Finally, we demonstrate the advantages of
leveraging transfer learning from traditional to event-based vision, and show
that our approach outperforms state-of-the-art algorithms based on standard
cameras.

Comment: 9 pages, 8 figures, 6 tables. Video: https://youtu.be/_r_bsjkJTH
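Before a convolutional network can consume an event stream, the asynchronous events are typically accumulated into an image-like tensor. The function below is a minimal sketch of one common encoding (a two-channel polarity histogram); it is an illustrative assumption, not necessarily the exact input representation used in this paper.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate (x, y, t, polarity) events into a 2-channel histogram:
    channel 0 counts positive-polarity events, channel 1 negative ones.
    This is one plausible CNN input encoding, shown for illustration."""
    frame = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, p in events:
        channel = 0 if p > 0 else 1
        frame[channel, y, x] += 1.0
    # Normalize by the peak count so the input scale does not depend
    # on the event rate of the scene.
    peak = frame.max()
    if peak > 0:
        frame /= peak
    return frame
```

The resulting tensor can then be fed to any standard convolutional architecture, which is what makes transfer from frame-based vision possible.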
DDD20 End-to-End Event Camera Driving Dataset: Fusing Frames and Events with Deep Learning for Improved Steering Prediction
Neuromorphic event cameras are useful for dynamic vision problems under
difficult lighting conditions. To enable studies of using event cameras in
automobile driving applications, this paper reports a new end-to-end driving
dataset called DDD20. The dataset was captured with a DAVIS camera that
concurrently streams both dynamic vision sensor (DVS) brightness change events
and active pixel sensor (APS) intensity frames. DDD20 is the longest event
camera end-to-end driving dataset to date with 51h of DAVIS event+frame camera
and vehicle human control data collected from 4000km of highway and urban
driving under a variety of lighting conditions. Using DDD20, we report the
first study of fusing brightness change events and intensity frame data using a
deep learning approach to predict the instantaneous human steering wheel angle.
Over all day and night conditions, the explained variance for human steering
prediction from a Resnet-32 is significantly better with the fused DVS+APS
frames (0.88) than with either DVS (0.67) or APS (0.77) data alone.

Comment: Accepted in The 23rd IEEE International Conference on Intelligent
Transportation Systems (Special Session: Beyond Traditional Sensing for
Intelligent Transportation
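The reported numbers (0.88 vs. 0.67 and 0.77) are explained-variance scores, and fusion here means combining the two DAVIS outputs into one network input. The sketch below shows the metric and a simple early-fusion-by-channel-stacking scheme; the stacking is an illustrative assumption, as DDD20's exact network input may differ.

```python
import numpy as np

def explained_variance(y_true, y_pred):
    """Explained variance of a steering prediction:
    1 - Var(residual) / Var(target). A perfect predictor scores 1.0."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

def fuse_aps_dvs(aps_frame, dvs_frame):
    """Early fusion by channel concatenation (one plausible scheme):
    stack the APS intensity frame and the DVS event frame along a new
    channel axis, yielding a 2-channel CNN input."""
    return np.stack([aps_frame, dvs_frame], axis=0)
```

With this kind of fused input, the network can fall back on events when frames saturate at night and on frames when the scene is static.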
End-to-End Learning of Representations for Asynchronous Event-Based Data
Event cameras are vision sensors that record asynchronous streams of
per-pixel brightness changes, referred to as "events". They have appealing
advantages over frame-based cameras for computer vision, including high
temporal resolution, high dynamic range, and no motion blur. Due to the sparse,
non-uniform spatiotemporal layout of the event signal, pattern recognition
algorithms typically aggregate events into a grid-based representation and
subsequently process it by a standard vision pipeline, e.g., Convolutional
Neural Network (CNN). In this work, we introduce a general framework to convert
event streams into grid-based representations through a sequence of
differentiable operations. Our framework comes with two main advantages: (i)
it allows learning the input event representation together with the
task-dedicated network in an end-to-end manner, and (ii) it lays out a
taxonomy that unifies the majority of extant event representations in the
literature and identifies novel ones. Empirically, we show that our approach to learning the event
representation end-to-end yields an improvement of approximately 12% on optical
flow estimation and object recognition over state-of-the-art methods.

Comment: To appear at ICCV 201
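A grid-based representation built "through a sequence of differentiable operations" can be illustrated with a spatio-temporal voxel grid whose temporal kernel is a fixed triangular (linear-interpolation) function; because every operation is differentiable in the kernel, that fixed kernel can in principle be replaced by a small learned network and trained end-to-end. The triangular kernel below is a standard example for illustration, not the framework's only instantiation.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Convert (x, y, t, polarity) events into a voxel grid of
    num_bins temporal slices using a triangular temporal kernel.
    Each event contributes to its two nearest time bins, weighted
    by proximity; weights vary smoothly with t, so the mapping is
    differentiable and the kernel could be learned instead."""
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    ts = np.array([t for _, _, t, _ in events], dtype=np.float64)
    t0, t1 = ts.min(), ts.max()
    scale = (num_bins - 1) / (t1 - t0) if t1 > t0 else 0.0
    for (x, y, t, p), tn in zip(events, (ts - t0) * scale):
        for b in range(num_bins):
            # Triangular kernel centered on bin b.
            w = max(0.0, 1.0 - abs(b - tn))
            grid[b, y, x] += w * p
    return grid
```

Swapping the `max(0, 1 - |b - tn|)` kernel for an MLP over `(b - tn)` is what turns this fixed representation into a learnable one.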
End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners
For human drivers, having rear and side-view mirrors is vital for safe
driving. They deliver a more complete view of what is happening around the car.
Human drivers also heavily exploit their mental map for navigation.
Nonetheless, several methods have been published that learn driving models with
only a front-facing camera and without a route planner. This lack of
information renders the self-driving task quite intractable. We investigate the
problem in a more realistic setting, which consists of a surround-view camera
system with eight cameras, a route planner, and a CAN bus reader. In
particular, we develop a sensor setup that provides data for a 360-degree view
of the area surrounding the vehicle, the driving route to the destination, and
low-level driving maneuvers (e.g. steering angle and speed) by human drivers.
With such a sensor setup we collect a new driving dataset, covering diverse
driving scenarios and varying weather/illumination conditions. Finally, we
learn a novel driving model by integrating information from the surround-view
cameras and the route planner. Two route planners are exploited: 1) one
representing the planned routes on OpenStreetMap as a stack of GPS coordinates,
and 2) one rendering the planned routes on TomTom Go Mobile and recording the
progression into a video. Our experiments show that: 1) 360-degree
surround-view cameras help avoid failures made with a single front-view camera,
in particular for city driving and intersection scenarios; and 2) route
planners help the driving task significantly, especially for steering angle
prediction.

Comment: to be published at ECCV 201
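Integrating eight surround-view cameras with a route-planner signal implies some fusion step before the maneuver-prediction head. The sketch below shows one hypothetical late-fusion scheme (average the per-camera feature vectors, then concatenate the route feature); the paper's actual architecture may combine these signals differently.

```python
import numpy as np

def fuse_sensor_features(camera_feats, route_feat):
    """Hypothetical late fusion for a surround-view driving model:
    average the eight per-camera CNN feature vectors into one scene
    descriptor, then concatenate the route-planner feature so the
    regression head (steering angle, speed) sees both."""
    cam = np.mean(np.stack(camera_feats, axis=0), axis=0)
    return np.concatenate([cam, route_feat])
```

Averaging keeps the fused descriptor size independent of the number of cameras, which makes it easy to compare single-camera and 360-degree setups with the same head.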