Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars
Event cameras are bio-inspired vision sensors that naturally capture the
dynamics of a scene, filtering out redundant information. This paper presents a
deep neural network approach that unlocks the potential of event cameras on a
challenging motion-estimation task: prediction of a vehicle's steering angle.
To make the best out of this sensor-algorithm combination, we adapt
state-of-the-art convolutional architectures to the output of event sensors and
extensively evaluate the performance of our approach on a publicly available
large-scale event-camera dataset (~1000 km). We present qualitative and
quantitative explanations of why event cameras allow robust steering prediction
even in cases where traditional cameras fail, e.g. challenging illumination
conditions and fast motion. Finally, we demonstrate the advantages of
leveraging transfer learning from traditional to event-based vision, and show
that our approach outperforms state-of-the-art algorithms based on standard
cameras.
Comment: 9 pages, 8 figures, 6 tables. Video: https://youtu.be/_r_bsjkJTH
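As a rough illustration of how an asynchronous event stream can be adapted to a standard convolutional architecture, events are commonly accumulated over a short time window into per-polarity count images. The sketch below assumes events are (x, y, t, polarity) tuples and is not necessarily the paper's exact input representation.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate one time window of events into a 2-channel image:
    channel 0 counts positive (brighter) events, channel 1 negative ones.
    `events` is assumed to be an iterable of (x, y, t, polarity) tuples."""
    frame = np.zeros((2, height, width), dtype=np.float32)
    for x, y, t, polarity in events:
        channel = 0 if polarity > 0 else 1
        frame[channel, y, x] += 1.0
    return frame

# A standard CNN with a 2-channel input layer can then regress the
# steering angle from a sequence of such frames.
```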
Event-Based Algorithms For Geometric Computer Vision
Event cameras are novel bio-inspired sensors which mimic the function of the human retina. Rather than directly capturing intensities to form synchronous images as in traditional cameras, event cameras asynchronously detect changes in log image intensity. When such a change is detected at a given pixel, the change is immediately sent to the host computer, where each event consists of the x,y pixel position of the change, a timestamp, accurate to tens of microseconds, and a polarity, indicating whether the pixel got brighter or darker. These cameras provide a number of useful benefits over traditional cameras, including the ability to track extremely fast motions, high dynamic range, and low power consumption.
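As a concrete illustration, such an event stream can be represented as follows; the field names are illustrative and do not correspond to any specific camera's API.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """One asynchronous brightness-change measurement, as described above."""
    x: int         # pixel column where the change was detected
    y: int         # pixel row where the change was detected
    t: float       # timestamp in seconds, accurate to tens of microseconds
    polarity: int  # +1 if the pixel got brighter, -1 if it got darker

# The sensor output is a time-ordered stream of such events.
stream = [Event(120, 45, 1.000013, +1), Event(121, 45, 1.000027, -1)]
```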
However, with a new sensing modality comes the need to develop novel algorithms. As these cameras do not capture photometric intensities, novel loss functions must be developed to replace the photoconsistency assumption which serves as the backbone of many classical computer vision algorithms. In addition, the relative novelty of these sensors means that there does not exist the wealth of data available for traditional images with which we can train learning based methods such as deep neural networks.
In this work, we address both of these issues with two foundational principles. First, we show that the motion blur induced when the events are projected into the 2D image plane can be used as a suitable substitute for the classical photometric loss function. Second, we develop self-supervised learning methods which allow us to train convolutional neural networks to estimate motion without any labeled training data. We apply these principles to solve classical perception problems such as feature tracking, visual inertial odometry, optical flow and stereo depth estimation, as well as recognition tasks such as object detection and human pose estimation. We show that these solutions are able to utilize the benefits of event cameras, allowing us to operate in fast-moving scenes with challenging lighting which would be incredibly difficult for traditional cameras.
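One widely used instantiation of the motion-blur principle is contrast maximization: warp each event to a common reference time along a candidate motion and score how sharp (unblurred) the resulting event image is, for example by its variance. The sketch below illustrates that principle under those assumptions; it is not the exact loss developed in this work.

```python
import numpy as np

def warped_event_image(events, flow, t_ref, height, width):
    """Project events onto the image plane after warping each one to the
    reference time t_ref along a candidate per-pixel flow (pixels/second).
    `events` is assumed to be an iterable of (x, y, t, polarity) tuples."""
    img = np.zeros((height, width), dtype=np.float32)
    for x, y, t, _ in events:
        dt = t_ref - t
        xw = int(round(x + flow[y, x, 0] * dt))
        yw = int(round(y + flow[y, x, 1] * dt))
        if 0 <= xw < width and 0 <= yw < height:
            img[yw, xw] += 1.0
    return img

def motion_loss(events, flow, t_ref, height, width):
    """Negative variance of the warped-event image: the correct motion
    removes the blur, so the image is sharp and its variance is high."""
    return -np.var(warped_event_image(events, flow, t_ref, height, width))
```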
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues, that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved?
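For reference, the de-facto standard formulation referred to above is maximum a posteriori estimation over a factor graph, which under Gaussian noise assumptions reduces to a nonlinear least-squares problem (notation ours):

```latex
% X collects robot poses and map variables, z_k are measurements with
% measurement models h_k and noise covariances \Sigma_k.
\mathcal{X}^{\star}
  = \arg\max_{\mathcal{X}} \, p(\mathcal{X} \mid \mathcal{Z})
  = \arg\min_{\mathcal{X}} \sum_{k} \left\| h_k(\mathcal{X}_k) - z_k \right\|_{\Sigma_k}^{2}
```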
Estimating general motion and intensity from event cameras
Robotic vision algorithms have become widely used in many consumer products, enabling
technologies such as autonomous vehicles, drones, and augmented reality (AR) and
virtual reality (VR) devices, to name a few. These applications require vision algorithms
to work in real-world environments with extreme lighting variations and fast-moving
objects. However, robotic vision applications often rely on standard video cameras, which
face severe limitations in fast-moving scenes or under bright light sources, where
artefacts like motion blur or over-saturation diminish image quality.
To address these limitations, the body of work presented here investigates the use of
alternative sensor devices which mimic the superior perception properties of human
vision. Such silicon retinas were proposed by neuromorphic engineering, and we focus
here on one such biologically inspired sensor called the event camera which offers a new
camera paradigm for real-time robotic vision. The camera provides a high measurement
rate, low latency, high dynamic range, and low data rate. The signal of the camera is
composed of a stream of asynchronous events at microsecond resolution. Each event
indicates when an individual pixel registers a logarithmic intensity change of a pre-set
threshold size. Using this novel signal has proven to be very challenging in most computer
vision problems since common vision methods require synchronous absolute intensity
information.
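In symbols (notation ours, not the thesis'), each event obeys the generative model below, where C is the pre-set contrast threshold and Δt(x) is the time since the last event at that pixel:

```latex
% An event with polarity p fires at pixel x and time t when the change in
% log intensity since that pixel's previous event reaches the threshold C.
\Delta L(\mathbf{x}, t)
  = \log I(\mathbf{x}, t) - \log I\big(\mathbf{x}, t - \Delta t(\mathbf{x})\big)
  = p\,C, \qquad p \in \{+1, -1\}
```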
In this thesis, we present for the first time a method to reconstruct an image and estimate
motion from an event stream without additional sensing or prior knowledge of the scene.
This method is based on a coupled estimation of both motion and intensity, which enables
our event-based analysis, previously only possible with severe limitations. We also present
the first machine learning algorithm for event-based unsupervised intensity reconstruction
which does not depend on an explicit motion estimation and reveals finer image details.
This learning approach does not rely on event-to-image examples, but learns from standard
camera image examples which are not coupled to the event data. In experiments we show
that the learned reconstruction improves upon our handcrafted approach. Finally, we
combine our learned approach with motion estimation methods and show that the improved
intensity reconstruction also significantly improves the motion estimation results. We hope
our work in this thesis bridges the gap between the event signal and images and that it
opens event cameras to practical solutions that overcome the current limitations of
frame-based cameras in robotic vision.