Asynchronous, Photometric Feature Tracking using Events and Frames
We present a method that leverages the complementarity of event cameras and
standard cameras to track visual features with low latency. Event cameras are
novel sensors that output pixel-level brightness changes, called "events". They
offer significant advantages over standard cameras, namely a very high dynamic
range, no motion blur, and a latency in the order of microseconds. However,
because the same scene pattern can produce different events depending on the
motion direction, establishing event correspondences across time is
challenging. By contrast, standard cameras provide intensity measurements
(frames) that do not depend on motion direction. Our method extracts features
on frames and subsequently tracks them asynchronously using events, thereby
exploiting the best of both types of data: the frames provide a photometric
representation that does not depend on motion direction and the events provide
low-latency updates. In contrast to previous works, which are based on
heuristics, this is the first principled method that uses raw intensity
measurements directly, based on a generative event model within a
maximum-likelihood framework. As a result, our method produces feature tracks
that are both more accurate (subpixel accuracy) and longer than the state of
the art, across a wide variety of scenes. Comment: 22 pages, 15 figures, Video: https://youtu.be/A7UfeUnG6c
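The generative event model the abstract refers to can be illustrated with a minimal sketch: an idealized event camera fires an event each time the log intensity at a pixel changes by a contrast threshold. The threshold value `C`, the linear interpolation of event times, and the function name below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def generate_events(log_I_prev, log_I_curr, t_prev, t_curr, C=0.15):
    """Idealized generative event model: a pixel fires an event whenever its
    log intensity changes by the contrast threshold C. Event times are
    linearly interpolated between the two frames (a simplifying assumption).
    Returns a time-sorted list of (t, x, y, polarity) tuples."""
    events = []
    delta = log_I_curr - log_I_prev
    for y, x in zip(*np.nonzero(np.abs(delta) >= C)):
        n = int(np.abs(delta[y, x]) // C)        # number of threshold crossings
        pol = 1 if delta[y, x] > 0 else -1
        for k in range(1, n + 1):
            # time of the k-th crossing, linearly interpolated
            t = t_prev + (t_curr - t_prev) * (k * C / abs(delta[y, x]))
            events.append((t, x, y, pol))
    return sorted(events)
```

A pixel whose log intensity rises by 0.35 with C = 0.15 crosses the threshold twice and thus emits two positive events; this per-pixel counting is what a maximum-likelihood framework can invert to relate observed events back to intensity.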
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in scenarios that are challenging for traditional cameras,
such as those requiring low latency, high speed, and high dynamic range.
However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
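The event encoding described above, a stream of (time, location, sign) tuples, can be made concrete with a small sketch: summing event polarities over a time window yields a crude "event frame" approximating the brightness change in that window. The tuple layout and function name are assumptions for illustration, not a standard API.

```python
import numpy as np

def accumulate_events(events, height, width, t0, t1):
    """Accumulate event polarities over the window [t0, t1) into a 2D frame.
    events: iterable of (t, x, y, polarity) with polarity in {+1, -1}.
    The resulting integer frame approximates per-pixel brightness change."""
    frame = np.zeros((height, width), dtype=np.int32)
    for t, x, y, p in events:
        if t0 <= t < t1:
            frame[y, x] += p        # +1 for brightness increase, -1 for decrease
    return frame
```

Representations like this are one of the simplest ways to feed asynchronous events into conventional frame-based algorithms.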
CED: Color Event Camera Dataset
Event cameras are novel, bio-inspired visual sensors, whose pixels output
asynchronous and independent timestamped spikes at local intensity changes,
called 'events'. Event cameras offer advantages over conventional frame-based
cameras in terms of latency, high dynamic range (HDR) and temporal resolution.
Until recently, event cameras have been limited to outputting events in the
intensity channel; however, recent advances have resulted in the development of
color event cameras, such as the Color-DAVIS346. In this work, we present and
release the first Color Event Camera Dataset (CED), containing 50 minutes of
footage with both color frames and events. CED features a wide variety of
indoor and outdoor scenes, which we hope will help drive forward event-based
vision research. We also present an extension of the event camera simulator
ESIM that enables simulation of color events. Finally, we present an evaluation
of three state-of-the-art image reconstruction methods that can be used to
convert the Color-DAVIS346 into a continuous-time, HDR, color video camera to
visualise the event stream, and for use in downstream vision applications. Comment: Conference on Computer Vision and Pattern Recognition Workshop
EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras
Event-based cameras have shown great promise in a variety of situations where
frame-based cameras suffer, such as high-speed motions and high dynamic range
scenes. However, processing event measurements requires a new class of
hand-crafted algorithms. Deep learning has shown great success in providing
model-free solutions to many problems in the vision community, but existing
networks have been developed with frame-based images in mind, and there does
not exist the wealth of labeled data for events that there does for images for
supervised training. To address these points, we present EV-FlowNet, a novel
self-supervised deep learning pipeline for optical flow estimation for
event-based cameras. In particular, we introduce an image-based representation of a
given event stream, which is fed into a self-supervised neural network as the
sole input. The corresponding grayscale images captured from the same camera at
the same time as the events are then used as a supervisory signal to provide a
loss function at training time, given the estimated flow from the network. We
show that the resulting network is able to accurately predict optical flow from
events only in a variety of different scenes, with performance competitive with
image-based networks. This method not only allows for accurate estimation of
dense optical flow, but also provides a framework for the transfer of other
self-supervised methods to the event-based domain. Comment: 9 pages, 5 figures, 1 table. Accompanying video:
https://youtu.be/eMHZBSoq0sE. Dataset:
https://daniilidis-group.github.io/mvsec/, Robotics: Science and Systems 201
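The image-based representation of the event stream that EV-FlowNet feeds to its network can be sketched in the same spirit: per-pixel counts of positive and negative events, plus the most recent timestamp of each polarity. The channel layout below is a simplified assumption; the paper additionally normalizes the timestamp channels.

```python
import numpy as np

def events_to_image(events, height, width):
    """Build a 4-channel image from a time-ordered event stream:
    channel 0/1: per-pixel counts of positive/negative events,
    channel 2/3: most recent positive/negative event timestamp per pixel.
    (Illustrative sketch; not the paper's exact normalized encoding.)"""
    img = np.zeros((4, height, width), dtype=np.float32)
    for t, x, y, p in events:          # assumes events sorted by time
        if p > 0:
            img[0, y, x] += 1          # positive event count
            img[2, y, x] = t           # latest positive timestamp
        else:
            img[1, y, x] += 1          # negative event count
            img[3, y, x] = t           # latest negative timestamp
    return img
```

Packing counts and timestamps into fixed-size channels is what lets a standard convolutional network consume an asynchronous, variable-length event stream as its sole input.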
Event-based Simultaneous Localization and Mapping: A Comprehensive Survey
In recent decades, visual simultaneous localization and mapping (vSLAM) has
gained significant interest in both academia and industry. It estimates camera
motion and reconstructs the environment concurrently using visual sensors on a
moving robot. However, conventional cameras suffer from hardware limitations
such as motion blur and low dynamic range, which can degrade performance in
challenging scenarios like high-speed motion and high dynamic range
illumination. Recent studies have demonstrated that event cameras, a new type
of bio-inspired visual sensor, offer advantages such as high temporal
resolution, high dynamic range, low power consumption, and low latency. This paper
presents a timely and comprehensive review of event-based vSLAM algorithms that
exploit the benefits of asynchronous and irregular event streams for
localization and mapping tasks. The review covers the working principle of
event cameras and various event representations for preprocessing event data.
It also categorizes event-based vSLAM methods into four main categories:
feature-based, direct, motion-compensation, and deep learning methods, with
detailed discussions and practical guidance for each approach. Furthermore, the
paper evaluates the state-of-the-art methods on various benchmarks,
highlighting current challenges and future opportunities in this emerging
research area. A public repository will be maintained to keep track of the
rapid developments in this field at
https://github.com/kun150kun/ESLAM-survey
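Among the event representations used for preprocessing that the survey covers, one widely used example is the time surface: each pixel stores an exponential decay of the time elapsed since its most recent event. A minimal sketch follows; the decay constant `tau` and the function name are assumed for illustration, not taken from the survey.

```python
import numpy as np

def time_surface(events, height, width, t_ref, tau=0.05):
    """Time surface at reference time t_ref: exp((t_last - t_ref) / tau)
    per pixel, where t_last is the pixel's most recent event timestamp.
    Pixels that never fired are set to 0. Assumes a time-ordered stream."""
    last_t = np.full((height, width), -np.inf)
    for t, x, y, p in events:          # polarity p is ignored in this sketch
        last_t[y, x] = t
    surface = np.exp((last_t - t_ref) / tau)
    surface[~np.isfinite(last_t)] = 0.0   # pixels with no events
    return surface
```

Because recently active pixels decay toward 1 and stale pixels toward 0, time surfaces give feature-based vSLAM front-ends a gradient-rich, frame-like input while preserving fine temporal ordering.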
EventCap: Monocular 3D Capture of High-Speed Human Motions using an Event Camera
A high frame rate is a critical requirement for capturing fast human motions.
In this setting, existing markerless image-based methods are constrained by
the lighting requirement, the high data bandwidth and the consequent high
computation overhead. In this paper, we propose EventCap, the first approach
for 3D capture of high-speed human motions using a single event camera. Our
method combines model-based optimization and CNN-based human pose detection to
capture high-frequency motion details and to reduce drift in the tracking. As
a result, we can capture fast motions at millisecond resolution with
significantly higher data efficiency than using high frame rate videos.
Experiments on our new event-based fast human motion dataset demonstrate the
effectiveness and accuracy of our method, as well as its robustness to
challenging lighting conditions.