3,217 research outputs found
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
Independent Motion Detection with Event-driven Cameras
Unlike standard cameras that send intensity images at a constant frame rate,
event-driven cameras asynchronously report pixel-level brightness changes,
offering low latency and high temporal resolution (both in the order of
micro-seconds). As such, they have great potential for fast and low power
vision algorithms for robots. Visual tracking, for example, is easily achieved
even for very fast stimuli, as only moving objects cause brightness changes.
However, cameras mounted on a moving robot are typically non-stationary and the
same tracking problem becomes confounded by background clutter events due to
the robot ego-motion. In this paper, we propose a method for segmenting the
motion of an independently moving object for event-driven cameras. Our method
detects and tracks corners in the event stream and learns the statistics of
their motion as a function of the robot's joint velocities when no
independently moving objects are present. During robot operation, independently
moving objects are identified by discrepancies between the predicted corner
velocities from ego-motion and the measured corner velocities. We validate the
algorithm on data collected from the neuromorphic iCub robot. We achieve a
precision of ~ 90 % and show that the method is robust to changes in speed of
both the head and the target.Comment: 7 pages, 6 figure
DART: Distribution Aware Retinal Transform for Event-based Cameras
We introduce a generic visual descriptor, termed as distribution aware
retinal transform (DART), that encodes the structural context using log-polar
grids for event cameras. The DART descriptor is applied to four different
problems, namely object classification, tracking, detection and feature
matching: (1) The DART features are directly employed as local descriptors in a
bag-of-features classification framework and testing is carried out on four
standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS,
NCaltech-101). (2) Extending the classification system, tracking is
demonstrated using two key novelties: (i) For overcoming the low-sample problem
for the one-shot learning of a binary classifier, statistical bootstrapping is
leveraged with online learning; (ii) To achieve tracker robustness, the scale
and rotation equivariance property of the DART descriptors is exploited for the
one-shot learning. (3) To solve the long-term object tracking problem, an
object detector is designed using the principle of cluster majority voting. The
detection scheme is then combined with the tracker to result in a high
intersection-over-union score with augmented ground truth annotations on the
publicly available event camera dataset. (4) Finally, the event context encoded
by DART greatly simplifies the feature correspondence problem, especially for
spatio-temporal slices far apart in time, which has not been explicitly tackled
in the event-based vision domain.Comment: 12 pages, revision submitted to TPAMI in Nov 201
CED: Color Event Camera Dataset
Event cameras are novel, bio-inspired visual sensors, whose pixels output
asynchronous and independent timestamped spikes at local intensity changes,
called 'events'. Event cameras offer advantages over conventional frame-based
cameras in terms of latency, high dynamic range (HDR) and temporal resolution.
Until recently, event cameras have been limited to outputting events in the
intensity channel, however, recent advances have resulted in the development of
color event cameras, such as the Color-DAVIS346. In this work, we present and
release the first Color Event Camera Dataset (CED), containing 50 minutes of
footage with both color frames and events. CED features a wide variety of
indoor and outdoor scenes, which we hope will help drive forward event-based
vision research. We also present an extension of the event camera simulator
ESIM that enables simulation of color events. Finally, we present an evaluation
of three state-of-the-art image reconstruction methods that can be used to
convert the Color-DAVIS346 into a continuous-time, HDR, color video camera to
visualise the event stream, and for use in downstream vision applications.Comment: Conference on Computer Vision and Pattern Recognition Workshop
Asynchronous Corner Tracking Algorithm based on Lifetime of Events for DAVIS Cameras
Event cameras, i.e., the Dynamic and Active-pixel Vision Sensor (DAVIS) ones,
capture the intensity changes in the scene and generates a stream of events in
an asynchronous fashion. The output rate of such cameras can reach up to 10
million events per second in high dynamic environments. DAVIS cameras use novel
vision sensors that mimic human eyes. Their attractive attributes, such as high
output rate, High Dynamic Range (HDR), and high pixel bandwidth, make them an
ideal solution for applications that require high-frequency tracking. Moreover,
applications that operate in challenging lighting scenarios can exploit the
high HDR of event cameras, i.e., 140 dB compared to 60 dB of traditional
cameras. In this paper, a novel asynchronous corner tracking method is proposed
that uses both events and intensity images captured by a DAVIS camera. The
Harris algorithm is used to extract features, i.e., frame-corners from
keyframes, i.e., intensity images. Afterward, a matching algorithm is used to
extract event-corners from the stream of events. Events are solely used to
perform asynchronous tracking until the next keyframe is captured. Neighboring
events, within a window size of 5x5 pixels around the event-corner, are used to
calculate the velocity and direction of extracted event-corners by fitting the
2D planar using a randomized Hough transform algorithm. Experimental evaluation
showed that our approach is able to update the location of the extracted
corners up to 100 times during the blind time of traditional cameras, i.e.,
between two consecutive intensity images.Comment: Accepted to 15th International Symposium on Visual Computing
(ISVC2020
A distributed camera system for multi-resolution surveillance
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor.
Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database.
Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table.
We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating
under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility to multi-camera systems for intelligent surveillance
- …