Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz), resulting in
reduced motion blur. Hence, event cameras have great potential for robotics
and computer vision in scenarios that are challenging for traditional cameras,
such as those demanding low latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
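For readers new to the data format, the sketch below illustrates the event model the survey describes: each event carries a timestamp, a pixel location, and the sign of the brightness change, and an idealized pixel fires whenever its log-intensity change exceeds a contrast threshold. The Event fields and the threshold value C are illustrative, not taken from any particular sensor.

    from dataclasses import dataclass
    from typing import List
    import numpy as np

    @dataclass
    class Event:
        """One event: timestamp (seconds), pixel location, and sign of the brightness change."""
        t: float
        x: int
        y: int
        polarity: int  # +1 for a brightness increase, -1 for a decrease

    def emit_events(log_I_prev: np.ndarray, log_I_now: np.ndarray,
                    t_now: float, C: float = 0.2) -> List[Event]:
        """Idealized trigger rule: a pixel fires an event whenever its
        log-intensity change since its last event exceeds the contrast
        threshold C (value illustrative)."""
        events = []
        diff = log_I_now - log_I_prev
        ys, xs = np.nonzero(np.abs(diff) >= C)
        for y, x in zip(ys, xs):
            events.append(Event(t=t_now, x=int(x), y=int(y),
                                polarity=int(np.sign(diff[y, x]))))
        return events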
Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars
Event cameras are bio-inspired vision sensors that naturally capture the
dynamics of a scene, filtering out redundant information. This paper presents a
deep neural network approach that unlocks the potential of event cameras on a
challenging motion-estimation task: prediction of a vehicle's steering angle.
To make the best of this sensor-algorithm combination, we adapt
state-of-the-art convolutional architectures to the output of event sensors and
extensively evaluate the performance of our approach on a publicly available
large-scale event-camera dataset (~1000 km). We present qualitative and
quantitative explanations of why event cameras allow robust steering prediction
even in cases where traditional cameras fail, e.g. challenging illumination
conditions and fast motion. Finally, we demonstrate the advantages of
leveraging transfer learning from traditional to event-based vision, and show
that our approach outperforms state-of-the-art algorithms based on standard
cameras.
Comment: 9 pages, 8 figures, 6 tables. Video: https://youtu.be/_r_bsjkJTH
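One common way to feed event data to a convolutional network, as mentioned above, is to accumulate a short temporal slice of events into a fixed-size grid. The sketch below shows a simple two-channel count image; it only illustrates the idea and is not necessarily the exact representation used in the paper.

    import numpy as np

    def events_to_frame(events, height, width):
        """Accumulate a temporal slice of events into a 2-channel count image
        (positive / negative polarity) that a standard CNN can ingest. One common
        event-to-grid conversion; not necessarily the paper's exact representation."""
        frame = np.zeros((2, height, width), dtype=np.float32)
        for t, x, y, polarity in events:      # events as (t, x, y, polarity) tuples
            channel = 0 if polarity > 0 else 1
            frame[channel, y, x] += 1.0
        return frame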
The Event-Camera Dataset and Simulator: Event-based Data for Pose Estimation, Visual Odometry, and SLAM
New vision sensors, such as the Dynamic and Active-pixel Vision sensor
(DAVIS), incorporate a conventional global-shutter camera and an event-based
sensor in the same pixel array. These sensors have great potential for
high-speed robotics and computer vision because they allow us to combine the
benefits of conventional cameras with those of event-based sensors: low
latency, high temporal resolution, and very high dynamic range. However, new
algorithms are required to exploit the sensor characteristics and cope with its
unconventional output, which consists of a stream of asynchronous brightness
changes (called "events") and synchronous grayscale frames. For this purpose,
we present and release a collection of datasets captured with a DAVIS in a
variety of synthetic and real environments, which we hope will motivate
research on new algorithms for high-speed and high-dynamic-range robotics and
computer-vision applications. In addition to global-shutter intensity images
and asynchronous events, we provide inertial measurements and ground-truth
camera poses from a motion-capture system. The latter allows comparing the pose
accuracy of ego-motion estimation algorithms quantitatively. All the data are
released both as standard text files and binary files (i.e., rosbag). This
paper provides an overview of the available data and describes a simulator that
we release open-source to create synthetic event-camera data.
Comment: 7 pages, 4 figures, 3 tables
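As a rough illustration of how such a dataset might be consumed, the snippet below loads events from one of the plain-text files, assuming the common one-event-per-line layout of timestamp, x, y, polarity; the actual column order and units should be checked against the dataset documentation.

    import numpy as np

    def load_events_txt(path):
        """Load events from a plain-text file, assuming one event per line in the
        order `timestamp x y polarity` (check the dataset README for the exact
        column order and units)."""
        data = np.loadtxt(path)                       # shape: (num_events, 4)
        t = data[:, 0]
        x = data[:, 1].astype(int)
        y = data[:, 2].astype(int)
        p = data[:, 3].astype(int)
        return t, x, y, p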
From Vision Sensor to Actuators, Spike Based Robot Control through Address-Event-Representation
Neuroinformatics is a field of neuroscience whose aim is to
develop auto-reconfigurable systems that mimic the human body and brain. In
this paper we present a neuro-inspired, spike-based mobile robot that spans the
whole pipeline: cheap commercial vision sensors whose output is converted into
spike information, spike filtering for object recognition, and spike-based
motor control models. A two-wheeled mobile robot powered by DC motors is
autonomously controlled to follow a line drawn on the floor. The spiking system
is built around the well-known Address-Event-Representation (AER) mechanism,
which is used to communicate between the different neuro-inspired layers of the
system. The RTC lab has developed all the components presented in this work,
from the vision sensor to the robot platform and the FPGA-based platforms for
AER processing.
Ministerio de Ciencia e Innovación TEC2006-11730-C03-02; Junta de Andalucía P06-TIC-0141
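The sketch below illustrates the idea behind Address-Event Representation: each spike is transmitted as nothing more than the address of the unit that fired, with timing implicit in when the word appears on the bus. The bit layout used here is purely illustrative and does not correspond to any specific hardware in the paper.

    def encode_aer(x, y, polarity):
        """Pack one spike into a single AER word: only the address of the unit that
        fired is sent; timing is implicit in when the word appears on the bus.
        The 8-bit x/y fields and 1-bit polarity are illustrative, not the layout
        of any specific hardware."""
        return ((y & 0xFF) << 9) | ((x & 0xFF) << 1) | (polarity & 0x1)

    def decode_aer(word):
        """Inverse of encode_aer."""
        polarity = word & 0x1
        x = (word >> 1) & 0xFF
        y = (word >> 9) & 0xFF
        return x, y, polarity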
A USB3.0 FPGA Event-based Filtering and Tracking Framework for Dynamic Vision Sensors
Dynamic vision sensors (DVS) are frame-free sensors
with an asynchronous variable-rate output that is ideal for hard
real-time dynamic vision applications under power and latency
constraints. Post-processing of the digital sensor output can
reduce sensor noise, extract low level features, and track objects
using simple algorithms that have previously been implemented
in software. In this paper we present an FPGA-based framework
for event-based processing that allows uncorrelated-event noise
removal and real-time tracking of multiple objects, with the
ability to adapt dynamically to fast or slow and large or small
objects. The framework uses a new hardware platform, based on
a Lattice FPGA, that filters the sensor output and then
transmits the results through a SuperSpeed Cypress FX3 USB
microcontroller interface to a host computer. Packets of
events and timestamps are transmitted to the host computer at
rates of 10 mega-events per second. Experimental results
demonstrate a low latency of 10 μs for tracking and computing
the center of mass of a detected object.
Ministerio de Economía y Competitividad TEC2012-37868-C04-0
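A software sketch of the kind of uncorrelated-event noise filter described above is shown below: an event is kept only if a pixel in its neighbourhood produced an event within a short time window. The window length and neighbourhood size are illustrative; the paper implements this class of filter in FPGA logic.

    import numpy as np

    def background_activity_filter(events, height, width, dt=10_000):
        """Keep an event only if some pixel in its 3x3 neighbourhood produced an
        event within the last `dt` microseconds; isolated (uncorrelated) events
        are discarded. Window and neighbourhood sizes are illustrative."""
        last_ts = np.full((height, width), -np.inf)
        kept = []
        for t, x, y, p in events:             # events assumed time-ordered
            y0, y1 = max(0, y - 1), min(height, y + 2)
            x0, x1 = max(0, x - 1), min(width, x + 2)
            if (t - last_ts[y0:y1, x0:x1]).min() <= dt:
                kept.append((t, x, y, p))
            last_ts[y, x] = t
        return kept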
End-to-End Learning of Representations for Asynchronous Event-Based Data
Event cameras are vision sensors that record asynchronous streams of
per-pixel brightness changes, referred to as "events". They have appealing
advantages over frame-based cameras for computer vision, including high
temporal resolution, high dynamic range, and no motion blur. Due to the sparse,
non-uniform spatiotemporal layout of the event signal, pattern recognition
algorithms typically aggregate events into a grid-based representation and
subsequently process it by a standard vision pipeline, e.g., Convolutional
Neural Network (CNN). In this work, we introduce a general framework to convert
event streams into grid-based representations through a sequence of
differentiable operations. Our framework comes with two main advantages: (i) it
allows learning the input event representation together with the task-dedicated
network in an end-to-end manner, and (ii) it lays out a taxonomy that unifies the
majority of extant event representations in the literature and identifies novel
ones. Empirically, we show that our approach to learning the event
representation end-to-end yields an improvement of approximately 12% on optical
flow estimation and object recognition over state-of-the-art methods.
Comment: To appear at ICCV 201
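The snippet below sketches one member of the taxonomy mentioned above: a voxel grid built with a bilinear temporal kernel. In the paper the kernel is learned end-to-end together with the task network; the fixed kernel here is only a stand-in to illustrate the grid conversion.

    import numpy as np

    def events_to_voxel_grid(t, x, y, p, num_bins, height, width):
        """Accumulate events into a (num_bins, H, W) grid using a bilinear
        (triangular) temporal kernel. In the paper the kernel is learned jointly
        with the task network; the fixed kernel here only illustrates the
        grid conversion. x, y are integer pixel coordinates, t is sorted."""
        grid = np.zeros((num_bins, height, width), dtype=np.float32)
        t_norm = (t - t[0]) / max(t[-1] - t[0], 1e-9) * (num_bins - 1)
        for b in range(num_bins):
            weight = np.maximum(0.0, 1.0 - np.abs(t_norm - b))   # bilinear kernel
            np.add.at(grid[b], (y, x), p * weight)
        return grid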
Spiking row-by-row FPGA Multi-kernel and Multi-layer Convolution Processor.
Spiking convolutional neural networks (SCNNs) have become
a novel approach for machine vision tasks, owing to the low latency with
which they process an input stimulus from a scene and to their low power
consumption. Event-based systems perform only sum operations, instead of
the sum-of-products operations of frame-based systems. In this work, an
upgrade of a neuromorphic event-based convolution accelerator for SCNNs,
which is able to process multiple layers with different kernel sizes, is
presented. The system has a per-layer latency from 1.44 μs to 9.98 μs for
kernel sizes from 1x1 to 7x7.
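The claim that event-based systems need only sums can be made concrete with the sketch below: each incoming spike adds the kernel into the membrane potentials around its location, and neurons that cross a threshold fire and reset. Sizes, threshold, and reset scheme are illustrative and do not reproduce the accelerator's row-by-row hardware scheduling.

    import numpy as np

    def event_driven_conv(spikes, kernel, height, width, threshold=1.0):
        """Event-driven convolution sketch: every incoming spike just *adds* the
        kernel into the membrane potentials around its location (no multiply-
        accumulate over a full frame), and neurons that cross the threshold fire
        and reset. Kernel must be square with odd side length."""
        k = kernel.shape[0] // 2
        potential = np.zeros((height, width), dtype=np.float32)
        out_spikes = []
        for t, x, y in spikes:
            y0, y1 = max(0, y - k), min(height, y + k + 1)
            x0, x1 = max(0, x - k), min(width, x + k + 1)
            potential[y0:y1, x0:x1] += kernel[y0 - y + k:y1 - y + k,
                                              x0 - x + k:x1 - x + k]
            for fy, fx in np.argwhere(potential >= threshold):
                out_spikes.append((t, int(fx), int(fy)))
                potential[fy, fx] = 0.0   # reset after firing
        return out_spikes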
CED: Color Event Camera Dataset
Event cameras are novel, bio-inspired visual sensors, whose pixels output
asynchronous and independent timestamped spikes at local intensity changes,
called 'events'. Event cameras offer advantages over conventional frame-based
cameras in terms of latency, high dynamic range (HDR) and temporal resolution.
Until recently, event cameras have been limited to outputting events in the
intensity channel, however, recent advances have resulted in the development of
color event cameras, such as the Color-DAVIS346. In this work, we present and
release the first Color Event Camera Dataset (CED), containing 50 minutes of
footage with both color frames and events. CED features a wide variety of
indoor and outdoor scenes, which we hope will help drive forward event-based
vision research. We also present an extension of the event camera simulator
ESIM that enables simulation of color events. Finally, we present an evaluation
of three state-of-the-art image reconstruction methods that can be used to
convert the Color-DAVIS346 into a continuous-time, HDR, color video camera to
visualise the event stream, and for use in downstream vision applications.
Comment: Conference on Computer Vision and Pattern Recognition Workshop
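As a small illustration of working with colour events, the snippet below groups events by the colour filter covering their pixel. A 2x2 RGGB Bayer layout is assumed purely for illustration; the Color-DAVIS346's actual filter arrangement should be taken from the sensor documentation.

    import numpy as np

    def split_events_by_color(x, y, p, pattern=("R", "G", "G", "B")):
        """Group events by the colour filter covering their pixel, assuming a 2x2
        RGGB Bayer layout (illustrative only; consult the sensor documentation
        for the Color-DAVIS346's actual filter arrangement)."""
        names = np.array(pattern).reshape(2, 2)
        channels = {c: [] for c in set(pattern)}
        for xi, yi, pi in zip(x, y, p):
            channels[names[yi % 2, xi % 2]].append((xi, yi, pi))
        return channels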
