6 research outputs found
Active Perception with Dynamic Vision Sensors. Minimum Saccades with Optimum Recognition
Vision processing with Dynamic Vision Sensors
(DVS) is becoming increasingly popular. This type of bio-inspired
vision sensor does not record static scenes. DVS pixel activity
relies on changes in light intensity. In this paper, we introduce
a platform for object recognition with a DVS in which the
sensor is installed on a moving pan-tilt unit in a closed loop with
a recognition neural network. This neural network is trained
to recognize objects observed by a DVS while the pan-tilt unit
is moved to emulate micro-saccades. We show that performing
more saccades in different directions can result in having more
information about the object and therefore more accurate object
recognition is possible. However, in high-performance, low-latency
platforms, performing additional saccades adds
latency and power consumption. Here, we show that the number
of saccades can be reduced while keeping the same recognition
accuracy by performing intelligent saccadic movements, in a
closed action-perception smart loop. We propose an algorithm
for smart saccadic movement decisions that can reduce the
number of necessary saccades to half, on average, for a predefined
accuracy on the N-MNIST dataset. Additionally, we show that
by replacing this control algorithm with an Artificial Neural
Network that learns to control the saccades, we can also reduce
to half the average number of saccades needed for N-MNIST
recognition.

Funding: EU H2020 grant 644096 (ECOMODE); EU H2020 grant 687299 (NEURAM3); Ministry of Economy and Competitiveness (Spain) / European Regional Development Fund, grant TEC2015-63884-C2-1-P (COGNET).
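The closed action-perception loop described in this abstract can be sketched in a few lines. The sketch below is purely illustrative and is not the paper's implementation: the functions `acquire_events`, `classify`, and `choose_saccade` are hypothetical stand-ins for the DVS/pan-tilt hardware, the recognition network, and the smart saccade controller, respectively. The core idea it demonstrates is stopping as soon as the classifier is confident, so that easy inputs cost fewer saccades.

```python
import random

def recognize_with_min_saccades(acquire_events, classify, choose_saccade,
                                max_saccades=8, threshold=0.9):
    """Saccade until the classifier is confident; return (label, n_saccades).

    All three callables are placeholders for the real system:
      acquire_events(direction) -> DVS events produced by one micro-saccade
      classify(events)          -> (label, confidence)
      choose_saccade(events)    -> direction of the next saccade
    """
    events, label = [], None
    for n in range(1, max_saccades + 1):
        direction = choose_saccade(events)        # "smart" decision step
        events.extend(acquire_events(direction))  # DVS pixels fire on motion
        label, confidence = classify(events)
        if confidence >= threshold:               # stop as early as possible
            return label, n
    return label, max_saccades

# Toy stand-ins: confidence grows as events accumulate over saccades.
random.seed(0)
fake_acquire = lambda d: [(d, random.random()) for _ in range(10)]
fake_classify = lambda ev: ("7", min(1.0, 0.3 + 0.25 * (len(ev) // 10)))
fake_choose = lambda ev: ["N", "E", "S", "W"][len(ev) // 10 % 4]

label, n = recognize_with_min_saccades(fake_acquire, fake_classify, fake_choose)
print(label, n)  # stops well before max_saccades
```

With these toy stand-ins the loop terminates after three saccades instead of the maximum eight, which is the behavior the paper's controller aims for on real N-MNIST inputs.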
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
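The event stream described in this abstract has a simple structure: each event encodes the time, pixel location, and sign (polarity) of a brightness change. A minimal sketch, with illustrative names not taken from any particular camera driver:

```python
from collections import namedtuple

# Each DVS event: microsecond timestamp, pixel coordinates, and the
# sign of the brightness change (+1 increase, -1 decrease).
Event = namedtuple("Event", ["t_us", "x", "y", "polarity"])

def events_in_window(events, t_start_us, t_end_us):
    """Slice an asynchronous event stream by its microsecond timestamps."""
    return [e for e in events if t_start_us <= e.t_us < t_end_us]

stream = [
    Event(12, 10, 20, +1),  # brightness rose at pixel (10, 20) at t=12 us
    Event(15, 11, 20, -1),  # brightness fell at pixel (11, 20) at t=15 us
    Event(40, 10, 21, +1),
]
window = events_in_window(stream, 0, 30)
print(len(window))  # → 2
```

The microsecond timestamps are what give event cameras their high temporal resolution; most processing methods discussed in the survey start from exactly this kind of time-windowed slice of the stream.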