Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
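
As a minimal sketch of the event stream described above (each event carrying a time, a pixel location and a sign), the following Python snippet models events and sums their polarities per pixel over a time window. The `Event` type, the `accumulate` helper and the sample values are illustrative assumptions, not an API or representation taken from the survey.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One event: hypothetical field names chosen for illustration."""
    t_us: int      # timestamp in microseconds
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease


def accumulate(events: List[Event], width: int, height: int,
               t_start_us: int, t_end_us: int) -> List[List[int]]:
    """Sum event polarities per pixel over the window [t_start_us, t_end_us)."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        if t_start_us <= ev.t_us < t_end_us:
            frame[ev.y][ev.x] += ev.polarity
    return frame


# Made-up sample stream on a 2x2 sensor.
events = [
    Event(10, 0, 0, +1),
    Event(25, 1, 0, -1),
    Event(40, 0, 0, +1),
    Event(120, 1, 1, +1),  # falls outside the window used below
]
frame = accumulate(events, width=2, height=2, t_start_us=0, t_end_us=100)
print(frame)  # [[2, -1], [0, 0]]
```

Accumulating into a frame is only one of several ways to handle events; it discards the fine timing within the window, which is one reason the abstract calls for methods designed for the asynchronous output.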
DART: Distribution Aware Retinal Transform for Event-based Cameras
We introduce a generic visual descriptor, termed the distribution-aware
retinal transform (DART), that encodes the structural context using log-polar
grids for event cameras. The DART descriptor is applied to four different
problems, namely object classification, tracking, detection and feature
matching: (1) The DART features are directly employed as local descriptors in a
bag-of-features classification framework and testing is carried out on four
standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS,
NCaltech-101). (2) Extending the classification system, tracking is
demonstrated using two key novelties: (i) For overcoming the low-sample problem
for the one-shot learning of a binary classifier, statistical bootstrapping is
leveraged with online learning; (ii) To achieve tracker robustness, the scale
and rotation equivariance property of the DART descriptors is exploited for the
one-shot learning. (3) To solve the long-term object tracking problem, an
object detector is designed using the principle of cluster majority voting. The
detection scheme is then combined with the tracker to result in a high
intersection-over-union score with augmented ground truth annotations on the
publicly available event camera dataset. (4) Finally, the event context encoded
by DART greatly simplifies the feature correspondence problem, especially for
spatio-temporal slices far apart in time, which has not been explicitly tackled
in the event-based vision domain.
Comment: 12 pages, revision submitted to TPAMI in Nov 201
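
To illustrate what encoding structural context on a log-polar grid can look like, here is a rough Python sketch that bins the neighbours of a centre event into rings (logarithmic in radius) and wedges (uniform in angle). The function names, grid sizes and radii are assumptions made for this example; this is not the DART implementation.

```python
import math


def log_polar_bin(dx, dy, n_rings, n_wedges, r_min, r_max):
    """Return (ring, wedge) for the offset (dx, dy), or None if outside the grid."""
    r = math.hypot(dx, dy)
    if r < r_min or r >= r_max:
        return None
    # Ring index grows with log(r), so outer rings cover wider radial spans.
    ring = int(n_rings * math.log(r / r_min) / math.log(r_max / r_min))
    ring = min(ring, n_rings - 1)  # guard against floating-point round-up
    theta = math.atan2(dy, dx) % (2 * math.pi)
    wedge = int(theta / (2 * math.pi) * n_wedges) % n_wedges
    return ring, wedge


def log_polar_histogram(center, neighbours, n_rings=3, n_wedges=8,
                        r_min=1.0, r_max=8.0):
    """Normalized count of neighbouring events per log-polar cell."""
    hist = [[0] * n_wedges for _ in range(n_rings)]
    cx, cy = center
    for x, y in neighbours:
        cell = log_polar_bin(x - cx, y - cy, n_rings, n_wedges, r_min, r_max)
        if cell is not None:
            hist[cell[0]][cell[1]] += 1
    total = sum(sum(row) for row in hist)
    return [[c / total if total else 0.0 for c in row] for row in hist]


# Made-up neighbourhood around a centre event at the origin.
hist = log_polar_histogram((0, 0), [(3, 1), (-3, -1), (1, 5), (10, 0)])
```

On such a grid, scaling the pattern shifts counts between rings and rotating it shifts counts between wedges, which is the kind of property the abstract refers to as scale and rotation equivariance.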
PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras
We present the first purely event-based, energy-efficient approach for object
detection and categorization using an event camera. Compared to traditional
frame-based cameras, choosing event cameras results in high temporal resolution
(order of microseconds), low power consumption (few hundred mW) and wide
dynamic range (120 dB) as attractive properties. However, event-based object
recognition systems are far behind their frame-based counterparts in terms of
accuracy. To this end, this paper presents an event-based feature extraction
method devised by accumulating local activity across the image frame and then
applying principal component analysis (PCA) to the normalized neighborhood
region. Subsequently, we propose a backtracking-free k-d tree mechanism for
efficient feature matching by taking advantage of the low-dimensionality of the
feature representation. Additionally, the proposed k-d tree mechanism allows
for feature selection to obtain a lower-dimensional dictionary representation
when hardware resources are limited to implement dimensionality reduction.
Consequently, the proposed system can be realized on a field-programmable gate
array (FPGA) device leading to high performance over resource ratio. The
proposed system is tested on real-world event-based datasets for object
categorization, showing superior classification performance and relevance to
state-of-the-art algorithms. Additionally, we verified the object detection
method and real-time FPGA performance in lab settings under non-controlled
illumination conditions with limited training data and ground truth
annotations.
Comment: Accepted in ACCV 2018 Workshops, to appea
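
The feature extraction outlined above (accumulate local activity, normalize a neighbourhood, reduce its dimension) could be sketched roughly as follows. The unit-sum normalization, the zero padding at the borders and the hand-written one-row basis are assumptions for illustration; in practice the mean and basis would come from a PCA fitted offline, which this sketch does not perform.

```python
def count_map(events, width, height):
    """Accumulate the number of events seen at each pixel."""
    counts = [[0] * width for _ in range(height)]
    for x, y in events:
        counts[y][x] += 1
    return counts


def normalized_patch(counts, cx, cy, radius):
    """Flatten the (2r+1) x (2r+1) neighbourhood of (cx, cy), scaled to unit sum."""
    height, width = len(counts), len(counts[0])
    patch = []
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            inside = 0 <= x < width and 0 <= y < height
            patch.append(float(counts[y][x]) if inside else 0.0)  # zero padding
    total = sum(patch)
    return [v / total for v in patch] if total else patch


def project(patch, mean, basis):
    """Project a centred patch onto the rows of a precomputed basis."""
    centred = [v - m for v, m in zip(patch, mean)]
    return [sum(c * b for c, b in zip(centred, row)) for row in basis]


# Made-up events on a 4x4 sensor, given as (x, y) pairs.
counts = count_map([(1, 1), (1, 1), (2, 1), (1, 2)], width=4, height=4)
patch = normalized_patch(counts, cx=1, cy=1, radius=1)

# Placeholder mean and basis: a real system would learn these with PCA.
mean = [0.0] * 9
basis = [[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]]
feature = project(patch, mean, basis)
print(feature)  # [0.5]
```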
A sub-mW IoT-endnode for always-on visual monitoring and smart triggering
This work presents a fully-programmable Internet of Things (IoT) visual
sensing node that targets sub-mW power consumption in always-on monitoring
scenarios. The system features a spatial-contrast binary
pixel imager with focal-plane processing. The sensor, when working at its
lowest power mode ( at 10 fps), provides as output the number of
changed pixels. Based on this information, a dedicated camera interface,
implemented on a low-power FPGA, wakes up an ultra-low-power parallel
processing unit to extract context-aware visual information. We evaluate the
smart sensor on three always-on visual triggering application scenarios.
Triggering accuracy comparable to RGB image sensors is achieved at nominal
lighting conditions, while consuming an average power between and
, depending on context activity. The digital sub-system is extremely
flexible, thanks to a fully-programmable digital signal processing engine, but
still achieves 19x lower power consumption compared to MCU-based cameras with
significantly lower on-board computing capabilities.
Comment: 11 pages, 9 figures, submitted to IEEE IoT Journa
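
The triggering idea described above, where the count of changed pixels decides whether to wake the processing unit, might look like the following in Python. The fixed threshold rule and the function names are assumptions for this sketch; the abstract does not specify how the camera interface makes the decision.

```python
def count_changed(prev, curr):
    """Number of pixels whose binary value differs between two frames."""
    return sum(p != c
               for prev_row, curr_row in zip(prev, curr)
               for p, c in zip(prev_row, curr_row))


def should_wake(prev, curr, threshold):
    """Wake the processor only when enough pixels changed (assumed rule)."""
    return count_changed(prev, curr) >= threshold


# Two made-up 2x3 binary frames.
prev = [[0, 0, 1],
        [1, 0, 0]]
curr = [[0, 1, 1],
        [0, 0, 0]]
print(count_changed(prev, curr))   # 2
print(should_wake(prev, curr, 2))  # True
print(should_wake(prev, curr, 3))  # False
```

Keeping the always-on path down to a single integer comparison is what lets the heavier processing stay asleep while the scene is static.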
Neuromorphic-P2M: Processing-in-Pixel-in-Memory Paradigm for Neuromorphic Image Sensors
Edge devices equipped with computer vision must deal with vast amounts of
sensory data with limited computing resources. Hence, researchers have been
exploring different energy-efficient solutions such as near-sensor processing,
in-sensor processing, and in-pixel processing, bringing the computation closer
to the sensor. In particular, in-pixel processing embeds the computation
capabilities inside the pixel array and achieves high energy efficiency by
generating low-level features instead of the raw data stream from CMOS image
sensors. Many different in-pixel processing techniques and approaches have been
demonstrated on conventional frame-based CMOS imagers; however, the
processing-in-pixel approach for neuromorphic vision sensors has not been
explored so far. In this work, we propose, for the first time, an asynchronous
non-von-Neumann analog processing-in-pixel paradigm to perform convolution
operations by integrating in-situ multi-bit multi-channel convolution inside
the pixel array performing analog multiply and accumulate (MAC) operations that
consume significantly less energy than their digital MAC alternative. To make
this approach viable, we incorporate the circuit's non-ideality, leakage, and
process variations into a novel hardware-algorithm co-design framework that
leverages extensive HSpice simulations of our proposed circuit using the GF22nm
FD-SOI technology node. We verified our framework on state-of-the-art
neuromorphic vision sensor datasets and show that, on the IBM DVS128-Gesture
dataset, our solution consumes ~2x lower backend-processor energy than the
state-of-the-art at nearly the same front-end (sensor) energy, while
maintaining a high test accuracy of 88.36%.
Comment: 17 pages, 11 figures, 2 table
NimbleAI: towards neuromorphic sensing-processing 3D-integrated chips
The NimbleAI Horizon Europe project leverages key principles of energy-efficient visual sensing and processing in biological eyes and brains, and harnesses the latest advances in 3D stacked silicon integration, to create an integral sensing-processing neuromorphic architecture that efficiently and accurately runs computer vision algorithms in area-constrained endpoint chips. The rationale behind the NimbleAI architecture is: sense data only with high information value and discard data as soon as they are found not to be useful for the application (in a given context). The NimbleAI sensing-processing architecture is to be specialized after deployment by tuning system-level trade-offs for each particular computer vision algorithm and deployment environment. The objectives of NimbleAI are: (1) 100x performance per mW gains compared to state-of-the-practice solutions (i.e., CPU/GPUs processing frame-based video); (2) 50x processing latency reduction compared to CPU/GPUs; (3) energy consumption in the order of tens of mWs; and (4) silicon area of approx. 50 mm².
NimbleAI has received funding from the EU’s Horizon Europe Research and Innovation programme (Grant Agreement 101070679), and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee (Grant Agreement 10039070).
Peer Reviewed
Article signed by 49 authors: Xabier Iturbe, IKERLAN, Basque Country (Spain); Nassim Abderrahmane, MENTA, France; Jaume Abella, Barcelona Supercomputing Center (BSC), Catalonia, Spain; Sergi Alcaide, Barcelona Supercomputing Center (BSC), Catalonia, Spain; Eric Beyne, IMEC, Belgium; Henri-Pierre Charles, CEA-LIST, University Grenoble Alpes, France; Christelle Charpin-Nicolle, CEALETI, Univ. Grenoble Alpes, France; Lars Chittka, Queen Mary University of London, UK; Angélica Dávila, IKERLAN, Basque Country (Spain); Arne Erdmann, Raytrix, Germany; Carles Estrada, IKERLAN, Basque Country (Spain); Ander Fernández, IKERLAN, Basque Country (Spain); Anna Fontanelli, Monozukuri (MZ Technologies), Italy; José Flich, Universitat Politecnica de Valencia, Spain; Gianluca Furano, ESA ESTEC, Netherlands; Alejandro Hernán Gloriani, Viewpointsystem, Austria; Erik Isusquiza, ULMA Medical Technologies, Basque Country (Spain); Radu Grosu, TU Wien, Austria; Carles Hernández, Universitat Politecnica de Valencia, Spain; Daniele Ielmini, Politecnico Milano, Italy; David Jackson, University of Manchester, UK; Maha Kooli, CEA-LIST, University Grenoble Alpes, France; Nicola Lepri, Politecnico Milano, Italy; Bernabé Linares-Barranco, CSIC, Spain; Jean-Loup Lachese, MENTA, France; Eric Laurent, MENTA, France; Menno Lindwer, GrAI Matter Labs (GML), Netherlands; Frank Linsenmaier, Viewpointsystem, Austria; Mikel Luján, University of Manchester, UK; Karel Masařík, CODASIP, Czech Republic; Nele Mentens, Universiteit Leiden, Netherlands; Orlando Moreira, GrAI Matter Labs (GML), Netherlands; Chinmay Nawghane, IMEC, Belgium; Luca Peres, University of Manchester, UK; Jean-Philippe Noel, CEA-LIST, University Grenoble Alpes, France; Arash Pourtaherian, GrAI Matter Labs (GML), Netherlands; Christoph Posch, PROPHESEE, France; Peter Priller, AVL List, Austria; Zdenek Prikryl, CODASIP, Czech Republic; Felix Resch, TU Wien, Austria; Oliver Rhodes, University of Manchester, UK; Todor Stefanov, Universiteit Leiden, Netherlands; Moritz Storring, IMEC, Belgium; Michele Taliercio, Monozukuri (MZ Technologies), Italy; Rafael Tornero, Universitat Politecnica de Valencia, Spain; Marcel van de Burgwal, IMEC, Belgium; Geert van der Plas, IMEC, Belgium; Elisa Vianello, CEALETI, Univ. Grenoble Alpes, France; Pavel Zaykov, CODASIP, Czech Republic
Postprint (author's final draft