
    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
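
    As a rough illustration of the event generation model summarised in this abstract, the sketch below converts a short stack of intensity frames into (t, x, y, polarity) events using a log-brightness contrast threshold; the threshold value and the NumPy frame source are illustrative assumptions, not part of the survey.

```python
import numpy as np

def frames_to_events(frames, timestamps, contrast_threshold=0.2):
    """Simulate an event camera from a stack of intensity frames.

    Each event is (t, x, y, polarity): an event fires whenever the
    log-brightness at a pixel has changed by more than the contrast
    threshold since the last event at that pixel. The threshold of 0.2
    is an illustrative assumption.
    """
    eps = 1e-6
    log_ref = np.log(frames[0].astype(np.float64) + eps)  # per-pixel reference
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_now = np.log(frame.astype(np.float64) + eps)
        diff = log_now - log_ref
        fired = np.abs(diff) >= contrast_threshold
        ys, xs = np.nonzero(fired)
        for x, y in zip(xs, ys):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, x, y, polarity))
        # Reset the reference only at pixels that fired an event.
        log_ref[fired] = log_now[fired]
    return events

# Example: three small random frames produce a handful of ON/OFF events.
rng = np.random.default_rng(0)
frames = rng.integers(50, 200, size=(3, 4, 4))
print(frames_to_events(frames, timestamps=[0.0, 1e-3, 2e-3]))
```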

    DART: Distribution Aware Retinal Transform for Event-based Cameras

    We introduce a generic visual descriptor, termed the distribution aware retinal transform (DART), that encodes the structural context using log-polar grids for event cameras. The DART descriptor is applied to four different problems, namely object classification, tracking, detection and feature matching: (1) The DART features are directly employed as local descriptors in a bag-of-features classification framework, and testing is carried out on four standard event-based object datasets (N-MNIST, MNIST-DVS, CIFAR10-DVS, N-Caltech101). (2) Extending the classification system, tracking is demonstrated using two key novelties: (i) to overcome the low-sample problem in the one-shot learning of a binary classifier, statistical bootstrapping is leveraged with online learning; (ii) to achieve tracker robustness, the scale and rotation equivariance property of the DART descriptors is exploited for the one-shot learning. (3) To solve the long-term object tracking problem, an object detector is designed using the principle of cluster majority voting. The detection scheme is then combined with the tracker to obtain a high intersection-over-union score against augmented ground truth annotations on the publicly available event camera dataset. (4) Finally, the event context encoded by DART greatly simplifies the feature correspondence problem, especially for spatio-temporal slices far apart in time, which has not been explicitly tackled in the event-based vision domain. Comment: 12 pages, revision submitted to TPAMI in Nov 201
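
    The core idea of encoding structural context on a log-polar grid can be sketched as follows: neighbouring event locations are binned into log-spaced rings and uniform angular wedges around the event being described. The ring and wedge counts, radii, and normalisation below are illustrative assumptions, not the authors' exact DART parameters.

```python
import numpy as np

def log_polar_descriptor(events_xy, center, n_rings=8, n_wedges=16,
                         r_min=1.0, r_max=32.0):
    """Histogram of neighbouring event locations on a log-polar grid.

    events_xy: (N, 2) array of event coordinates.
    center:    (2,) coordinate of the event being described.
    Returns a flattened, L1-normalised n_rings x n_wedges histogram.
    All parameter values are illustrative assumptions.
    """
    d = np.asarray(events_xy, dtype=np.float64) - np.asarray(center, dtype=np.float64)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])  # angle in (-pi, pi]

    # Keep only neighbours inside the annulus [r_min, r_max).
    keep = (r >= r_min) & (r < r_max)
    r, theta = r[keep], theta[keep]

    # Log-spaced radial bins, uniform angular bins.
    ring = np.floor(n_rings * np.log(r / r_min) / np.log(r_max / r_min)).astype(int)
    wedge = np.floor((theta + np.pi) / (2 * np.pi) * n_wedges).astype(int) % n_wedges

    hist = np.zeros((n_rings, n_wedges))
    np.add.at(hist, (ring, wedge), 1.0)
    return hist.ravel() / max(hist.sum(), 1.0)
```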

    PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras

    We present the first purely event-based, energy-efficient approach for object detection and categorization using an event camera. Compared to traditional frame-based cameras, event cameras offer attractive properties such as high temporal resolution (on the order of microseconds), low power consumption (a few hundred mW) and wide dynamic range (120 dB). However, event-based object recognition systems are far behind their frame-based counterparts in terms of accuracy. To this end, this paper presents an event-based feature extraction method devised by accumulating local activity across the image frame and then applying principal component analysis (PCA) to the normalized neighborhood region. Subsequently, we propose a backtracking-free k-d tree mechanism for efficient feature matching that takes advantage of the low dimensionality of the feature representation. Additionally, the proposed k-d tree mechanism allows for feature selection to obtain a lower-dimensional dictionary representation when hardware resources are too limited to implement dimensionality reduction. Consequently, the proposed system can be realized on a field-programmable gate array (FPGA) device, leading to a high performance-over-resource ratio. The proposed system is tested on real-world event-based datasets for object categorization, showing superior classification performance and relevance to state-of-the-art algorithms. Additionally, we verified the object detection method and real-time FPGA performance in lab settings under non-controlled illumination conditions with limited training data and ground truth annotations. Comment: Accepted in ACCV 2018 Workshops, to appear
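
    A minimal sketch of the accumulate-then-PCA feature extraction step is given below; the patch size, component count, and use of scikit-learn's PCA are illustrative assumptions, and the backtracking-free k-d tree matching stage is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_event_features(events_xy, sensor_hw=(128, 128), patch=7, n_components=10):
    """Accumulate event counts into a 2D surface, then describe each active
    pixel by PCA of its normalised local neighbourhood (illustrative only).

    events_xy: (N, 2) integer pixel coordinates (x, y) of events.
    Returns (features, locations) for every active pixel away from the border.
    """
    H, W = sensor_hw
    counts = np.zeros((H, W))
    np.add.at(counts, (events_xy[:, 1], events_xy[:, 0]), 1.0)

    half = patch // 2
    patches, locations = [], []
    for y in range(half, H - half):
        for x in range(half, W - half):
            if counts[y, x] == 0:
                continue
            p = counts[y - half:y + half + 1, x - half:x + half + 1].ravel()
            patches.append(p / np.linalg.norm(p))  # normalise local activity
            locations.append((x, y))

    patches = np.asarray(patches)
    feats = PCA(n_components=n_components).fit_transform(patches)
    return feats, locations
```

    The resulting low-dimensional features could then be matched against a learned dictionary with a k-d tree; that matching step and the FPGA mapping are beyond this sketch.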

    A sub-mW IoT-endnode for always-on visual monitoring and smart triggering

    This work presents a fully-programmable Internet of Things (IoT) visual sensing node that targets sub-mW power consumption in always-on monitoring scenarios. The system features a spatial-contrast 128x64 binary pixel imager with focal-plane processing. The sensor, when working in its lowest power mode (10 µW at 10 fps), provides as output the number of changed pixels. Based on this information, a dedicated camera interface, implemented on a low-power FPGA, wakes up an ultra-low-power parallel processing unit to extract context-aware visual information. We evaluate the smart sensor on three always-on visual triggering application scenarios. Triggering accuracy comparable to RGB image sensors is achieved under nominal lighting conditions, while consuming an average power between 193 µW and 277 µW, depending on context activity. The digital sub-system is extremely flexible, thanks to a fully-programmable digital signal processing engine, yet still achieves 19x lower power consumption compared to MCU-based cameras with significantly lower on-board computing capabilities. Comment: 11 pages, 9 figures, submitted to IEEE IoT Journal
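
    The smart-triggering idea (wake the processing unit only when the imager reports enough changed pixels) can be summarised as a small hysteresis loop; the function names and thresholds below are illustrative assumptions, not the paper's firmware.

```python
def smart_trigger(read_changed_pixel_count, wake_processor, process_frame,
                  wake_threshold=200, sleep_threshold=50):
    """Hysteresis trigger sketch: wake the parallel processor when the
    imager's changed-pixel count is high, let it sleep again when the
    scene becomes quiet. All names and thresholds are illustrative.
    Runs forever; exit conditions are omitted in this sketch.
    """
    awake = False
    while True:
        n_changed = read_changed_pixel_count()   # focal-plane output: one integer per frame
        if not awake and n_changed >= wake_threshold:
            wake_processor(True)                 # power up the processing cluster
            awake = True
        elif awake and n_changed < sleep_threshold:
            wake_processor(False)                # return to the sensing-only low-power mode
            awake = False
        if awake:
            process_frame()                      # extract context-aware visual information
```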

    Neuromorphic-P2M: Processing-in-Pixel-in-Memory Paradigm for Neuromorphic Image Sensors

    Edge devices equipped with computer vision must deal with vast amounts of sensory data with limited computing resources. Hence, researchers have been exploring different energy-efficient solutions such as near-sensor processing, in-sensor processing, and in-pixel processing, bringing the computation closer to the sensor. In particular, in-pixel processing embeds the computation capabilities inside the pixel array and achieves high energy efficiency by generating low-level features instead of the raw data stream from CMOS image sensors. Many different in-pixel processing techniques and approaches have been demonstrated on conventional frame-based CMOS imagers; however, the processing-in-pixel approach for neuromorphic vision sensors has not been explored so far. In this work, we propose, for the first time, an asynchronous non-von-Neumann analog processing-in-pixel paradigm that performs convolution operations by integrating in-situ multi-bit multi-channel convolution inside the pixel array, carrying out analog multiply-and-accumulate (MAC) operations that consume significantly less energy than their digital MAC alternatives. To make this approach viable, we incorporate the circuit's non-idealities, leakage, and process variations into a novel hardware-algorithm co-design framework that leverages extensive HSpice simulations of our proposed circuit using the GF22nm FD-SOI technology node. We verify our framework on state-of-the-art neuromorphic vision sensor datasets and show that, on the IBM DVS128-Gesture dataset, our solution consumes ~2x lower backend-processor energy than the state-of-the-art while maintaining almost the same front-end (sensor) energy and a high test accuracy of 88.36%. Comment: 17 pages, 11 figures, 2 tables
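
    A toy model of the idea, a multi-channel convolution whose analog MACs suffer gain variation and leakage-like offsets, is sketched below; the variation model and its magnitudes are illustrative assumptions, whereas the paper characterises its circuits with HSpice in the GF22nm FD-SOI node.

```python
import numpy as np

def nonideal_in_pixel_conv(event_frame, kernels, gain_sigma=0.05,
                           offset_sigma=0.01, rng=None):
    """Toy model of analog in-pixel multi-channel convolution.

    Each analog MAC is assumed to suffer a fixed multiplicative gain error
    (process variation) and an additive offset (leakage); both magnitudes
    are illustrative, not the paper's HSpice-characterised values.

    event_frame: (H, W) accumulated event counts or binary spikes.
    kernels:     (C, k, k) convolution weights for C output channels.
    Returns:     (C, H-k+1, W-k+1) feature maps.
    """
    rng = np.random.default_rng() if rng is None else rng
    C, k, _ = kernels.shape
    H, W = event_frame.shape
    out = np.zeros((C, H - k + 1, W - k + 1))
    # One gain/offset sample per output location models pixel-to-pixel variation.
    gain = 1.0 + gain_sigma * rng.standard_normal(out.shape)
    offset = offset_sigma * rng.standard_normal(out.shape)
    for c in range(C):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                patch = event_frame[y:y + k, x:x + k]
                out[c, y, x] = np.sum(patch * kernels[c])   # ideal MAC
    return gain * out + offset                              # analog non-idealities
```

    Training the downstream classifier with such a perturbed first layer is one simple way to make the network tolerant to the analog front end; the actual co-design framework in the paper is more detailed.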

    NimbleAI: towards neuromorphic sensing-processing 3D-integrated chips

    The NimbleAI Horizon Europe project leverages key principles of energy-efficient visual sensing and processing in biological eyes and brains, and harnesses the latest advances in 3D stacked silicon integration, to create an integral sensing-processing neuromorphic architecture that efficiently and accurately runs computer vision algorithms in area-constrained endpoint chips. The rationale behind the NimbleAI architecture is to sense only data with high information value and to discard data as soon as they are found not to be useful for the application in a given context. The NimbleAI sensing-processing architecture is to be specialized after deployment by tuning system-level trade-offs for each particular computer vision algorithm and deployment environment. The objectives of NimbleAI are: (1) 100x performance-per-mW gains compared to state-of-the-practice solutions (i.e., CPUs/GPUs processing frame-based video); (2) 50x processing latency reduction compared to CPUs/GPUs; (3) energy consumption on the order of tens of mW; and (4) silicon area of approx. 50 mm². NimbleAI has received funding from the EU’s Horizon Europe Research and Innovation programme (Grant Agreement 101070679) and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee (Grant Agreement 10039070).