
    Applications of Silicon Retinas: from Neuroscience to Computer Vision

    Traditional visual sensor technology is firmly rooted in the concept of sequences of image frames. The sequence of stroboscopic images in these "frame cameras" is very different from the information running from the retina to the visual cortex. While conventional cameras have improved in the direction of smaller pixels and higher frame rates, the basics of image acquisition have remained the same. Event-based vision sensors were originally known as "silicon retinas" but are now widely called "event cameras." They are a new type of vision sensor that takes inspiration from the mechanisms nature developed for the mammalian retina and suggests a different way of perceiving the world. As in the neural system, the sensed information is encoded in a train of spikes, or so-called events, comparable to the action potentials generated in nerves. Event-based sensors produce sparse and asynchronous output that represents informative changes in the scene. These sensors have advantages in terms of fast response, low latency, high dynamic range, and sparse output. All these characteristics are appealing for computer vision and robotic applications, increasing the interest in this kind of sensor. However, since the sensor's output is very different, algorithms designed for frames need to be rethought and re-adapted.
    This thesis focuses on several applications of event cameras in scientific scenarios. It aims to identify where they can make a difference compared to frame cameras. The presented applications use the Dynamic Vision Sensor (DVS), the event camera developed by the Sensors Group of the Institute of Neuroinformatics, University of Zurich and ETH Zurich. To explore some applications in more extreme situations, the first chapters of the thesis focus on the characterization of several advanced versions of the standard DVS. Low light is a challenging condition for every vision sensor. Taking inspiration from standard Complementary Metal Oxide Semiconductor (CMOS) technology, DVS pixel performance in low light can be improved by back-side illumination, which increases sensitivity and quantum efficiency. This thesis characterizes the so-called Back Side Illuminated DAVIS (BSI DAVIS) camera and shows results from its application in calcium imaging of neural activity. The BSI DAVIS shows better performance in low light thanks to its high Quantum Efficiency (QE) of 93% and proved to be the best-suited technology for microscopy applications. The BSI DAVIS allows detecting fast dynamic changes in neural fluorescence imaging using the green fluorescent calcium indicator GCaMP6f.
    Advances in event cameras have pushed their exploration in computer vision tasks. Two chapters of this thesis focus on two of the most active research areas in computer vision: human pose estimation and hand gesture classification. Both chapters report the datasets collected for these tasks, fulfilling the continuous need for data for this new kind of technology. The Dynamic Vision Sensor Human Pose dataset (DHP19) is an extensive collection of 33 whole-body human actions from 17 subjects. The chapter presents the first benchmark neural network model for 3D pose estimation using DHP19. The network achieves a mean error of less than 8 cm in 3D space, which is comparable with frame-based Human Pose Estimation (HPE) methods. The gesture classification chapter reports an application running on a mobile device and explores future developments in the direction of embedded, portable, low-power devices for online processing. The sparse output from the sensor suggests using a small model with a reduced number of parameters and low power consumption. The thesis also describes pilot results from two other scientific imaging applications, raindrop size measurement and laser speckle analysis, presented in the appendices.
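
    As a concrete illustration of the event encoding described above, the sketch below models the sensor output as a stream of (x, y, timestamp, polarity) tuples and collapses a slice of the stream into a 2D histogram, a common first step before reusing frame-based algorithms. The class and function names are illustrative, not part of any DVS driver API; the 346×260 resolution matches the DAVIS346 mentioned in these abstracts.

```python
# Minimal sketch of the address-event representation (AER): each DVS event
# carries pixel coordinates, a microsecond timestamp, and a polarity bit.
# Names here are hypothetical, not from any real DVS SDK.
from dataclasses import dataclass
import numpy as np

@dataclass
class Event:
    x: int          # pixel column
    y: int          # pixel row
    t: int          # timestamp in microseconds
    polarity: int   # +1 brightness increase (ON), -1 decrease (OFF)

def accumulate_frame(events, width=346, height=260):
    """Collapse a slice of the asynchronous event stream into a signed
    2D histogram so that frame-based algorithms can consume it."""
    frame = np.zeros((height, width), dtype=np.int32)
    for ev in events:
        frame[ev.y, ev.x] += ev.polarity
    return frame
```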

    Microfabrication and chronic in-vivo study of an intrafascicular electrode for the Peripheral Nervous System

    In the present work, we have microfabricated a neural interface for the Peripheral Nervous System (PNS). This new interface, SELINE, presents three-dimensional structures that reduce the movement of the electrode inside the nerve. The electrode was fabricated with microfabrication techniques, characterized in vitro, and tested in vivo in a chronic study.

    Sensor fusion using EMG and vision for hand gesture classification in mobile applications

    The discrimination of human gestures using wearable solutions is extremely important as a supporting technique for assisted living, healthcare of the elderly, and neurorehabilitation. This paper presents a mobile electromyography (EMG) analysis framework to serve as an auxiliary component in physiotherapy sessions or as feedback for neuroprosthesis calibration. We implemented a framework that integrates multiple sensors, EMG and vision, to perform sensor fusion and improve the accuracy of hand gesture recognition tasks. In particular, we used an event-based camera adapted to run on the limited computational resources of mobile phones. We introduce a new publicly available sensor fusion dataset for hand gesture recognition, recorded from 10 subjects, and used it to train the recognition models offline. Comparing the online hand gesture recognition results, the fusion approach improves accuracy by 13% and 11% over EMG and vision alone, respectively, reaching 85%.
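
    The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one common choice: late fusion by weighted averaging of the per-class probabilities produced independently by the EMG and vision classifiers. The function name and the weight value are assumptions.

```python
# Hypothetical late-fusion step: average the class-probability vectors of
# the two modality-specific classifiers and pick the most likely gesture.
import numpy as np

def fuse_predictions(p_emg, p_vision, w_emg=0.5):
    """p_emg, p_vision: per-class probability vectors from each modality."""
    p_fused = w_emg * np.asarray(p_emg) + (1.0 - w_emg) * np.asarray(p_vision)
    return int(np.argmax(p_fused))

# Each sensor alone is uncertain here, but fusion disambiguates (class 1).
print(fuse_predictions([0.40, 0.35, 0.25], [0.30, 0.50, 0.20]))
```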

    Live Demonstration: Sensor fusion using EMG and vision for hand gesture classification in mobile applications

    The discrimination of human gestures using wearable solutions is extremely important as a supporting technique for assisted living, healthcare of the elderly, and neurorehabilitation. This paper presents a mobile electromyography (EMG) analysis framework to serve as an auxiliary component in physiotherapy sessions or as feedback for neuroprosthesis calibration. We implemented a framework that integrates multiple sensors, EMG and vision, to perform sensor fusion and improve the accuracy of hand gesture recognition tasks. In particular, we used an event-based camera adapted to run on the limited computational resources of mobile phones. We introduce a new publicly available sensor fusion dataset for hand gesture recognition, recorded from 10 subjects, and used it to train the recognition models offline. Comparing the online hand gesture recognition results, the fusion approach improves accuracy by 13% and 11% over EMG and vision alone, respectively, reaching 85%.

    Live Demonstration: Front and Back Illuminated Dynamic and Active Pixel Vision Sensor Comparison

    The demonstration shows the differences between two novel Dynamic and Active Pixel Vision Sensors (DAVIS). While both sensors are based on the same circuits and have the same resolution (346×260), they differ in their manufacturing: the first is a DAVIS with standard Front Side Illuminated (FSI) technology, and the second is the first Back Side Illuminated (BSI) DAVIS sensor.

    Front and Back Illuminated Dynamic and Active Pixel Vision Sensor Comparison

    Back side illumination has become standard image sensor technology owing to its superior quantum efficiency and fill factor. A direct comparison of front and back side illumination (FSI and BSI) in event-based dynamic and active pixel vision sensors (DAVIS) is interesting because of the potential of BSI to greatly increase the small 20% fill factor of these complex pixels. This brief compares identically designed front and back illuminated DAVIS silicon retina vision sensors in terms of quantum efficiency (QE), leak activity, and modulation transfer function (MTF). The BSI DAVIS achieves a peak QE of 93%, compared with the FSI DAVIS peak QE of 24%, but a reduced MTF due to pixel crosstalk and parasitic photocurrent. Significant "leak events" in the BSI DAVIS limit its use to controlled illumination scenarios without very bright light sources. Effects of parasitic photocurrent and modulation transfer functions with and without IR cut filters are also reported.
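
    As a back-of-the-envelope illustration of what the QE gap means, the snippet below converts the two peak QE figures from the abstract into relative signal levels for the same photon flux; the photon count itself is invented purely for illustration.

```python
# Relative signal for identical illumination, using the peak QE values
# reported in the paper (93% BSI vs 24% FSI). Photon count is made up.
photons_per_pixel = 10_000                # incident photons in one interval
qe_bsi, qe_fsi = 0.93, 0.24               # peak quantum efficiencies

electrons_bsi = qe_bsi * photons_per_pixel
electrons_fsi = qe_fsi * photons_per_pixel
print(electrons_bsi / electrons_fsi)      # ~3.9x more signal for the BSI pixel
```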

    Hand-Gesture Recognition Based on EMG and Event-Based Camera Sensor Fusion: A Benchmark in Neuromorphic Computing

    Hand gestures are a form of non-verbal communication used by individuals in conjunction with speech to communicate. Nowadays, with the increasing use of technology, hand-gesture recognition is considered an important aspect of Human-Machine Interaction (HMI), allowing the machine to capture and interpret the user's intent and respond accordingly. The ability to discriminate between human gestures can help in several applications, such as assisted living, healthcare, neuro-rehabilitation, and sports. Recently, multi-sensor data fusion mechanisms have been investigated to improve discrimination accuracy. In this paper, we present a sensor fusion framework that integrates complementary systems: the electromyography (EMG) signal from muscles and visual information. This multi-sensor approach, while improving accuracy and robustness, introduces the disadvantage of high computational cost, which grows exponentially with the number of sensors and measurements. Furthermore, this huge amount of data to process can increase classification latency, which can be crucial in real-world scenarios such as prosthetic control. Neuromorphic technologies can be deployed to overcome these limitations since they allow real-time parallel processing at low power consumption. In this paper, we present a fully neuromorphic sensor fusion approach for hand-gesture recognition comprising an event-based vision sensor and three different neuromorphic processors. In particular, we used the DVS event-based camera and two neuromorphic platforms, Loihi and ODIN + MorphIC. The EMG signals were recorded using traditional electrodes and then converted into spikes to be fed into the chips. We collected a dataset of five sign-language gestures with synchronized visual and electromyography signals. We compared the fully neuromorphic approach to a baseline implemented using traditional machine learning approaches on a portable GPU system. According to the chips' constraints, we designed specific spiking neural networks (SNNs) for sensor fusion that showed classification accuracy comparable to the software baseline. These neuromorphic alternatives increase inference time by 20% to 40% with respect to the GPU system but have a significantly smaller energy-delay product (EDP), which makes them between 30× and 600× more efficient. The proposed work represents a new benchmark that moves neuromorphic computing toward a real-world scenario.
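
    To make the efficiency claim concrete, the sketch below shows how the energy-delay product behaves: a chip can be slower than the GPU baseline yet far more efficient once energy is factored in. The absolute energy and latency numbers are invented for illustration; only the 20% to 40% latency increase and the 30× to 600× EDP range come from the abstract.

```python
# Energy-delay product: EDP = energy per inference * inference latency.
# The absolute numbers below are hypothetical; they are chosen only to
# reproduce the qualitative behavior described in the abstract.
def edp(energy_j, latency_s):
    return energy_j * latency_s

gpu_edp  = edp(energy_j=1.000, latency_s=0.010)   # hypothetical GPU baseline
chip_edp = edp(energy_j=0.025, latency_s=0.013)   # 30% slower, 40x less energy
print(gpu_edp / chip_edp)                         # ~31x smaller EDP
```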

    DHP19: Dynamic Vision Sensor 3D Human Pose Dataset

    Human pose estimation has dramatically improved thanks to continuous developments in deep learning. However, marker-free human pose estimation based on standard frame-based cameras is still slow and power-hungry for real-time feedback interaction because of the huge number of operations required for large Convolutional Neural Network (CNN) inference. Event-based cameras such as the Dynamic Vision Sensor (DVS) quickly output sparse moving-edge information. Their sparse and rapid output is ideal for driving low-latency CNNs, thus potentially allowing real-time interaction for human pose estimators. Although the application of CNNs to standard frame-based cameras for human pose estimation is well established, their application to event-based cameras is still under study. This paper proposes a novel benchmark dataset of human body movements, the Dynamic Vision Sensor Human Pose dataset (DHP19). It consists of recordings from 4 synchronized 346×260 pixel DVS cameras of a set of 33 movements performed by 17 subjects. DHP19 also includes a baseline model that achieves an average 3D pose estimation error of about 8 cm, despite the sparse and reduced input data from the DVS.
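
    Since DHP19 records each movement with four calibrated cameras, one generic way to lift per-camera 2D joint predictions to 3D is linear (DLT) triangulation, sketched below. This is a standard multi-view technique and an assumption here, not necessarily the exact pipeline of the DHP19 baseline model; the projection matrices are taken as known from calibration.

```python
# Generic linear (DLT) triangulation of one joint seen by several cameras.
import numpy as np

def triangulate(points_2d, proj_mats):
    """points_2d: list of (u, v) pixel predictions, one per camera.
    proj_mats: list of 3x4 camera projection matrices from calibration."""
    rows = []
    for (u, v), P in zip(points_2d, proj_mats):
        rows.append(u * P[2] - P[0])   # each view contributes two
        rows.append(v * P[2] - P[1])   # linear constraints on X
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                         # null-space solution, homogeneous
    return X[:3] / X[3]                # Euclidean 3D joint position
```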

    In-vivo imaging of neural activity with dynamic vision sensors

    Optical recording of neural activity using calcium or voltage indicators requires cameras capable of detecting small temporal contrast in light intensity at sample rates of 10 Hz to 1 kHz. Large-pixel scientific CMOS (sCMOS) image sensors are typically used due to their high resolution, high frame rate, and low noise. However, using such sensors for long-term recording is challenging due to their high data rates of up to 1 Gb/s. Here we studied the use of dynamic vision sensor (DVS) event cameras for neural recording. The DVS has a high dynamic range and a sparse, asynchronous output consisting of brightness change events. Using a DVS for neural recording could avoid transferring and storing redundant information. We compared a Hamamatsu Orca V2 sCMOS with two advanced DVS sensors (the 188×180 pixel SDAVIS, with higher temporal contrast sensitivity, and the 346×260 pixel back-side-illuminated BSIDAVIS, with higher light sensitivity) for neural activity recordings with fluorescent calcium indicators in both brain slices and awake mice. The DVS activity responds to the fast dynamics of neural activity, indicating that a sensor combining SDAVIS and BSIDAVIS technologies would be beneficial for long-term in-vivo neural recording using calcium indicators as well as potentially faster voltage indicators.
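
    The recordings rely on the DVS pixel's temporal-contrast response: a pixel emits an event whenever log intensity has changed by more than a threshold since its last event. The snippet below is an idealized software model of that mechanism applied to a fluorescence trace; the threshold value and the per-sample processing are simplifying assumptions, not the circuit's actual behavior.

```python
# Idealized DVS pixel: emit an ON/OFF event each time the log intensity
# moves one full threshold step away from the stored reference level.
import numpy as np

def dvs_events_from_trace(intensity, threshold=0.1):
    """intensity: 1D array of positive samples (e.g., GCaMP6f fluorescence).
    Returns a list of (sample index, polarity) events."""
    log_i = np.log(np.asarray(intensity, dtype=float))
    ref = log_i[0]                      # reference level of the pixel
    events = []
    for k, v in enumerate(log_i):
        while v - ref >= threshold:     # brightness increased enough: ON
            ref += threshold
            events.append((k, +1))
        while ref - v >= threshold:     # brightness decreased enough: OFF
            ref -= threshold
            events.append((k, -1))
    return events
```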