4 research outputs found

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low-latency, high-speed, and high-dynamic-range applications. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
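
    The survey describes events as tuples encoding the time, pixel location, and sign (polarity) of each brightness change. As a minimal illustration of that event representation (not code from the paper), the sketch below defines such a tuple as a NumPy structured array and accumulates a batch of events into a per-pixel polarity histogram, one common way of bridging event data and frame-based algorithms; the field names, dtypes, and sensor size are assumptions made here for clarity.

        import numpy as np

        # One event: timestamp (seconds), pixel coordinates, and polarity (+1 / -1).
        # Field names and dtypes are illustrative assumptions, not the survey's notation.
        event_dtype = np.dtype([("t", np.float64),
                                ("x", np.uint16),
                                ("y", np.uint16),
                                ("p", np.int8)])

        def accumulate_events(events: np.ndarray, height: int, width: int) -> np.ndarray:
            """Sum event polarities per pixel to form a simple 2D event frame."""
            frame = np.zeros((height, width), dtype=np.int32)
            np.add.at(frame, (events["y"], events["x"]), events["p"])
            return frame

        # Example: three synthetic events on a hypothetical 4x4 sensor.
        events = np.array([(10e-6, 1, 2, 1),
                           (25e-6, 1, 2, 1),
                           (40e-6, 3, 0, -1)], dtype=event_dtype)
        print(accumulate_events(events, height=4, width=4))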

    Applications of Silicon Retinas: from Neuroscience to Computer Vision

    Traditional visual sensor technology is firmly rooted in the concept of sequences of image frames. The sequence of stroboscopic images produced by these "frame cameras" is very different from the information running from the retina to the visual cortex. While conventional cameras have improved in the direction of smaller pixels and higher frame rates, the basics of image acquisition have remained the same. Event-based vision sensors were originally known as "silicon retinas" but are now widely called "event cameras." They are a new type of vision sensor that takes inspiration from the mechanisms nature developed for the mammalian retina and suggests a different way of perceiving the world. As in the nervous system, the sensed information is encoded in a train of spikes, or so-called events, comparable to the action potentials generated in nerves. Event-based sensors produce sparse and asynchronous output that represents informative changes in the scene. These sensors have advantages in terms of fast response, low latency, high dynamic range, and sparse output. All these characteristics are appealing for computer vision and robotic applications, increasing the interest in this kind of sensor. However, since the sensor's output is very different, algorithms designed for frames need to be rethought and re-adapted. This thesis focuses on several applications of event cameras in scientific scenarios and aims to identify where they can make a difference compared to frame cameras. The presented applications use the Dynamic Vision Sensor (the event camera developed by the Sensors Group of the Institute of Neuroinformatics, University of Zurich and ETH). To explore some applications in more extreme situations, the first chapters of the thesis focus on the characterization of several advanced versions of the standard DVS. Low light represents a challenging condition for every vision sensor. Taking inspiration from standard Complementary Metal Oxide Semiconductor (CMOS) technology, DVS pixel performance in low light can be improved by using back-side illumination, which increases sensitivity and quantum efficiency. This thesis characterizes the so-called Back Side Illumination DAVIS (BSI DAVIS) camera and shows results from its application to calcium imaging of neural activity. The BSI DAVIS has shown better performance in low-light scenes due to its high Quantum Efficiency (QE) of 93% and proved to be the best type of technology for microscopy applications. The BSI DAVIS allows detecting fast dynamic changes in neural fluorescence imaging using the green fluorescent calcium indicator GCaMP6f. Advances in event cameras have pushed their exploration in computer vision tasks. Chapters of this thesis focus on two of the most active research areas in computer vision: human pose estimation and hand gesture classification. Both chapters report the datasets collected for the task, addressing the continuous need for data for this kind of new technology. The Dynamic Vision Sensor Human Pose dataset (DHP19) is an extensive collection of 33 whole-body human actions from 17 subjects. The chapter presents the first benchmark neural network model for 3D pose estimation using DHP19. The network achieves a mean error of less than 8 mm in 3D space, which is comparable with frame-based Human Pose Estimation (HPE) methods.
The gesture classification chapter reports an application running on a mobile device and explores future developments toward embedded, portable, low-power devices for online processing. The sparse output from the sensor suggests using a small model with a reduced number of parameters and low power consumption. The thesis also describes pilot results from two other scientific imaging applications, raindrop size measurement and laser speckle analysis, presented in the appendices.
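
    The DHP19 benchmark described above feeds event data to a neural network for 3D pose estimation. As a hedged sketch of one generic pre-processing step for such pipelines (not the thesis' actual code), the snippet below splits an event stream into windows containing a fixed number of events and accumulates each window into a normalized count image that a frame-based network could consume; the field names, sensor size, and the 7,500-events-per-frame window are illustrative assumptions.

        import numpy as np

        def events_to_count_frames(events: np.ndarray, height: int, width: int,
                                   events_per_frame: int = 7500) -> np.ndarray:
            """Accumulate fixed-size windows of events into per-pixel count images.

            `events` is assumed to be a structured array with integer fields
            "x" and "y" holding pixel coordinates."""
            frames = []
            for start in range(0, len(events) - events_per_frame + 1, events_per_frame):
                window = events[start:start + events_per_frame]
                frame = np.zeros((height, width), dtype=np.float32)
                np.add.at(frame, (window["y"], window["x"]), 1.0)
                # Normalize to [0, 1] so the image can be fed to a standard CNN.
                peak = frame.max()
                frames.append(frame / peak if peak > 0 else frame)
            return (np.stack(frames) if frames
                    else np.empty((0, height, width), dtype=np.float32))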

    Extraction of spatio-temporal motion information with a spiking neural network: application to a ball trajectory prediction task

    Artificial intelligence covers the techniques and theories that enable an artificial system to act in a so-called intelligent way, that is, as a human or an animal would. This field of research has recently seen major advances that have allowed it to surpass many information processing methods. In particular, artificial neural networks (ANNs) have shown impressive performance and a strong ability to adapt to different use cases. These networks seek to mimic the behavior of neurons in the human brain and can learn from the data they are given to solve various problems. Interest in these methods has grown further in recent years with the arrival of deep neural networks (DNNs), notably in visual information processing with convolutional neural networks (CNNs). However, the quest for performance has led DNNs and CNNs away from their initial biological inspiration: these networks no longer resemble the functioning of the human brain, consume a lot of energy, and require large amounts of training data. A response to these limitations has been proposed, and it again comes from biomimicry, drawing on the way neurons communicate information. Spiking neural networks (SNNs) were thus developed. Considered the third generation of neural networks, they transmit information between neurons using spikes (action potentials), which are binary, temporally encoded events, in contrast to the analog, continuous values used by ANNs. The development of SNNs has been accompanied by the Spike-Timing-Dependent Plasticity (STDP) learning rule, which is bio-inspired, exploits the temporal information contained in the spikes, and is unsupervised. New sensors and neuromorphic chips have since emerged to take full advantage of SNNs and, more generally, of event-driven information processing. Recent studies have shown that SNNs can extract information from visual stimuli, whether from images or from event-based cameras.
The objective of this thesis is to use and adapt a new event-based camera developed by the company Yumain, located in Dijon, to analyze and predict ball trajectories in a sports context. The ability to predict is essential in sport in order to anticipate an opponent's behavior or the trajectory of a ball, and sports experts in particular show strong predictive abilities in their preferred sport. The objective is therefore to evaluate the capacity of our solution to predict a ball's trajectory and to compare its performance with that of participants who are experts or non-experts in ball sports requiring anticipation of the ball's trajectory. This manuscript first reviews the current state of the art in event-driven processing, including SNNs and asynchronous event-based cameras. The NeuroSoc camera is then presented, along with the SNN used during this thesis. We then show that this type of SNN can extract information from spikes generated by visual information, both in simulations and from real acquisitions, and that the extracted information makes it possible to predict the ball's trajectory, with this prediction being more accurate than those made by human participants. Finally, some limitations of the NeuroSoc camera used during this thesis are presented, along with proposals to improve it.
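
    The abstract above highlights Spike-Timing-Dependent Plasticity (STDP) as the bio-inspired, unsupervised rule that lets an SNN learn from the relative timing of spikes. As a minimal sketch of the classic pair-based form of that rule (the thesis' exact STDP variant and parameters are not given here, so the amplitudes and time constants below are generic textbook values), the function computes the weight change for a single pre/post spike pair.

        import numpy as np

        def stdp_delta_w(t_pre: float, t_post: float,
                         a_plus: float = 0.01, a_minus: float = 0.012,
                         tau_plus: float = 20e-3, tau_minus: float = 20e-3) -> float:
            """Pair-based STDP weight change for one pre/post spike pair.

            If the presynaptic spike precedes the postsynaptic spike (dt >= 0)
            the synapse is potentiated; otherwise it is depressed."""
            dt = t_post - t_pre
            if dt >= 0:
                return a_plus * float(np.exp(-dt / tau_plus))
            return -a_minus * float(np.exp(dt / tau_minus))

        # A pre-spike 5 ms before the post-spike strengthens the synapse,
        # while the reverse ordering weakens it.
        print(stdp_delta_w(t_pre=0.000, t_post=0.005))   # positive (potentiation)
        print(stdp_delta_w(t_pre=0.005, t_post=0.000))   # negative (depression)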