
    Multi-layered Spiking Neural Network with Target Timestamp Threshold Adaptation and STDP

    Spiking neural networks (SNNs) are good candidates for producing ultra-energy-efficient hardware. However, the performance of these models currently lags behind that of traditional methods. Introducing multi-layered SNNs is a promising way to reduce this gap. In this paper, we propose a new threshold adaptation system based on a target timestamp at which neurons should fire. We show that our method leads to state-of-the-art classification rates on the MNIST dataset (98.60%) and the Faces/Motorbikes dataset (99.46%) with an unsupervised SNN followed by a linear SVM. We also investigate the sparsity level of the network by testing different inhibition policies and STDP rules.
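As a rough illustration of the idea, the following sketch pairs a simplified additive STDP weight update with a threshold adaptation step driven by a target firing timestamp. All function names, constants, and the exact update rules are illustrative assumptions, not the paper's actual formulation.

```python
def adapt_threshold(threshold, t_fire, t_target, eta=0.1):
    """Nudge the firing threshold so the neuron's spike time drifts
    toward a target timestamp: firing too early (t_fire < t_target)
    raises the threshold; firing too late lowers it."""
    return threshold + eta * (t_target - t_fire)

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.01,
                w_min=0.0, w_max=1.0):
    """Simplified additive STDP: potentiate causal pre->post spike
    pairs, depress anti-causal ones, with hard weight bounds."""
    if t_pre <= t_post:
        w += a_plus * (w_max - w)
    else:
        w -= a_minus * (w - w_min)
    return min(max(w, w_min), w_max)
```

In this sketch, a neuron that keeps firing before its target timestamp sees its threshold rise until its spike time settles near the target, while STDP shapes the weights from relative spike timing.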

    Unsupervised Visual Feature Learning with Spike-timing-dependent Plasticity: How Far are we from Traditional Feature Learning Approaches?

    Spiking neural networks (SNNs) equipped with latency coding and spike-timing-dependent plasticity rules offer an alternative to solve the data and energy bottlenecks of standard computer vision approaches: they can learn visual features without supervision and can be implemented by ultra-low-power hardware architectures. However, their performance in image classification has never been evaluated on recent image datasets. In this paper, we compare SNNs to auto-encoders on three visual recognition datasets, and extend the use of SNNs to color images. The analysis of the results helps us identify some bottlenecks of SNNs: the limits of on-center/off-center coding, especially for color images, and the ineffectiveness of current inhibition mechanisms. These issues should be addressed to build effective SNNs for image recognition.
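On-center/off-center coding, one of the bottlenecks discussed above, is commonly implemented as a difference-of-Gaussians (DoG) filter whose responses are then converted into spike latencies (stronger response, earlier spike). A minimal sketch, with all parameter values chosen purely for illustration:

```python
import numpy as np

def dog_kernel(size=7, sigma_c=1.0, sigma_s=2.0):
    """On-center difference-of-Gaussians kernel: a narrow excitatory
    center minus a wider inhibitory surround (negate for off-center)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    d2 = xx**2 + yy**2
    center = np.exp(-d2 / (2 * sigma_c**2)) / (2 * np.pi * sigma_c**2)
    surround = np.exp(-d2 / (2 * sigma_s**2)) / (2 * np.pi * sigma_s**2)
    return center - surround

def latency_code(response, t_max=100.0):
    """Simple intensity-to-latency coding: the strongest responses
    fire earliest; zero or negative responses never fire (inf)."""
    r = np.clip(response, 0.0, None)
    return np.where(r > 0, t_max * (1.0 - r / (r.max() + 1e-9)), np.inf)
```

The kernel is positive at its center and negative toward the edges, so it responds to local contrast rather than absolute intensity, which is exactly what makes it a poor fit for carrying color information on its own.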

    S3TC: Spiking Separated Spatial and Temporal Convolutions with Unsupervised STDP-based Learning for Action Recognition

    Video analysis is a major computer vision task that has received a lot of attention in recent years. The current state-of-the-art performance for video analysis is achieved with Deep Neural Networks (DNNs), which have high computational costs and need large amounts of labeled data for training. Spiking Neural Networks (SNNs) have significantly lower computational costs (thousands of times) than regular non-spiking networks when implemented on neuromorphic hardware. They have been used for video analysis with methods like 3D Convolutional Spiking Neural Networks (3D CSNNs). However, these networks have a significantly larger number of parameters than spiking 2D CSNNs. This not only increases the computational costs but also makes these networks more difficult to implement on neuromorphic hardware. In this work, we use CSNNs trained in an unsupervised manner with the Spike Timing-Dependent Plasticity (STDP) rule, and we introduce, for the first time, Spiking Separated Spatial and Temporal Convolutions (S3TCs) to reduce the number of parameters required for video analysis. This unsupervised learning has the advantage of not needing large amounts of labeled data for training. Factorizing a single spatio-temporal spiking convolution into a spatial and a temporal spiking convolution decreases the number of parameters of the network. We test our network on the KTH, Weizmann, and IXMAS datasets, and we show that S3TCs successfully extract spatio-temporal information from videos while increasing the output spiking activity and outperforming spiking 3D convolutions.
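The parameter saving from factorizing a spatio-temporal convolution can be sketched with a simple count. The formulas below assume the separated version uses a spatial (1×kh×kw) convolution followed by a temporal (kt×1×1) convolution with the intermediate channel count equal to the output channel count, and ignore biases; these are modeling assumptions for illustration, not the paper's exact layer design.

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weights of a single dense 3D convolution."""
    return c_in * c_out * kt * kh * kw

def s3tc_params(c_in, c_out, kt, kh, kw):
    """Weights of a separated spatial (1 x kh x kw) convolution
    followed by a temporal (kt x 1 x 1) convolution."""
    return c_in * c_out * kh * kw + c_out * c_out * kt

# Example: 64 -> 64 channels with a 3x3x3 kernel.
# conv3d_params(64, 64, 3, 3, 3) = 110592
# s3tc_params(64, 64, 3, 3, 3)   =  49152  (roughly a 2.2x reduction)
```

The saving grows with the temporal kernel size, since the kt factor multiplies the full spatial kernel in the dense case but only a 1×1 pointwise term in the separated case.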

    Drowsy Driver Detection System Using Eye Blink Patterns

    This paper presents an automatic drowsy-driver monitoring and accident prevention system based on monitoring changes in eye blink duration. Our proposed method detects visual changes in eye locations using the proposed horizontal symmetry feature of the eyes. Our new method detects eye blinks via a standard webcam in real time at 110 fps for a 320×240 resolution. Experimental results on the JZU [3] eye-blink database showed that the proposed system detects eye blinks with 94% accuracy and a 1% false-positive rate.
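A drowsiness decision based on blink duration can be sketched as a simple rule: blinks lasting longer than a normal-duration threshold count as long blinks, and several of them in an observation window trigger an alert. The threshold values and the rule itself are illustrative assumptions, not the paper's calibrated parameters.

```python
def is_drowsy(blink_durations_ms, normal_max_ms=400, min_long_blinks=3):
    """Flag drowsiness when at least min_long_blinks blinks in the
    observed window exceed the normal blink duration (illustrative
    thresholds; an alert blink typically lasts well under 400 ms)."""
    long_blinks = [d for d in blink_durations_ms if d > normal_max_ms]
    return len(long_blinks) >= min_long_blinks
```

At 110 fps, a 400 ms blink spans about 44 frames, so frame-level eye-state detection comfortably resolves the durations this rule needs.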

    Automatic Facial Feature Detection for Facial Expression Recognition

    This paper presents a real-time automatic facial feature point detection method for facial expression recognition. The system is capable of detecting seven facial feature points (eyebrows, pupils, nose, and corners of the mouth) in grayscale images extracted from a given video. The extracted feature points are then used for facial expression recognition. Neutral, happiness, and surprise emotions have been studied on the Bosphorus dataset and tested on the FG-NET video dataset using OpenCV. We compared our results with previous studies on this dataset. Our experiments showed that the proposed method has the advantage of locating facial feature points automatically and accurately in real time.

    Positive/Negative Emotion Detection from RGB-D upper Body Images

    The ability to identify users' mental states represents a valuable asset for improving human-computer interaction. Considering that spontaneous emotions are conveyed mostly through facial expressions and upper-body movements, we propose to use these modalities together for the purpose of negative/positive emotion classification. A method that allows the recognition of mental states from videos is proposed. Based on a dataset composed of RGB-D videos, a set of indicators of positive and negative affect is extracted from 2D (RGB) information. In addition, a geometric framework to model the depth flows and capture human body dynamics from depth data is proposed. Due to the temporal changes in pixel and depth intensity that characterize spontaneous emotions, the depth features are used to define the relation between changes in upper-body movements and affect. We describe a space of depth and texture information to detect people's mood using upper-body postures and their evolution over time. The experimentation has been performed on the Cam3D dataset and has shown promising results.

    IPTV 2.0 from Triple Play to social TV

    The great success of social technologies is transforming the Internet into a collaborative community. With a vision of IPTV 2.0, this paper presents our research towards the exploitation of social phenomena in the domain of TV. Building on the IP Multimedia Subsystem (IMS) service architecture, the current IPTV service is extended in two directions: TV-enriched communication and sociability-enhanced TV. Two applications, namely TV Buddy and the Social Electronic Program Guide (EPG), are proposed to demonstrate them respectively. Finally, we developed a prototype system on the Ericsson IMS Software Development Studio (SDS).

    MuMIE: An Automatic Approach to Metadata Interoperability

    With the explosion of multimedia, the use of metadata has become crucial to ensure proper content management. However, uniform access to metadata must be guaranteed. Several techniques have thus been developed to achieve interoperability, most of them specific to a single description language. Existing matching systems have certain limitations, notably in the handling of structural information. In this article we present a new integration system that supports schemas from different description languages. Moreover, the proposed matching method draws on several types of information in order to increase the precision of the integration.

    Building Facial Masks to Improve Expression Recognition

    This work proposes a method to automatically detect the regions that contribute most to correct classification of faces with respect to predefined expressions: joy, surprise, etc. Our method determines the most (respectively least) discriminative regions using a MultiLayer Perceptron (MLP) neural network. From regions of arbitrary shape and size, we create masks to apply to the images before classifying them. These masks eliminate the face areas that are irrelevant to the classification process, thereby increasing the system's performance. We conducted experiments on the FERET, GENKI, and JAFFE image databases. The results show an increase in the classification rate when using masks that select the pixels of interest.
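One way to score how much a face region contributes to classification is to mask the region out and measure the drop in the classifier's score. The sketch below is an occlusion-style stand-in for the MLP-based region selection described above; the function names, the scoring rule, and the dummy classifier interface are all assumptions made for illustration.

```python
import numpy as np

def apply_mask(image, region_mask):
    """Keep only the pixels selected by a binary mask (1 = keep),
    zeroing out face areas deemed irrelevant for classification."""
    return image * region_mask

def region_importance(classify, image, regions):
    """Occlusion-style score per region: how much the classifier's
    score drops when that region is masked out. Larger drop means
    more discriminative power. `classify` maps an image to a float."""
    base = classify(image)
    return {name: base - classify(image * (1 - mask))
            for name, mask in regions.items()}
```

Regions with near-zero (or negative) scores are candidates for removal from the final mask, which is the intuition behind keeping only the pixels of interest.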

    Towards Multi-Level Metadata Interoperability

    Several matching techniques have been proposed in recent decades to achieve metadata interoperability. However, most of these techniques focus on matching at the lowest level (the schema level), taking into account only schemas defined using a single description language, which only partially solves the heterogeneity problem. In this context, we propose in this paper a new matching approach, named MuMIe (Multi-level Metadata Integration), which aims at achieving interoperability on both levels (schema and description language). The proposed technique transforms schemas from different description languages into directed labeled graphs that capture only the basic concepts. Matching is then performed at the lower level to find mappings between graph nodes, using several types of structural and semantic information. Our experimental results demonstrate the performance of the proposed system.
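The node-matching step over labeled graphs can be sketched as a similarity measure that mixes semantic (label) and structural information. The character-trigram label measure, the degree-based structural term, and the weighting below are illustrative assumptions, not MuMIe's actual measures.

```python
def trigrams(s):
    """Character trigrams of a label (lowercased); short labels
    degrade gracefully to shorter fragments."""
    s = s.lower()
    return {s[i:i + 3] for i in range(max(len(s) - 2, 1))}

def label_sim(a, b):
    """Jaccard similarity over character trigrams, in [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

def node_sim(label_a, degree_a, label_b, degree_b, alpha=0.7):
    """Weighted mix of semantic (label) and structural (node degree)
    similarity; alpha and the degree term are illustrative choices."""
    struct = 1 - abs(degree_a - degree_b) / max(degree_a, degree_b, 1)
    return alpha * label_sim(label_a, label_b) + (1 - alpha) * struct
```

A full matcher would evaluate this score over candidate node pairs from the two graphs and keep the highest-scoring, mutually consistent mappings.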