Multi-layered Spiking Neural Network with Target Timestamp Threshold Adaptation and STDP
Spiking neural networks (SNNs) are good candidates to produce
ultra-energy-efficient hardware. However, the performance of these models is
currently behind traditional methods. Introducing multi-layered SNNs is a
promising way to reduce this gap. We propose in this paper a new threshold
adaptation system which uses a timestamp objective at which neurons should
fire. We show that our method leads to state-of-the-art classification rates on
the MNIST dataset (98.60%) and the Faces/Motorbikes dataset (99.46%) with an
unsupervised SNN followed by a linear SVM. We also investigate the sparsity
level of the network by testing different inhibition policies and STDP rules.
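The idea of adapting a threshold toward a target firing timestamp can be illustrated with a minimal sketch. It assumes a simple proportional update; the function name, learning rate, and lower bound are hypothetical and are not taken from the paper, whose actual rule may differ.

```python
def adapt_threshold(threshold, spike_time, target_time, lr=0.1, min_threshold=0.01):
    """Nudge a neuron's firing threshold so that it fires closer to target_time.

    Assumed rule (illustrative only): a neuron that fired before the target
    gets a higher threshold, so it fires later next time; a neuron that
    fired after the target gets a lower threshold.
    """
    new_threshold = threshold + lr * (target_time - spike_time)
    # Keep the threshold strictly positive.
    return max(new_threshold, min_threshold)


early = adapt_threshold(1.0, spike_time=0.5, target_time=0.7)  # fired too early
late = adapt_threshold(1.0, spike_time=0.9, target_time=0.7)   # fired too late
```

Here `early` ends up above the starting threshold and `late` below it, which is the qualitative behaviour a timestamp objective calls for.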
Unsupervised Visual Feature Learning with Spike-timing-dependent Plasticity: How Far are we from Traditional Feature Learning Approaches?
Spiking neural networks (SNNs) equipped with latency coding and spike-timing
dependent plasticity rules offer an alternative to solve the data and energy
bottlenecks of standard computer vision approaches: they can learn visual
features without supervision and can be implemented by ultra-low power hardware
architectures. However, their performance in image classification has never
been evaluated on recent image datasets. In this paper, we compare SNNs to
auto-encoders on three visual recognition datasets, and extend the use of SNNs
to color images. The analysis of the results helps us identify some bottlenecks
of SNNs: the limits of on-center/off-center coding, especially for color
images, and the ineffectiveness of current inhibition mechanisms. These issues
should be addressed to build effective SNNs for image recognition.
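As a rough illustration of the two encoding steps named in this abstract, the sketch below splits a signed filter response into on-center and off-center channels and then converts an intensity into a spike latency. The function names, the `[0, 1]` intensity range, and the linear latency mapping are assumptions made for illustration, not the authors' implementation.

```python
def on_off_split(response):
    """Split a signed filter response into non-negative (on, off) channels."""
    return max(response, 0.0), max(-response, 0.0)


def latency_code(intensity, t_max=1.0):
    """Map an intensity in [0, 1] to a spike time: stronger inputs fire earlier.

    Returns None when the intensity is zero or negative (no spike emitted).
    """
    if intensity <= 0.0:
        return None
    return t_max * (1.0 - min(intensity, 1.0))


on, off = on_off_split(-0.3)        # a negative response drives the off channel
spike_times = [latency_code(v) for v in (on, off)]
```

Each channel carries only a non-negative magnitude, so a single signed value such as a colour-opponent response has to be spread over two channels before it can be latency coded.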
S3TC: Spiking Separated Spatial and Temporal Convolutions with Unsupervised STDP-based Learning for Action Recognition
Video analysis is a major computer vision task that has received a lot of
attention in recent years. The current state-of-the-art performance for video
analysis is achieved with Deep Neural Networks (DNNs) that have high
computational costs and need large amounts of labeled data for training.
Spiking Neural Networks (SNNs) have computational costs that are significantly
lower (by a factor of thousands) than those of regular non-spiking networks when
implemented on neuromorphic hardware. They have been used for video analysis with methods like
3D Convolutional Spiking Neural Networks (3D CSNNs). However, these networks
have a significantly larger number of parameters than spiking 2D CSNNs.
This not only increases the computational costs, but also makes these networks
more difficult to implement with neuromorphic hardware. In this work, we use
CSNNs trained in an unsupervised manner with the Spike Timing-Dependent
Plasticity (STDP) rule, and we introduce, for the first time, Spiking Separated
Spatial and Temporal Convolutions (S3TCs) for the sake of reducing the number
of parameters required for video analysis. This unsupervised learning has the
advantage of not needing large amounts of labeled data for training.
Factorizing a single spatio-temporal spiking convolution into a spatial and a
temporal spiking convolution decreases the number of parameters of the network.
We test our network with the KTH, Weizmann, and IXMAS datasets, and we show
that S3TCs successfully extract spatio-temporal information from videos, while
increasing the output spiking activity, and outperforming spiking 3D
convolutions.
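The parameter saving from factorizing a spatio-temporal convolution can be checked with simple arithmetic. The sketch below counts weights only (no biases); the kernel size and channel counts are hypothetical and are not taken from the paper.

```python
def conv3d_params(k_t, k_h, k_w, c_in, c_out):
    """Weights in one full spatio-temporal (3D) convolution, biases ignored."""
    return k_t * k_h * k_w * c_in * c_out


def separated_params(k_t, k_h, k_w, c_in, c_mid, c_out):
    """Weights in a spatial (1 x k_h x k_w) convolution followed by a
    temporal (k_t x 1 x 1) convolution, biases ignored."""
    spatial = k_h * k_w * c_in * c_mid
    temporal = k_t * c_mid * c_out
    return spatial + temporal


# Hypothetical layer: 3x5x5 kernel with 64 input, intermediate and output channels.
full = conv3d_params(3, 5, 5, 64, 64)
split = separated_params(3, 5, 5, 64, 64, 64)
print(full, split)
```

With these assumed sizes the full convolution has 307200 weights and the separated pair has 114688. The size of the saving depends on the channel counts chosen, so the figures are only indicative.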
Drowsy Driver Detection System Using Eye Blink Patterns
This paper presents an automatic drowsy driver monitoring and accident prevention system based on monitoring changes in eye blink duration. The proposed method detects visual changes in eye locations using a horizontal symmetry feature of the eyes. It detects eye blinks with a standard webcam in real time at 110 fps at a resolution of 320×240. Experimental results on the JZU [3] eye-blink database show that the proposed system detects eye blinks with 94% accuracy and a 1% false positive rate.
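Monitoring blink duration can be sketched as follows, assuming a per-frame eye-closed flag is already available from a detector. The function names and the 0.4 s drowsiness threshold are hypothetical; only the 110 fps frame rate comes from the abstract.

```python
def blink_durations(eye_closed, fps):
    """Return the duration in seconds of each run of consecutive closed-eye frames."""
    durations = []
    run = 0
    for closed in eye_closed:
        if closed:
            run += 1
        elif run:
            durations.append(run / fps)
            run = 0
    if run:  # the sequence ended while the eyes were still closed
        durations.append(run / fps)
    return durations


def is_drowsy(durations, max_blink_s=0.4):
    """Flag drowsiness if any blink lasts longer than max_blink_s (assumed threshold)."""
    return any(d > max_blink_s for d in durations)


# Synthetic flags at 110 fps: one 22-frame blink and one 66-frame blink.
frames = [False] * 10 + [True] * 22 + [False] * 10 + [True] * 66 + [False] * 5
durations = blink_durations(frames, fps=110)
drowsy = is_drowsy(durations)
```

At 110 fps a single frame spans about 9 ms, so blink durations can be resolved finely enough to separate short blinks from prolonged eye closures.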
Automatic Facial Feature Detection for Facial Expression Recognition
This paper presents a real-time automatic facial feature point detection method for facial expression recognition. The system is capable of detecting seven facial feature points (eyebrows, pupils, nose, and corners of the mouth) in grayscale images extracted from a given video. The extracted feature points are then used for facial expression recognition. Neutral, happiness and surprise emotions have been studied on the Bosphorus dataset and tested on the FG-NET video dataset using OpenCV. We compared our results with previous studies on this dataset. Our experiments showed that the proposed method has the advantage of locating facial feature points automatically and accurately in real time.
Positive/Negative Emotion Detection from RGB-D upper Body Images
The ability to identify users' mental states represents a valuable asset for improving human-computer interaction. Considering that spontaneous emotions are conveyed mostly through facial expressions and upper body movements, we propose to use these modalities together for negative/positive emotion classification. A method that allows the recognition of mental states from videos is proposed. Based on a dataset composed of RGB-D movies, a set of indicators of positive and negative emotions is extracted from 2D (RGB) information. In addition, a geometric framework to model the depth flows and capture human body dynamics from depth data is proposed. Because temporal changes in pixel and depth intensity characterize spontaneous emotion datasets, the depth features are used to define the relation between changes in upper body movements and affect. We describe a space of depth and texture information to detect the mood of people using upper body postures and their evolution across time. The experiments were performed on the Cam3D dataset and showed promising results.
IPTV 2.0 from Triple Play to social TV
The great success of social technologies is transforming the Internet into a collaborative community. With a vision of IPTV 2.0, this paper presents our research work towards the exploitation of social phenomena in the domain of TV. Building on the IP Multimedia Subsystem (IMS) service architecture, the current IPTV service is extended in two respects: TV-enriched communication and sociability-enhanced TV. Two applications, namely TV Buddy and Social Electronic Program Guide (EPG), are proposed to demonstrate them respectively. Finally, we developed a prototype system on Ericsson IMS Software Development Studio (SDS).
MuMIE : Une approche automatique pour l'interopérabilité des métadonnées (MuMIE: An Automatic Approach for Metadata Interoperability)
With the explosion of multimedia, the use of metadata has become crucial for managing content well. However, uniform access to metadata must be ensured. Several techniques have therefore been developed to achieve interoperability. Most of them are specific to a single description language. Existing matching systems have certain limitations, notably in handling structural information. In this article we present a new integration system that supports schemas from different description languages. In addition, the proposed matching method draws on several types of information in order to increase the precision of the integration.
Construction de masques faciaux pour améliorer la reconnaissance d'expressions (Building Facial Masks to Improve Expression Recognition)
This work proposes a method for automatically detecting the regions that contribute most to the correct classification of faces with respect to predefined expressions: joy, surprise, etc. Our method determines the regions with the most (respectively the least) discriminative power using a MultiLayer Perceptron (MLP) neural network. From regions of arbitrary shape and size, we create masks to apply to the images before classifying them. These masks eliminate the facial areas that are irrelevant to the classification process, thereby increasing the performance of the system. We conducted experiments on the FERET, GENKI and JAFFE image databases. The results show an increase in the classification rate when using the masks that designate the pixels of interest.
Vers une Interopérabilité Multi-Niveaux des Métadonnées (Towards Multi-Level Metadata Interoperability)
Several matching techniques have been proposed in recent decades to achieve metadata interoperability. However, most of these techniques focus on matching at the lowest level (schema level), taking into account only schemas defined in a single description language, which only partially solves the problem of heterogeneity. In this context, we propose a new matching approach named MuMIe (Multi-level Metadata Integration), which aims at achieving interoperability at both levels (schema and description language). The proposed technique transforms schemas from different description languages into directed labeled graphs capturing only the basic concepts. Matching is then performed at the lower level to find the mapping between graph nodes, using several kinds of structural and semantic information. Our experimental results demonstrate the performance of the proposed system.
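Matching graph nodes with both label-based and structural information can be sketched as below. This is an illustrative stand-in, not the MuMIe algorithm: the graph format (a dict from node label to child labels), the string-similarity measure used in place of a semantic one, the Jaccard overlap used for structure, and the weight and threshold values are all assumptions.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """String similarity of two node labels in [0, 1] (stand-in for a semantic measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def structural_similarity(children_a, children_b):
    """Jaccard overlap of the lower-cased child labels of two nodes."""
    set_a = {c.lower() for c in children_a}
    set_b = {c.lower() for c in children_b}
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def match_nodes(graph_a, graph_b, weight=0.5, threshold=0.5):
    """Map each node of graph_a to its best-scoring node of graph_b.

    Graphs are dicts: node label -> list of child labels (hypothetical format).
    Nodes whose best score does not exceed the threshold stay unmatched.
    """
    mapping = {}
    for node_a, kids_a in graph_a.items():
        best_node, best_score = None, threshold
        for node_b, kids_b in graph_b.items():
            score = (weight * name_similarity(node_a, node_b)
                     + (1 - weight) * structural_similarity(kids_a, kids_b))
            if score > best_score:
                best_node, best_score = node_b, score
        if best_node is not None:
            mapping[node_a] = best_node
    return mapping


# Two toy schemas, as if already converted from different description languages.
schema_a = {"Title": ["Text"], "Creator": ["Name"]}
schema_b = {"title": ["text"], "author": ["name"]}
mapping = match_nodes(schema_a, schema_b)
```

In this toy case the labels `Creator` and `author` are only moderately similar as strings, and it is their shared child `name` that lifts the combined score above the threshold, which is the reason for mixing the two kinds of information.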