9,167 research outputs found

    Time-delay neural network for continuous emotional dimension prediction from facial expression sequences

    Get PDF
    "(c) 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works."Automatic continuous affective state prediction from naturalistic facial expression is a very challenging research topic but very important in human-computer interaction. One of the main challenges is modeling the dynamics that characterize naturalistic expressions. In this paper, a novel two-stage automatic system is proposed to continuously predict affective dimension values from facial expression videos. In the first stage, traditional regression methods are used to classify each individual video frame, while in the second stage, a Time-Delay Neural Network (TDNN) is proposed to model the temporal relationships between consecutive predictions. The two-stage approach separates the emotional state dynamics modeling from an individual emotional state prediction step based on input features. In doing so, the temporal information used by the TDNN is not biased by the high variability between features of consecutive frames and allows the network to more easily exploit the slow changing dynamics between emotional states. The system was fully tested and evaluated on three different facial expression video datasets. Our experimental results demonstrate that the use of a two-stage approach combined with the TDNN to take into account previously classified frames significantly improves the overall performance of continuous emotional state estimation in naturalistic facial expressions. The proposed approach has won the affect recognition sub-challenge of the third international Audio/Visual Emotion Recognition Challenge (AVEC2013)1

    Big data analytics:Computational intelligence techniques and application areas

    Get PDF
    Big Data has significant impact in developing functional smart cities and supporting modern societies. In this paper, we investigate the importance of Big Data in modern life and economy, and discuss challenges arising from Big Data utilization. Different computational intelligence techniques have been considered as tools for Big Data analytics. We also explore the powerful combination of Big Data and Computational Intelligence (CI) and identify a number of areas, where novel applications in real world smart city problems can be developed by utilizing these powerful tools and techniques. We present a case study for intelligent transportation in the context of a smart city, and a novel data modelling methodology based on a biologically inspired universal generative modelling approach called Hierarchical Spatial-Temporal State Machine (HSTSM). We further discuss various implications of policy, protection, valuation and commercialization related to Big Data, its applications and deployment

    An original framework for understanding human actions and body language by using deep neural networks

    Get PDF
    The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. While the processing of body movements play a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition, and video surveillance. In this Ph.D. thesis, an original framework for understanding Actions and body language is presented. The framework is composed of three main modules: in the first one, a Long Short Term Memory Recurrent Neural Networks (LSTM-RNNs) based method for the Recognition of Sign Language and Semaphoric Hand Gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition by using 3D skeleton and Deep Neural Networks (DNNs) is provided. The performances of RNN-LSTMs are explored in depth, due to their ability to model the long term contextual information of temporal sequences, making them suitable for analysing body movements. All the modules were tested by using challenging datasets, well known in the state of the art, showing remarkable results compared to the current literature methods

    Post-training discriminative pruning for RBMs

    Get PDF
    One of the major challenges in the area of artificial neural networks is the identification of a suitable architecture for a specific problem. Choosing an unsuitable topology can exponentially increase the training cost, and even hinder network convergence. On the other hand, recent research indicates that larger or deeper nets can map the problem features into a more appropriate space, and thereby improve the classification process, thus leading to an apparent dichotomy. In this regard, it is interesting to inquire whether independent measures, such as mutual information, could provide a clue to finding the most discriminative neurons in a network. In the present work we explore this question in the context of Restricted Boltzmann Machines, by employing different measures to realize post-training pruning. The neurons which are determined by each measure to be the most discriminative, are combined and a classifier is applied to the ensuing network to determine its usefulness. We find that two measures in particular seem to be good indicators of the most discriminative neurons, producing savings of generally more than 50% of the neurons, while maintaining an acceptable error rate. Further, it is borne out that starting with a larger network architecture and then pruning is more advantageous than using a smaller network to begin with. Finally, a quantitative index is introduced which can provide information on choosing a suitable pruned network.Fil: Sánchez Gutiérrez, Máximo. Universidad Autónoma Metropolitana; MéxicoFil: Albornoz, Enrique Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; ArgentinaFil: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Nacional de Entre Ríos; ArgentinaFil: Close, John Goddard. Universidad Autónoma Metropolitana; Méxic
    • …
    corecore