12 research outputs found

    Head-Pose Invariant Facial Expression Recognition using Convolutional Neural Networks

    Automatic face analysis has to cope with pose and lighting variations. Pose variations are especially difficult to tackle, and many face analysis methods require sophisticated normalization and initialization procedures. We propose a data-driven face analysis approach that is not only capable of extracting features relevant to a given face analysis task, but is also more robust to face location changes and scale variations than classical methods such as MLPs. Our approach is based on convolutional neural networks that use multi-scale feature extractors, which allow for improved facial expression recognition results on faces subject to in-plane pose variations.

    Baseline CNN structure analysis for facial expression recognition

    We present a baseline convolutional neural network (CNN) structure and an image preprocessing methodology to improve CNN-based facial expression recognition. To find the most efficient network structure, we investigated four network structures that are known to perform well in facial expression recognition. We also investigated the effect of input image preprocessing methods. Five types of data input (raw, histogram equalization, isotropic smoothing, diffusion-based normalization, difference of Gaussian) were tested, and their accuracy was compared. We trained 20 different CNN models (4 networks x 5 data input types) and verified the performance of each network with test images from five different databases. The experimental results showed that a three-layer structure consisting of a simple convolutional and a max pooling layer with histogram-equalized image input was the most efficient. We describe the detailed training procedure and analyze the test accuracy results based on considerable observation. Comment: 6 pages, RO-MAN2016 Conference
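Of the five input types listed in this abstract, histogram equalization is the one reported as most effective. The abstract does not say how it was implemented, so the following is only a minimal sketch of the textbook form, assuming an 8-bit greyscale image stored as a nested list; the function name `equalize_histogram` and the `levels` parameter are hypothetical and not taken from the paper.

```python
from typing import List


def equalize_histogram(image: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Textbook histogram equalization for a greyscale image (illustrative only)."""
    # Count how often each grey level occurs.
    hist = [0] * levels
    for row in image:
        for px in row:
            hist[px] += 1

    # Build the cumulative distribution of grey levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    if total == 0:
        # No pixels at all: return an unchanged copy.
        return [row[:] for row in image]

    # Cumulative count at the lowest occupied grey level.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Lookup table: lowest occupied level maps to 0, highest to levels - 1.
    lut = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lut[px] for px in row] for row in image]
```

A low-contrast patch such as `[[100, 100], [101, 102]]` is stretched so that its darkest pixels become 0 and its brightest pixel becomes 255, which is the contrast normalization the abstract refers to.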

    Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition

    Recognizing facial action units (AUs) during spontaneous facial displays is a challenging problem. Most recently, Convolutional Neural Networks (CNNs) have shown promise for facial AU recognition, where predefined and fixed convolution filter sizes are employed. To achieve the best performance, the optimal filter size is often found empirically through extensive experimental validation. Such a process incurs a high training cost, especially as the network becomes deeper. This paper proposes a novel Optimized Filter Size CNN (OFS-CNN), where the filter sizes and weights of all convolutional layers are learned simultaneously from the training data. Specifically, the filter size is defined as a continuous variable, which is optimized by minimizing the training loss. Experimental results on two AU-coded spontaneous databases have shown that the proposed OFS-CNN is capable of estimating the optimal filter size for varying image resolution and outperforms traditional CNNs with the best filter size obtained by exhaustive search. The OFS-CNN also beats the CNN using multiple filter sizes and, more importantly, is much more efficient during testing with the proposed forward-backward propagation algorithm.

    Analysis on techniques used to recognize and identifying the Human emotions

    Facial expression is a major channel of non-verbal communication in day-to-day life. Statistical analysis shows that only 7 percent of a message is conveyed verbally, while 55 percent is transmitted by facial expression. Emotional expression has been a research subject of physiology since Darwin’s work on emotional expression in the 19th century. According to psychological theory, human emotion is classified into six major emotions: happiness, fear, anger, surprise, disgust, and sadness. Facial expressions and the nature of speech play a foremost role in expressing these emotions. Researchers subsequently developed a system based on the anatomy of the face, the Facial Action Coding System (FACS), in 1970. Since the development of FACS, research in the domain of emotion recognition has progressed rapidly. This work is intended to give a thorough comparative analysis of the various techniques and methods that have been applied to recognize and identify human emotions. The results of this analysis will help to identify suitable techniques, algorithms, and methodologies for future research directions. The paper presents an extensive analysis of the various recognition techniques and of the complexity involved in recognizing facial expressions. This work will also help researchers and scholars choose techniques for facial expression identification.

    Time-Efficient Hybrid Approach for Facial Expression Recognition

    Facial expression recognition is an emerging research area for improving human-computer interaction. This research plays a significant role in the fields of social communication, commercial enterprise, law enforcement, and other computer interactions. In this paper, we propose a time-efficient hybrid design for facial expression recognition, combining image pre-processing steps and different Convolutional Neural Network (CNN) structures to provide better accuracy and greatly improved training time. We predict seven basic emotions of human faces: sadness, happiness, disgust, anger, fear, surprise, and neutral. The model performs well on challenging cases where the emotion expressed could be one of several with quite similar facial characteristics, such as anger, disgust, and sadness. The experiment to test the model was conducted across multiple databases and different facial orientations, and to the best of our knowledge, the model provided an accuracy of about 89.58% for the KDEF dataset, 100% accuracy for the JAFFE dataset, and 71.975% accuracy for the combined (KDEF + JAFFE + SFEW) dataset across these different scenarios. Performance evaluation was done by cross-validation techniques to avoid bias towards a specific set of images from a database.

    Identifying facial landmarks, action units and emotions using deep networks

    The goal of this thesis is to use deep neural networks, specifically Convolutional Neural Networks (CNNs), to predict facial landmarks, facial action units, and emotions, and to study the results of intermediate experiments while doing so. Learning the different features of facial images has always been a difficult task and primarily involves using hand-crafted features, which would almost definitely ignore some information related to the different dynamics of facial features. We train our network model using the raw facial images and study its effectiveness in predicting facial landmarks, action units, and emotions. In this thesis we learnt that CNNs are highly effective in predicting facial landmarks and AUs, mainly because of their ability to learn features from raw images. We also established that feature sets which can effectively outline the different properties of a face are more useful in classifying facial emotions than either images or facial landmarks.

    Improving Facial Action Unit Recognition Using Convolutional Neural Networks

    Recognizing facial action units (AUs) from spontaneous facial expression is a challenging problem because of subtle facial appearance changes, free head movements, occlusions, and limited AU-coded training data. Most recently, convolutional neural networks (CNNs) have shown promise on facial AU recognition. However, CNNs are often overfitted and do not generalize well to unseen subjects due to limited AU-coded training images. To improve the performance of facial AU recognition, we developed two novel CNN frameworks to recognize AUs from static images, substituting the traditional decision layer and convolutional layer with an incremental boosting layer and an adaptive convolutional layer, respectively. First, to handle the limited AU-coded training data and reduce overfitting, we proposed a novel Incremental Boosting CNN (IB-CNN) that integrates boosting into the CNN via an incremental boosting layer, which selects discriminative neurons from the lower layer and is incrementally updated on successive mini-batches. In addition, a novel loss function that accounts for errors from both the incremental boosted classifier and individual weak classifiers was proposed to fine-tune the IB-CNN. Experimental results on four benchmark AU databases have demonstrated that the IB-CNN yields significant improvement over the traditional CNN and the boosting CNN without incremental learning, and outperforms the state-of-the-art CNN-based methods in AU recognition. The improvement is more impressive for the AUs that have the lowest frequencies in the databases. Second, all current CNNs use predefined and fixed convolutional filter sizes. However, AUs activated by different facial muscles cause facial appearance changes at different scales and thus favor different filter sizes. The traditional strategy is to experimentally select the best filter size for each AU in each convolutional layer, but it suffers from expensive training cost, especially as the networks become deeper. We proposed a novel Optimized Filter Size CNN (OFS-CNN), where the filter sizes and weights of all convolutional layers are learned simultaneously from the training data. Specifically, the filter size is defined as a continuous variable, which is optimized by minimizing the training loss. In experiments on four AU-coded databases and one spontaneous facial expression database, the OFS-CNN outperforms traditional CNNs with fixed filter sizes and achieves state-of-the-art recognition performance. Furthermore, the OFS-CNN also beats traditional CNNs using the best filter size obtained by exhaustive search and is capable of estimating the optimal filter size for varying image resolution.

    Activity Report 2002


    Activity Report 2003
