
    Micro-attention for micro-expression recognition

    Micro-expression, owing to its high objectivity in emotion detection, has emerged as a promising modality in affective computing. Recently, deep learning methods have been successfully introduced into the micro-expression recognition area. While higher recognition accuracy has been achieved, substantial challenges remain: micro-expressions occur in small, local facial areas, and the limited size of available databases still constrains recognition accuracy on such emotional facial behavior. In this work, to tackle these challenges, we propose a novel attention mechanism, called micro-attention, that cooperates with a residual network. Micro-attention enables the network to learn to focus on facial areas of interest covering different action units. Moreover, to cope with small datasets, micro-attention is designed without adding a noticeable number of parameters, and a simple yet efficient transfer learning approach is used alongside it to alleviate the risk of overfitting. With extensive experimental evaluations on three benchmarks (CASME II, SAMM and SMIC) and post-hoc feature visualizations, we demonstrate the effectiveness of the proposed micro-attention and push the boundary of automatic micro-expression recognition.
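    The abstract above describes attaching a lightweight attention module to a residual network so the model re-weights facial regions without a noticeable increase in parameters. The following is a minimal illustrative PyTorch sketch of that general idea, not the authors' exact micro-attention module; the single 1x1 convolution, the sigmoid gating, and the channel size are assumptions made for illustration.

import torch
import torch.nn as nn

class SpatialAttentionSketch(nn.Module):
    """Toy spatial attention block: adds only `channels` extra weights."""
    def __init__(self, channels: int):
        super().__init__()
        # A single 1x1 conv produces one attention score per spatial location.
        self.score = nn.Conv2d(channels, 1, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W) feature map from a residual block
        attn = torch.sigmoid(self.score(x))   # (batch, 1, H, W), values in [0, 1]
        return x * attn + x                   # residual-style re-weighting

if __name__ == "__main__":
    feats = torch.randn(2, 64, 28, 28)                    # toy feature map
    print(SpatialAttentionSketch(64)(feats).shape)        # torch.Size([2, 64, 28, 28])

    Because the sketch adds only one weight per channel, it is consistent with the abstract's stated goal of avoiding noticeable parameter growth, although the paper's actual design of the attention weights and their placement in the residual network may differ.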

    Deciphering True Emotions: Micro-Expression Detection and Recognition using Deep Nets

    University of Technology Sydney, Faculty of Engineering and Information Technology.

    Micro-expressions arise from the deliberate manipulation or involuntary repression of emotions, when an individual feels an emotion but tries to conceal the accompanying facial movements. Interpreting micro-expressions helps to recognise a person's deceit and actual mental state. Micro-expression detection and recognition therefore offer significant opportunities for emotion analysis in psychotherapy, forensics, border protection, and negotiations, among other areas. Since such gestures are quick and hard to spot with the naked eye, the move towards automated micro-expression recognition is an obvious step forward in the domain. Micro-expression research has drawn wide interest within the computer vision field, notably in localisation, magnification and recognition. Earlier studies primarily used single handcrafted descriptors and classifiers to recognise micro-expressions, while modern techniques emphasise Convolutional Neural Networks (CNNs) or hybrid strategies that integrate handcrafted descriptors and CNNs. Owing to the small number of available datasets, recognition of micro-expressions remains a concern, and performance is often influenced by feature selection and the training approach.

    Our work, presented in this thesis, introduces several approaches that we have developed to detect and recognise facial micro-expressions using deep networks. In the initial stage we design a dual-stream model with attention networks for the task of micro-expression detection from images. We implement Local- and Global-level Attention Networks (LGAttNet) that concentrate on local facial regions as well as the full face to improve the chances of extracting relevant micro-expression features. Unlike previous detection methods, where a frame difference is calculated to detect micro-expressions, our framework uses an attention network to focus on various parts of the face to identify the presence of a micro-expression. LGAttNet is a supervised detection framework in which a traditional Artificial Neural Network (ANN) is trained as a binary classifier, and it is a novel documented approach that utilises an attention network for micro-expression detection from images and video frame sequences.

    The next stage of this thesis focuses on recognising micro-expressions from an image using a CNN. We fine-tune a pre-trained CNN by retraining its last convolutional layer so that it learns appropriate micro-expression features and predicts the micro-expression classes accurately; this fine-tuned network achieves acceptable accuracy for recognising micro-expressions from image frames. Thirdly, we extend this work to video data by combining handcrafted descriptors with CNN-derived features: Local Binary Pattern from Three Orthogonal Planes (LBP-TOP) and the VGGFace CNN are combined in a late-fusion scheme to extract a comprehensive feature representation of each video, with softmax and SVM classifiers trained for classification. This hybrid approach is one of the first attempts to combine handcrafted descriptors and deep features for micro-expression recognition.

    Finally, we consider how gender affects the tendency to express micro-expressions. We build GEME, a multi-task learning architecture with two streams that extract different features for the shared task of gender-aware micro-expression recognition. We incorporate the dynamic image concept to convert a video into a single frame, and gender features are added to the micro-expression features at each level of the micro-expression stream. Including gender features alongside micro-expression features enriches the feature details for each individual participant, and the network learns distinctive gender features while extracting micro-expression features. In summary, we introduce four novel concepts for micro-expression detection and recognition. The work described in this thesis establishes a connection between computer vision and psychotherapy, and helps expedite micro-expression analysis for quick assessment wherever necessary.
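    One concrete stage of the thesis is the late fusion of handcrafted LBP-TOP descriptors with VGGFace CNN embeddings, followed by SVM classification. The snippet below is a hedged sketch of only that fusion-and-classification step, under the assumption that both feature sets have already been extracted per video; the feature dimensions, the random placeholder data, and the linear-kernel SVM are illustrative choices, not the thesis's actual configuration.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_videos, lbp_dim, cnn_dim, n_classes = 60, 177, 512, 3   # assumed sizes only
lbp_top = rng.random((n_videos, lbp_dim))                  # stand-in for LBP-TOP histograms
cnn_emb = rng.random((n_videos, cnn_dim))                  # stand-in for VGGFace embeddings
labels = rng.integers(0, n_classes, size=n_videos)         # stand-in emotion labels

# Late fusion by concatenating the two per-video feature vectors.
fused = np.concatenate([lbp_top, cnn_emb], axis=1)
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(fused, labels)
print("training accuracy:", clf.score(fused, labels))

    In practice the handcrafted and deep features would come from real extractors and the classifier would be evaluated with the usual leave-one-subject-out protocol; the point of the sketch is simply how a concatenation-based late fusion feeds a conventional classifier.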

    Multi-scale fusion visual attention network for facial micro-expression recognition

    Introduction: Micro-expressions are brief facial muscle movements that occur when genuine emotions are hidden. To address the challenge posed by their low intensity, recent studies have attempted to locate the local facial areas in which muscle movement occurs. However, this ignores the feature redundancy caused by inaccurate localization of the regions of interest.
    Methods: This paper proposes a novel multi-scale fusion visual attention network (MFVAN) that learns multi-scale local attention weights to mask regions containing redundant features. Specifically, the model uses convolutional neural networks to extract multi-scale features from the apex frame of each micro-expression video clip. The attention mechanism assigns weights to local region features in the multi-scale feature maps. We then mask redundant regions in the multi-scale features and fuse the local features with high attention weights for micro-expression recognition. Self-supervision and transfer learning reduce the influence of individual identity attributes and increase the robustness of the multi-scale feature maps. Finally, a multi-scale classification loss, a mask loss, and an identity-attribute-removal loss are jointly used to optimize the model.
    Results: The proposed MFVAN method achieves state-of-the-art performance on the SMIC, CASME II, SAMM, and 3DB-Combined datasets. The experimental results show that focusing on local regions at multiple scales contributes to micro-expression recognition.
    Discussion: The proposed MFVAN model is the first to combine image generation with visual attention mechanisms to address the combined challenge of individual identity-attribute interference and low-intensity facial muscle movements. The model also reveals the impact of individual attributes on the localization of local ROIs. The experimental results show that a multi-scale fusion visual attention network contributes to micro-expression recognition.
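    As a rough illustration of the multi-scale masking-and-fusion idea described above, the sketch below extracts feature maps at two scales from an apex frame, computes a spatial attention map per scale, masks low-attention regions, and fuses the pooled local features for classification. The toy backbone, the hard 0.5 threshold, and the layer sizes are assumptions; this is not the MFVAN architecture, its image-generation component, or its loss terms.

import torch
import torch.nn as nn

class MultiScaleAttentionFusionSketch(nn.Module):
    def __init__(self, in_ch: int = 3, num_classes: int = 3, thresh: float = 0.5):
        super().__init__()
        # Two toy stages stand in for a multi-scale CNN backbone.
        self.stage1 = nn.Sequential(nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.attn1 = nn.Conv2d(16, 1, 1)   # per-scale spatial attention heads
        self.attn2 = nn.Conv2d(32, 1, 1)
        self.thresh = thresh
        self.head = nn.Linear(16 + 32, num_classes)

    def _masked_pool(self, feat: torch.Tensor, attn_layer: nn.Module) -> torch.Tensor:
        a = torch.sigmoid(attn_layer(feat))          # attention weights in [0, 1]
        mask = (a > self.thresh).float()             # mask away low-attention regions
        pooled = (feat * a * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)
        return pooled                                # (batch, channels)

    def forward(self, apex_frame: torch.Tensor) -> torch.Tensor:
        f1 = self.stage1(apex_frame)
        f2 = self.stage2(f1)
        fused = torch.cat([self._masked_pool(f1, self.attn1),
                           self._masked_pool(f2, self.attn2)], dim=1)
        return self.head(fused)

if __name__ == "__main__":
    logits = MultiScaleAttentionFusionSketch()(torch.randn(2, 3, 112, 112))
    print(logits.shape)   # torch.Size([2, 3])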

    Improving Micro-Expression Recognition with Shift Matrices and Database Combination

    Micro-expressions are brief, subtle changes in facial expression associated with emotional responses, and researchers have worked for decades on recognising them automatically. As convolutional neural networks have been widely used in many areas of computer vision, such as image recognition and motion detection, they have also drawn attention for micro-expression recognition. However, no approach has yet achieved accuracy high enough for practical use. One of the biggest problems is the limited number of available datasets; the most popular are SMIC, CASME, CASME II, and SAMM. Most groups have worked on these datasets separately, and few have tried to combine them. In our approach, we combine the datasets and extract their shared features, so that new datasets built under the same coding rules (FACS) can easily be incorporated in the future. In addition to this approach for combining datasets, we use a new way of extracting features in place of Local Binary Pattern from Three Orthogonal Planes (LBP-TOP): we create shift matrices, which capture the changing pattern of pixels while preserving the spatial information of the videos. Our highest recorded accuracy over 100 experiments was 88 percent, but we report the median accuracy of 72.5 percent as a more convincing result, even though it is slightly lower than the best result to date; our F1 score of 72.3 percent, however, exceeds the best result to date. Our paper presents an extendable approach to micro-expression recognition whose accuracy should increase as more datasets become available.
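    The abstract describes shift matrices only as the changing pattern of pixels that preserves spatial information, so the sketch below is a loose illustration rather than the authors' construction: for each consecutive frame pair it encodes, per pixel, whether intensity rose, fell, or stayed roughly constant. The sign-based ternary encoding and the threshold are assumptions made for illustration.

import numpy as np

def shift_matrices(video: np.ndarray, eps: float = 2.0) -> np.ndarray:
    """video: (T, H, W) grayscale frames -> (T-1, H, W) ternary per-pixel change maps."""
    diff = video[1:].astype(np.int16) - video[:-1].astype(np.int16)
    shift = np.zeros_like(diff, dtype=np.int8)
    shift[diff > eps] = 1        # pixel brightened between frames
    shift[diff < -eps] = -1      # pixel darkened between frames
    return shift                 # spatial layout is kept, unlike a pooled histogram

if __name__ == "__main__":
    clip = np.random.randint(0, 256, size=(10, 64, 64), dtype=np.uint8)
    print(shift_matrices(clip).shape)   # (9, 64, 64)

    Unlike LBP-TOP, which pools texture statistics into histograms, a representation of this form keeps a value per pixel per frame transition, which is the spatial-preservation property the abstract emphasises.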