
    Facial micro-expression recognition with noisy labels

    Abstract. Facial micro-expressions are quick, involuntary, low-intensity facial movements. Interest in detecting and recognizing micro-expressions arises from the fact that they can reveal a person's genuine hidden emotions. The small, rapid facial muscle movements make it difficult for a human not only to spot an occurring micro-expression but also to recognize the emotion correctly. Recent work on improving micro-expression recognition has focused on models and architectures. We instead take a step back and go to the root of the task: the data. A thorough analysis of the input data shows that some of it is noisy and possibly mislabelled, and the authors of the micro-expression datasets have themselves acknowledged possible problems in data labelling. Despite this, to the best of our knowledge no models have been designed to account for potentially mislabelled data in micro-expression recognition. In this thesis, we explore new methods that take noisy labels explicitly into account in an attempt to solve this problem. We propose a simple yet effective label refurbishing method and a data cleaning method for handling noisy labels. Through both quantitative and qualitative analysis, we show the effectiveness of these methods for detecting noisy samples. The data cleaning method achieves state-of-the-art results, reaching an F1-score of 0.77 on the MEGC2019 composite dataset. Finally, we analyse and discuss the results in depth and suggest future work based on our findings.
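The two ideas named in the abstract, label refurbishing and data cleaning, can be illustrated with a minimal sketch. This is a generic small-example implementation of the concepts, not the thesis's actual formulation: the blending weight `alpha`, the confidence `threshold`, and all function names are illustrative assumptions.

```python
import numpy as np

def refurbish_labels(given_onehot, predicted_probs, alpha=0.7):
    """Label refurbishing sketch: blend the given (possibly noisy) one-hot
    label with the model's softmax prediction to form a soft label."""
    return alpha * given_onehot + (1.0 - alpha) * predicted_probs

def clean_indices(given_labels, predicted_probs, threshold=0.9):
    """Data cleaning sketch: drop samples where the model confidently
    disagrees with the given label, treating them as likely mislabelled."""
    pred_class = predicted_probs.argmax(axis=1)
    pred_conf = predicted_probs.max(axis=1)
    suspect = (pred_class != given_labels) & (pred_conf > threshold)
    return np.where(~suspect)[0]  # indices of samples to keep

given = np.array([0, 1, 2])
probs = np.array([[0.95, 0.03, 0.02],   # agrees with label 0
                  [0.05, 0.90, 0.05],   # agrees with label 1
                  [0.97, 0.02, 0.01]])  # confidently disagrees with label 2
keep = clean_indices(given, probs)      # sample 2 is flagged as noisy
soft = refurbish_labels(np.eye(3)[given], probs)
```

Here sample 2 is removed because the model predicts class 0 with 0.97 confidence while the dataset label says class 2, which is exactly the kind of disagreement a mislabelled micro-expression clip would produce.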

    How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset?

    This paper does not contain technical novelty but introduces our key discoveries: a data generation protocol, a database, and insights. We aim to address the lack of large-scale datasets in micro-expression (MiE) recognition, a consequence of the prohibitive cost of data collection that renders large-scale training less feasible. To this end, we develop a protocol to automatically synthesize large-scale MiE training data that allows us to train improved recognition models for real-world test data. Specifically, we discover three types of Action Units (AUs) that can constitute trainable MiEs. These AUs come from real-world MiEs, from early frames of macro-expression videos, and from the relationship between AUs and expression categories defined by human expert knowledge. With these AUs, our protocol employs large numbers of face images of various identities and an off-the-shelf face generator for MiE synthesis, yielding the MiE-X dataset. MiE recognition models trained or pre-trained on MiE-X and evaluated on real-world test sets obtain very competitive accuracy. Experimental results not only validate the effectiveness of the discovered AUs and the MiE-X dataset but also reveal some interesting properties of MiEs: they generalize across faces, are close to early-stage macro-expressions, and can be manually defined.
    Comment: European Conference on Computer Vision 202
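The third AU source above, expert-defined relationships between AUs and expression categories, can be sketched as a rule table that assigns an emotion label when a face's active AUs contain a known combination. The specific AU-to-emotion rules below are a simplified stand-in drawn from common FACS conventions, not the paper's actual mapping.

```python
# Expert-knowledge rules: an emotion applies if all AUs in the rule are active.
AU_TO_EMOTION = {
    frozenset({"AU6", "AU12"}): "happiness",       # cheek raiser + lip corner puller
    frozenset({"AU1", "AU4", "AU15"}): "sadness",  # inner brow raiser + brow lowerer + lip depressor
    frozenset({"AU4", "AU5", "AU7"}): "anger",     # brow lowerer + upper lid raiser + lid tightener
}

def label_from_aus(active_aus):
    """Assign an expression category if the active AUs satisfy a rule."""
    active = set(active_aus)
    for rule, emotion in AU_TO_EMOTION.items():
        if rule <= active:  # rule AUs are a subset of the active AUs
            return emotion
    return "unknown"

label_from_aus(["AU6", "AU12", "AU25"])  # satisfies the happiness rule
```

A synthesis protocol in this spirit would pick a rule, drive a face generator with the corresponding AUs at low intensity, and record the rule's emotion as the training label.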

    Multi-scale fusion visual attention network for facial micro-expression recognition

    Introduction. Micro-expressions are facial muscle movements that reveal genuine emotions a person attempts to hide. In response to the challenge of their low intensity, recent studies have attempted to locate localized areas of facial muscle movement. However, this ignores the feature redundancy caused by inaccurate localization of the regions of interest.
    Methods. This paper proposes a novel multi-scale fusion visual attention network (MFVAN), which learns multi-scale local attention weights to mask regions of redundant features. Specifically, the model extracts multi-scale features from the apex frame of micro-expression video clips with convolutional neural networks. An attention mechanism weights the local region features in the multi-scale feature maps. We then mask redundant regions in the multi-scale features and fuse the local features with high attention weights for micro-expression recognition. Self-supervision and transfer learning reduce the influence of individual identity attributes and increase the robustness of the multi-scale feature maps. Finally, a multi-scale classification loss, a mask loss, and an identity-attribute-removal loss are jointly optimized.
    Results. The proposed MFVAN method is evaluated on the SMIC, CASME II, SAMM, and 3DB-Combined datasets and achieves state-of-the-art performance. The experimental results show that focusing on local regions at multiple scales contributes to micro-expression recognition.
    Discussion. The proposed MFVAN model is the first to combine image generation with visual attention mechanisms to address the combined challenge of individual identity attribute interference and low-intensity facial muscle movements. The model also reveals the impact of individual attributes on the localization of local regions of interest. The experimental results show that a multi-scale fusion visual attention network contributes to micro-expression recognition.
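The masking step described in the Methods section can be sketched as follows. This is a minimal illustration assuming per-region attention weights in [0, 1]; the threshold of 0.5 and all variable names are assumptions, not the MFVAN definition.

```python
import numpy as np

def mask_redundant(feats, attn, threshold=0.5):
    """Suppress feature-map regions whose attention weight is below the
    threshold, keeping only high-attention (salient) regions weighted
    by their attention values."""
    mask = (attn >= threshold).astype(feats.dtype)  # 1 where salient, 0 where redundant
    return feats * attn * mask

feats = np.ones((2, 2))            # toy feature map at one scale
attn = np.array([[0.9, 0.2],
                 [0.6, 0.1]])      # learned attention weights per region
masked = mask_redundant(feats, attn)
# low-attention regions (0.2, 0.1) are zeroed; the rest keep their weighting
```

In a multi-scale network this masking would be applied at each scale, after which the surviving high-attention local features are fused for classification.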

    Automatic inference of latent emotion from spontaneous facial micro-expressions

    Emotional states exert a profound influence on individuals' overall well-being, impacting them both physically and psychologically. Accurate recognition and comprehension of human emotions represent a crucial area of scientific exploration. Facial expressions, vocal cues, body language, and physiological responses provide valuable insights into an individual's emotional state, with facial expressions being universally recognised as dependable indicators of emotions. This thesis centres around three vital research aspects concerning the automated inference of latent emotions from spontaneous facial micro-expressions, seeking to enhance and refine our understanding of this complex domain. Firstly, the research aims to detect and analyse activated Action Units (AUs) during the occurrence of micro-expressions. AUs correspond to facial muscle movements. Although previous studies have established links between AUs and conventional facial expressions, no such connections have been explored for micro-expressions. Therefore, this thesis develops computer vision techniques to automatically detect activated AUs in micro-expressions, bridging a gap in existing studies. Secondly, the study explores the evolution of micro-expression recognition techniques, ranging from early handcrafted feature-based approaches to modern deep-learning methods. These approaches have significantly contributed to the field of automatic emotion recognition. However, existing methods primarily focus on capturing local spatial relationships, neglecting global relationships between different facial regions. To address this limitation, a novel third-generation architecture is proposed. This architecture can concurrently capture both short and long-range spatiotemporal relationships in micro-expression data, aiming to enhance the accuracy of automatic emotion recognition and improve our understanding of micro-expressions. 
Lastly, the thesis investigates the integration of multimodal signals to enhance emotion recognition accuracy. Depth information complements conventional RGB data by providing enhanced spatial features for analysis, while the integration of physiological signals with facial micro-expressions improves emotion discrimination. By incorporating multimodal data, the objective is to enhance machines' understanding of latent emotions and to improve latent-emotion recognition accuracy in spontaneous micro-expression analysis.
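A common baseline for the multimodal integration described above is feature-level fusion by concatenation. The sketch below is a generic illustration under assumed feature dimensions; the thesis's actual fusion scheme and feature sizes may differ.

```python
import numpy as np

def fuse_features(rgb_feat, depth_feat, physio_feat):
    """Feature-level fusion sketch: concatenate per-modality feature
    vectors into a single descriptor for a downstream classifier."""
    return np.concatenate([rgb_feat, depth_feat, physio_feat])

# Assumed per-modality feature dimensions, for illustration only.
rgb = np.random.rand(128)     # RGB appearance features
depth = np.random.rand(64)    # depth-derived spatial features
physio = np.random.rand(16)   # physiological-signal features
fused = fuse_features(rgb, depth, physio)  # shape (208,)
```

More elaborate schemes (attention-weighted or decision-level fusion) build on the same idea of combining complementary modality-specific evidence.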