Distinguishing Posed and Spontaneous Smiles by Facial Dynamics
A smile is one of the key elements in identifying the emotions and present state of mind of an individual. In this work, we propose a cluster of approaches to classify posed and spontaneous smiles using deep convolutional neural network (CNN) face features, local phase quantization (LPQ), dense optical flow and histogram of oriented gradients (HOG). Eulerian Video Magnification (EVM) is used for micro-expression smile amplification, along with three normalization procedures for distinguishing posed and spontaneous smiles. Although the deep CNN face model is trained with a large number of face images, HOG features outperform this model on the overall face smile classification task. Using EVM to amplify micro-expressions did not have a significant impact on classification accuracy, while normalizing facial features improved it. Unlike many manual or semi-automatic methodologies, our approach aims to automatically classify all smiles into either 'spontaneous' or 'posed' categories using support vector machines (SVMs). Experimental results on the large UvA-NEMO smile database are promising compared to other relevant methods.
Comment: 16 pages, 8 figures, ACCV 2016, Second Workshop on Spontaneous Facial Behavior Analysis
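To make the pipeline concrete, here is a minimal sketch of the HOG-plus-SVM stage the abstract describes, assuming pre-aligned grayscale face crops; the data, image size and HOG parameters are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch (assumptions, not the paper's exact settings): HOG features
# plus a linear SVM for posed-vs-spontaneous smile classification, given
# pre-aligned grayscale face crops.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def hog_features(face, size=(128, 128)):
    """Resize one aligned face crop and compute its HOG descriptor."""
    face = resize(face, size, anti_aliasing=True)
    return hog(face, orientations=9, pixels_per_cell=(16, 16),
               cells_per_block=(2, 2), feature_vector=True)

# Placeholder data: random "faces" with alternating posed (0) / spontaneous (1) labels.
rng = np.random.default_rng(0)
faces = [rng.random((128, 128)) for _ in range(8)]
labels = [0, 1] * 4

X = np.stack([hog_features(f) for f in faces])
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X, labels)
print(clf.predict(X[:2]))  # per-frame predictions on the training crops
```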
Less is More: Facial Landmarks can Recognize a Spontaneous Smile
Smile veracity classification is a task of interpreting social interactions. Broadly, it distinguishes between spontaneous and posed smiles. Previous approaches used hand-engineered features from facial landmarks or considered raw smile videos in an end-to-end manner to perform smile classification tasks. Feature-based methods require intervention from human experts on feature engineering and heavy pre-processing steps. On the contrary, raw smile video inputs fed into end-to-end models bring more automation to the process, at the cost of considering many redundant facial features (beyond landmark locations) that are mainly irrelevant to smile veracity classification. It remains unclear how to establish discriminative features from landmarks in an end-to-end manner. We present the MeshSmileNet framework, a transformer architecture, to address the above limitations. To eliminate redundant facial features, our landmark input is extracted from Attention Mesh, a pre-trained landmark detector. To discover discriminative features, we consider the relativity and trajectory of the landmarks. For relativity, we aggregate facial landmarks that conceptually form a curve at each frame to establish local spatial features. For trajectory, we estimate the movement of landmark-composed features across time with a self-attention mechanism, which captures pairwise dependencies along the trajectory of the same landmark. This idea allows us to achieve state-of-the-art performance on the UVA-NEMO, BBC, MMI Facial Expression, and SPOS datasets.
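As an illustration of the trajectory idea (not the authors' MeshSmileNet implementation), the sketch below applies temporal self-attention independently to each landmark's (x, y) trajectory; the embedding size, head count and pooling scheme are assumptions.

```python
# Illustrative sketch (not the authors' MeshSmileNet): temporal self-attention
# applied independently to each landmark's trajectory, so every time step of a
# landmark attends to the same landmark's positions at other frames.
import torch
import torch.nn as nn

class TrajectoryAttention(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.embed = nn.Linear(2, dim)   # (x, y) coordinate -> token
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 2)    # spontaneous-vs-posed logits

    def forward(self, landmarks):
        # landmarks: (batch, frames, points, 2) normalised coordinates
        b, t, p, _ = landmarks.shape
        x = self.embed(landmarks)                        # (b, t, p, dim)
        x = x.permute(0, 2, 1, 3).reshape(b * p, t, -1)  # one sequence per landmark
        x, _ = self.attn(x, x, x)                        # attention along time
        x = x.reshape(b, p, t, -1).mean(dim=(1, 2))      # pool landmarks and frames
        return self.head(x)

model = TrajectoryAttention()
clip = torch.randn(2, 30, 468, 2)  # 468 points, as produced by Attention Mesh
print(model(clip).shape)           # torch.Size([2, 2])
```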
Automatic analysis of facial actions: a survey
As one of the most comprehensive and objective ways to describe facial expressions, the Facial Action Coding System (FACS) has recently received significant attention. Over the past 30 years, extensive research has been conducted by psychologists and neuroscientists on various aspects of facial expression analysis using FACS. Automating FACS coding would make this research faster and more widely applicable, opening up new avenues to understanding how we communicate through facial expressions. Such an automated process can also potentially increase the reliability, precision and temporal resolution of coding. This paper provides a comprehensive survey of research into machine analysis of facial actions. We systematically review all components of such systems: pre-processing, feature extraction and machine coding of facial actions. In addition, the existing FACS-coded facial expression databases are summarised. Finally, challenges that have to be addressed to make automatic facial action analysis applicable in real-life situations are extensively discussed. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the future of machine recognition of facial actions: what are the challenges and opportunities that researchers in the field face?
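As a concrete illustration of the coding stage the survey reviews, AU coding is often cast as independent binary decisions per action unit; the sketch below shows that one-classifier-per-AU structure on synthetic stand-in features. The AU subset and feature dimensionality are illustrative assumptions.

```python
# Hedged sketch of the coding stage: one binary classifier per action unit
# (multi-label "is AU k active?"), on synthetic stand-in features. The AU
# subset and 32-D feature vectors are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

AUS = ["AU1", "AU4", "AU6", "AU12"]  # inner brow raiser, brow lowerer, cheek raiser, lip corner puller
rng = np.random.default_rng(0)
X = rng.random((40, 32))                 # stand-in for extracted appearance/geometry features
Y = rng.integers(0, 2, (40, len(AUS)))   # one binary activation column per AU

coder = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(dict(zip(AUS, coder.predict(X[:1])[0])))  # predicted AU activations for one frame
```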
Towards spatial and temporal analysis of facial expressions in 3D data
Facial expressions are one of the most important means for communication of emotions and meaning. They are used to clarify and give emphasis, to express intentions, and form a crucial part of any human interaction. The ability to automatically recognise and analyse expressions could therefore prove to be vital in human behaviour understanding, which has applications in a number of areas such as psychology, medicine and security.
3D and 4D (3D+time) facial expression analysis is an expanding field, providing the ability to deal with problems inherent to 2D images, such as out-of-plane motion, head pose variation, and illumination issues. Analysis of data of this kind requires extending successful approaches applied to the 2D problem, as well as the development of new techniques. The recent introduction of new databases containing appropriate expression data, recorded in 3D or 4D, has allowed research into this exciting area for the first time.
This thesis develops a number of techniques, both in 2D and 3D, that build towards a complete system for the analysis of 4D expressions. Suitable feature types, designed by employing binary pattern methods, are developed for the analysis of 3D facial geometry data. The full dynamics of 4D expressions are modelled, through a system reliant on motion-based features, to demonstrate how the different components of the expression (neutral-onset-apex-offset) can be distinguished and harnessed. Further, the spatial structure of expressions is exploited to improve the estimation of expression component intensity in 2D videos. Finally, it is discussed how this latter step could be extended to 3D facial expression analysis and combined with temporal analysis. Thus, it is demonstrated that both spatial and temporal information, when combined with appropriate 3D features, are critical in the analysis of 4D expression data.
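As a small illustration of the binary-pattern feature family the thesis builds on, the sketch below computes a uniform local binary pattern histogram over a synthetic face depth map; treating 3D geometry as a depth image is a common simplification assumed here, not the thesis's exact method.

```python
# Small illustration of the binary-pattern feature family: a uniform local
# binary pattern histogram over a synthetic face depth map. Treating 3D
# geometry as a depth image is a common simplification assumed here.
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(0)
depth = (rng.random((96, 96)) * 255).astype(np.uint8)  # stand-in depth map

P, R = 8, 1                                    # 8 neighbours at radius 1
codes = local_binary_pattern(depth, P, R, method="uniform")
hist, _ = np.histogram(codes, bins=np.arange(P + 3), density=True)
print(hist.round(3))                           # P + 2 uniform-pattern bins
```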
Differentiating Between Spontaneous and Posed Facial Expression using Inception V4
Master's thesis, Information and Communication Technology IKT590, University of Agder, 2018.
This thesis proposes a way to simplify solutions for spontaneous and posed facial expression analysis and make them more efficient. Traditional approaches have used hand-crafted features and two image frames to differentiate between spontaneous and posed facial expressions. The solution aims to be as flexible as possible and introduces two models to differentiate between posed and spontaneous facial expressions.
We introduce Inception V4 as an algorithm to solve this task. The results indicate that Inception V4 may be too deep and unable to differentiate between spontaneous and posed facial expressions accurately. A shallow CNN model is also introduced, and it performs better than the Inception V4 model. Neither of the two comes close to state-of-the-art results. This may indicate that, to differentiate between spontaneous and posed facial expressions, the difference between the onset and apex frames of an expression is needed as input. This thesis also suggests an alternative algorithm based on our findings. For further work, an algorithm that is not as deep as Inception V4 is needed. However, by using parts of the Inception V4 architecture, we may be able to capture facial features better.
The task of differentiating between spontaneous emotion and posed emotion has also been investigated; however, the results do not show great promise. The task has no state-of-the-art results to compare our approach with. Our models, although lacking in performance, do seem able to capture relevant facial features from the dataset.
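A minimal sketch of the input encoding the thesis suggests: feed the apex-minus-onset difference image to a shallow CNN. The layer sizes below are illustrative, not the thesis's architecture.

```python
# Minimal sketch of the suggested input encoding: classify the difference
# between apex and onset frames with a shallow CNN. Layer sizes are
# illustrative, not the thesis's architecture.
import torch
import torch.nn as nn

shallow_cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 2),       # posed-vs-spontaneous logits
)

onset = torch.rand(4, 1, 64, 64)      # grayscale onset frames
apex = torch.rand(4, 1, 64, 64)       # grayscale apex frames
logits = shallow_cnn(apex - onset)    # difference image as the only input
print(logits.shape)                   # torch.Size([4, 2])
```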
Facial expression recognition in dynamic sequences: An integrated approach
Automatic facial expression analysis aims to analyse human facial expressions and classify them into discrete categories. Methods based on existing work rely on extracting information from video sequences and employ either some form of subjective thresholding of dynamic information or attempt to identify the particular individual frames in which the expected behaviour occurs. These methods are inefficient: they require additional subjective information or tedious manual work, or they fail to take advantage of the information contained in the dynamic signature of facial movements for the task of expression recognition.
In this paper, a novel framework is proposed for automatic facial expression analysis which extracts salient information from video sequences but does not rely on any subjective preprocessing or additional user-supplied information to select frames with peak expressions. The experimental framework demonstrates that the proposed method outperforms static expression recognition systems in terms of recognition rate. The approach does not rely on action units (AUs) and therefore eliminates errors which would otherwise propagate to the final result due to incorrect initial identification of AUs. The proposed framework explores a parametric space of over 300 dimensions and is tested with six state-of-the-art machine learning techniques. Such robust and extensive experimentation provides an important foundation for assessing the performance of future work. A further contribution of the paper is a user study, conducted in order to investigate the correlation between human cognitive systems and the proposed framework for the understanding of human emotion classification and the reliability of public databases.
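The evaluation pattern described, testing one feature space against several learners, can be sketched as follows; the three classifiers and the synthetic 300-D features are placeholders rather than the paper's actual six techniques and data.

```python
# Hedged sketch of the evaluation pattern: score several standard classifiers
# on one dynamic-feature matrix with cross-validation. The three learners and
# the synthetic 300-D features stand in for the paper's six techniques and data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((120, 300))    # one ~300-D dynamic feature vector per sequence
y = rng.integers(0, 6, 120)   # six discrete expression categories

for name, clf in [("SVM", SVC()),
                  ("Random forest", RandomForestClassifier()),
                  ("k-NN", KNeighborsClassifier())]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: {scores.mean():.2f} +/- {scores.std():.2f}")
```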
Short and long range relation based spatio-temporal transformer for micro-expression recognition
The authors would like to thank the China Scholarship Council – University of St Andrews Scholarships (No. 201908060250) for funding L. Zhang's PhD. This work is funded by the National Key Research and Development Project of China under Grant No. 2019YFB1312000, the National Natural Science Foundation of China under Grant No. 62076195, and the Fundamental Research Funds for the Central Universities under Grant No. AUGA5710011522.
Being spontaneous, micro-expressions are useful in the inference of a person's true emotions even if an attempt is made to conceal them. Due to their short duration and low intensity, the recognition of micro-expressions is a difficult task in affective computing. Early work based on handcrafted spatio-temporal features, which showed some promise, has recently been superseded by different deep learning approaches which now compete for state-of-the-art performance. Nevertheless, the problem of capturing both local and global spatio-temporal patterns remains challenging. To this end, we propose a novel spatio-temporal transformer architecture; to the best of our knowledge, this is the first purely transformer-based approach (i.e. free of any convolutional network use) for micro-expression recognition. The architecture comprises a spatial encoder which learns spatial patterns, a temporal aggregator for temporal dimension analysis, and a classification head. A comprehensive evaluation on three widely used spontaneous micro-expression data sets, namely SMIC-HS, CASME II and SAMM, shows that the proposed approach consistently outperforms the state of the art, and is the first framework in the published literature on micro-expression recognition to achieve an unweighted F1-score greater than 0.9 on any of the aforementioned data sets.
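A hedged PyTorch sketch of the three-stage layout the abstract names (spatial encoder, temporal aggregator, classification head), built from stock transformer encoder layers; the token shapes, depths and three-class output are assumptions, not the authors' model.

```python
# Hedged PyTorch sketch of the three-stage layout named in the abstract
# (spatial encoder, temporal aggregator, classification head), using stock
# transformer encoder layers; shapes, depths and the 3-class head are assumptions.
import torch
import torch.nn as nn

class STTransformer(nn.Module):
    def __init__(self, dim=64, classes=3):
        super().__init__()
        layer = lambda: nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.spatial = nn.TransformerEncoder(layer(), num_layers=2)   # patches within a frame
        self.temporal = nn.TransformerEncoder(layer(), num_layers=2)  # frames across time
        self.head = nn.Linear(dim, classes)

    def forward(self, x):
        # x: (batch, frames, patches, dim) pre-embedded patch tokens
        b, t, p, d = x.shape
        x = self.spatial(x.reshape(b * t, p, d)).mean(1)  # pool patches per frame
        x = self.temporal(x.reshape(b, t, d)).mean(1)     # pool over frames
        return self.head(x)

model = STTransformer()
print(model(torch.randn(2, 16, 49, 64)).shape)  # torch.Size([2, 3])
```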
UrbanDiary - a tracking project
This working paper investigates aspects of time in an urban environment, specifically the cycles and routines of everyday life in the city. As part of the UrbanDiary project (urbantick.blogspot.com), we present a preliminary study that traces citizens' spatial habits through their individual movement, utilising GPS devices with the aim of capturing the beat and rhythm of the city. The data collected include time and location, used to visualise individual activity, along with a series of personal statements on how individuals “use” and experience the city. In this paper, the intent is to explore the context of the UrbanDiary project as well as to examine the methodology and technical aspects of tracking, with a focus on the comparison of different visualisation techniques. We conclude with a visualisation of the collected data, in which the aspect of time is developed and explored so that we might outline a new approach to visualising the city as a collective, constantly renewed space.
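As a small example of the kind of space-time visualisation such tracking data supports, the sketch below plots a synthetic GPS trace coloured by hour of day; it is illustrative only and uses no UrbanDiary data.

```python
# Illustrative only (no UrbanDiary data): plot a synthetic GPS trace coloured
# by hour of day, one simple way to put time into the map.
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
hours = np.linspace(0, 24, 200)
lon = -0.13 + 0.005 * np.cumsum(rng.normal(size=200))  # random walk near London
lat = 51.52 + 0.005 * np.cumsum(rng.normal(size=200))

sc = plt.scatter(lon, lat, c=hours, cmap="viridis", s=8)
plt.colorbar(sc, label="hour of day")
plt.xlabel("longitude")
plt.ylabel("latitude")
plt.title("One day of movement (synthetic trace)")
plt.show()
```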