Modelling, Classification and Synthesis of Facial Expressions
The field of computer vision endeavours to develop automatic approaches to the
interpretation of images from the real world. Over the past few decades researchers
within this field have created systems specifically for the automatic analysis of facial
expression. The most successful of these approaches draw on tools from behavioural
science. In this chapter we examine facial expression analysis from both a behavioural
science and a computer vision perspective. First we will provide details of the principal
approach used in behavioural science to analyze facial expressions. This will include an
overview of the evolution of facial expression analysis, where we introduce the field of facial
expression analysis with Darwin’s initial findings (Darwin, 1872). We then go on to show
how his findings were confirmed nearly 100 years later by Ekman et al. (Ekman et al., 1969).
Following on from this we provide details of recent works investigating the appearance and
dynamics of facial expressions
Emotion recognition in simulated social interactions
Social context plays an important role in everyday emotional interactions, and others' faces often provide contextual cues in social situations. Investigating this complex social process is a challenge that can be addressed with the use of computer-generated facial expressions. In the current research, we use synthesized facial expressions to investigate the influence of socioaffective inferential mechanisms on the recognition of social emotions. Participants judged blends of facial expressions of shame-sadness, or of anger-disgust, in a target avatar face presented at the center of a screen while a contextual avatar face expressed an emotion (disgust, contempt, sadness) or remained neutral. The dynamics of the facial expressions and the head/gaze movements of the two avatars were manipulated in order to create an interaction in which the two avatars shared eye gaze only in the social interaction condition. Results of Experiment 1 revealed that when the avatars engaged in social interaction, target expression blends of shame and sadness were perceived as expressing more shame if the contextual face expressed disgust and more sadness when the contextual face expressed sadness. Interestingly, perceptions of shame were not enhanced when the contextual face expressed contempt. The latter finding is probably attributable to the low recognition rates for the expression of contempt observed in Experiment 2
The influence of dynamics and speech on understanding humanoid facial expressions
Human communication relies mostly on nonverbal signals expressed through body language. Facial expressions, in particular, convey emotional information that allows people involved in social interactions to judge one another's emotional states and to adjust their behavior appropriately. Early studies investigating the recognition of facial expressions were based on static stimuli. However, facial expressions are rarely static, especially in everyday social interactions. Therefore, it has been hypothesized that the dynamics inherent in a facial expression could be fundamental in understanding its meaning. In addition, it has been demonstrated that nonlinguistic and linguistic information can help reinforce the meaning of a facial expression, making it easier to recognize. Nevertheless, few studies have been performed on realistic humanoid robots. This experimental work aimed at demonstrating the human-like expressive capability of a humanoid robot by examining whether motion and vocal content influenced the perception of its facial expressions. The first part of the experiment studied the recognition of two kinds of stimuli related to the six basic expressions (i.e. anger, disgust, fear, happiness, sadness, and surprise): static stimuli, that is, photographs, and dynamic stimuli, that is, video recordings. The second and third parts focused on comparing the same six basic expressions performed by a virtual avatar and by a physical robot under three different conditions: (1) muted facial expressions, (2) facial expressions with nonlinguistic vocalizations, and (3) facial expressions with an emotionally neutral verbal sentence. The results show that static stimuli performed by a human being and by the robot were more ambiguous than the corresponding dynamic stimuli with which motion and vocalization were associated. This hypothesis has also been investigated with a 3-dimensional replica of the physical robot, demonstrating that even in the case of a virtual avatar, dynamics and vocalization improve the capability to convey emotion
Statistical modelling for facial expression dynamics
PhD
Facial expressions are one of the most powerful and fastest means of relaying emotions between humans.
The ability to capture, understand and mimic those emotions and their underlying dynamics
in the synthetic counterpart is a challenging task because of the complexity of human emotions, different
ways of conveying them, non-linearities caused by facial feature and head motion, and the
ever critical eye of the viewer. This thesis sets out to address some of the limitations of existing
techniques by investigating three components of an expression modelling and parameterisation framework:
(1) Feature and expression manifold representation, (2) Pose estimation, and (3) Expression
dynamics modelling and their parameterisation for the purpose of driving a synthetic head avatar.
First, we introduce a hierarchical representation based on the Point Distribution Model (PDM).
Holistic representations imply that non-linearities caused by the motion of facial features, and
intra-feature correlations are implicitly embedded and hence have to be accounted for in the resulting
expression space. Also such representations require large training datasets to account for all possible
variations. To address those shortcomings, and to provide a basis for learning more subtle, localised
variations, our representation consists of a tree-like structure where a holistic root component is decomposed
into leaves containing the jaw outline, each of the eyes and eyebrows and the mouth. Each
of the hierarchical components is modelled according to its intrinsic functionality, rather than the
final, holistic expression label.
Secondly, we introduce a statistical approach for capturing an underlying low-dimension expression
manifold by utilising components of the previously defined hierarchical representation. As
Principal Component Analysis (PCA) based approaches cannot reliably capture variations caused by
large facial feature changes because of their linear nature, the underlying dynamics manifold for each
of the hierarchical components is modelled using a Hierarchical Latent Variable Model (HLVM) approach.
Whilst retaining PCA properties, such a model introduces a probability density model which
can deal with missing or incomplete data and allows discovery of internal within-cluster structures.
All of the model parameters and underlying density model are automatically estimated during the
training stage. We investigate the usefulness of such a model on larger and unseen datasets.
Thirdly, we extend the HLVM concept to pose estimation to address the non-linear
shape deformations and definition of the plausible pose space caused by large head motion. Since
our head rarely stays still, and its movements are intrinsically connected with the way we perceive
and understand the expressions, pose information is an integral part of their dynamics. The proposed
approach integrates into our existing hierarchical representation model. It is learned using a sparse and
discretely sampled training dataset, and generalises to a larger and continuous view-sphere.
Finally, we introduce a framework that models and extracts expression dynamics. In existing
frameworks, explicit definition of expression intensity and pose information is often overlooked,
although usually implicitly embedded in the underlying representation. We investigate modelling
of the expression dynamics based on use of static information only, and focus on its sufficiency
for the task at hand. We compare a rule-based method that utilises the existing latent structure and
provides a fusion of different components with holistic and Bayesian Network (BN) approaches. An
Active Appearance Model (AAM) based tracker is used to extract relevant information from input
sequences. Such information is subsequently used to define the parametric structure of the underlying
expression dynamics. We demonstrate that such information can be utilised to animate a synthetic
head avatar.
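
As a rough illustration of the Point Distribution Model (PDM) and the linear PCA baseline mentioned in this abstract, the sketch below fits a mean shape plus a few principal modes to synthetic landmark data with NumPy. The synthetic data, the function names (`fit_pdm`, `project`, `reconstruct`) and the choice of two modes are assumptions made for illustration. It does not implement the thesis's hierarchical representation or its HLVM.

```python
import numpy as np

def fit_pdm(shapes, n_modes):
    """Fit a linear PDM: mean shape plus the top principal modes.

    shapes: array of shape (n_samples, 2 * n_landmarks); each row is one
    flattened set of (x, y) landmark coordinates.
    """
    mean = shapes.mean(axis=0)
    centered = shapes - mean
    # SVD of the centered data; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]
    variances = (s[:n_modes] ** 2) / (shapes.shape[0] - 1)
    return mean, modes, variances

def project(shape, mean, modes):
    """Shape parameters b such that shape is approximately mean + b @ modes."""
    return modes @ (shape - mean)

def reconstruct(params, mean, modes):
    """Rebuild a landmark vector from its shape parameters."""
    return mean + params @ modes

# Synthetic landmarks (assumption): two true modes of variation plus noise.
rng = np.random.default_rng(0)
n_samples, n_landmarks = 200, 10
base = rng.normal(size=2 * n_landmarks)
directions = rng.normal(size=(2, 2 * n_landmarks))
coeffs = rng.normal(size=(n_samples, 2))
noise = 0.01 * rng.normal(size=(n_samples, 2 * n_landmarks))
shapes = base + coeffs @ directions + noise

mean, modes, variances = fit_pdm(shapes, n_modes=2)
b = project(shapes[0], mean, modes)
approx = reconstruct(b, mean, modes)
error = np.linalg.norm(approx - shapes[0])
print(modes.shape, error)
```

Because the model is linear, it can only represent shapes on a flat subspace around the mean, which is the limitation the abstract cites as motivation for a latent variable model.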
Beta event-related desynchronization as an index of individual differences in processing human facial expression: further investigations of autistic traits in typically developing adults
The human mirror neuron system (hMNS) has been associated with various forms of social cognition and affective processing including vicarious experience. It has also been proposed that a faulty hMNS may underlie some of the deficits seen in the autism spectrum disorders (ASDs). In the present study we set out to investigate whether emotional facial expressions could modulate a putative EEG index of hMNS activation (mu suppression) and if so, would this differ according to the individual level of autistic traits [high versus low Autism Spectrum Quotient (AQ) score]. Participants were presented with 3 s films of actors opening and closing their hands (classic hMNS mu-suppression protocol) while simultaneously wearing happy, angry, or neutral expressions. Mu-suppression was measured in the alpha and low beta bands. The low AQ group displayed greater low beta event-related desynchronization (ERD) to both angry and neutral expressions. The high AQ group displayed greater low beta ERD to angry than to happy expressions. There was also significantly more low beta ERD to happy faces for the low than for the high AQ group. In conclusion, an interesting interaction between AQ group and emotional expression revealed that hMNS activation can be modulated by emotional facial expressions and that this is differentiated according to individual differences in the level of autistic traits. The EEG index of hMNS activation (mu suppression) seems to be a sensitive measure of the variability in facial processing in typically developing individuals with high and low self-reported traits of autism
Investigating facial animation production through artistic inquiry
Studies into dynamic facial expressions tend to make use of experimental methods based on objectively manipulated stimuli. New techniques for displaying increasingly realistic facial movement and methods of measuring observer responses are typical of computer animation and psychology facial expression research. However, few projects focus on the artistic nature of performance production. Instead, most concentrate on the naturalistic appearance of posed or acted expressions. In this paper, the authors discuss a method for exploring the creative process of emotional facial expression animation, and ask whether anything can be learned about authentic dynamic expressions through artistic inquiry
Enriched Long-term Recurrent Convolutional Network for Facial Micro-Expression Recognition
Facial micro-expression (ME) recognition has posed a huge challenge to
researchers for its subtlety in motion and limited databases. Recently,
handcrafted techniques have achieved superior performance in micro-expression
recognition but at the cost of domain specificity and cumbersome parametric
tunings. In this paper, we propose an Enriched Long-term Recurrent
Convolutional Network (ELRCN) that first encodes each micro-expression frame
into a feature vector through CNN module(s), then predicts the micro-expression
by passing the feature vector through a Long Short-term Memory (LSTM) module.
The framework contains two different network variants: (1) Channel-wise
stacking of input data for spatial enrichment, (2) Feature-wise stacking of
features for temporal enrichment. We demonstrate that the proposed approach is
able to achieve reasonably good performance, without data augmentation. In
addition, we also present ablation studies conducted on the framework and
visualizations of what CNN "sees" when predicting the micro-expression classes.
Comment: Published in Micro-Expression Grand Challenge 2018, Workshop of 13th
IEEE Facial & Gesture 201
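
The two enrichment variants named in this abstract differ in where the concatenation happens: on the input channels before the CNN, or on the feature vectors before the LSTM. The NumPy sketch below shows only that shape bookkeeping. The array sizes, the second input modality (labelled here as optical flow) and the `toy_encoder` stand-in for a CNN are assumptions for illustration, not the paper's configuration, and no LSTM is included.

```python
import numpy as np

def channel_wise_stack(inputs):
    """Spatial enrichment: stack per-frame inputs along the channel axis.

    inputs: list of arrays shaped (frames, height, width, channels).
    """
    return np.concatenate(inputs, axis=-1)

def feature_wise_stack(features):
    """Temporal enrichment: concatenate per-frame feature vectors.

    features: list of arrays shaped (frames, feature_dim).
    """
    return np.concatenate(features, axis=-1)

def toy_encoder(frames, feature_dim, seed):
    """Hypothetical stand-in for a CNN: global average pooling followed by
    a fixed random linear projection."""
    rng = np.random.default_rng(seed)
    pooled = frames.mean(axis=(1, 2))  # (frames, channels)
    weights = rng.normal(size=(pooled.shape[1], feature_dim))
    return pooled @ weights            # (frames, feature_dim)

n_frames, height, width = 8, 32, 32
rng = np.random.default_rng(0)
raw = rng.random((n_frames, height, width, 3))   # assumed RGB frames
flow = rng.random((n_frames, height, width, 2))  # assumed second modality

# Variant 1: stack the inputs, then encode once per frame.
stacked_input = channel_wise_stack([raw, flow])
spatial_features = toy_encoder(stacked_input, 16, seed=1)

# Variant 2: encode each input separately, then stack the features.
temporal_features = feature_wise_stack(
    [toy_encoder(raw, 16, seed=2), toy_encoder(flow, 16, seed=3)]
)
print(stacked_input.shape, spatial_features.shape, temporal_features.shape)
```

In either variant, the resulting per-frame feature sequence is what a recurrent module would consume next.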
Every Smile is Unique: Landmark-Guided Diverse Smile Generation
Each smile is unique: one person surely smiles in different ways (e.g.,
closing/opening the eyes or mouth). Given one input image of a neutral face,
can we generate multiple smile videos with distinctive characteristics? To
tackle this one-to-many video generation problem, we propose a novel deep
learning architecture named Conditional Multi-Mode Network (CMM-Net). To better
encode the dynamics of facial expressions, CMM-Net explicitly exploits facial
landmarks for generating smile sequences. Specifically, a variational
auto-encoder is used to learn a facial landmark embedding. This single
embedding is then exploited by a conditional recurrent network which generates
a landmark embedding sequence conditioned on a specific expression (e.g.,
spontaneous smile). Next, the generated landmark embeddings are fed into a
multi-mode recurrent landmark generator, producing a set of landmark sequences
still associated to the given smile class but clearly distinct from each other.
Finally, these landmark sequences are translated into face videos. Our
experimental results demonstrate the effectiveness of our CMM-Net in generating
realistic videos of multiple smile expressions.
Comment: Accepted as a poster in Conference on Computer Vision and Pattern
Recognition (CVPR), 201
The influence of angry customer outbursts on service providers’ facial displays and affective states
This article explores the existence and extent of emotional
contagion, as measured by facial displays and
reported affective states, in a service failure event. Using
video vignettes of customers complaining about a service
failure as stimulus material, the authors measured the
facial displays and affective states of service providers as
proxies for emotional contagion. Following a two-step
approach, service providers’ facial expressions were first
recorded and assessed, revealing that service providers’
facial displays matched those of the angry consumer.
Second, a mixed ANOVA revealed service providers
reported stronger negative affective states after exposure
to an angry complaint than prior to exposure. The results
demonstrated that during a complaint situation, angry
outbursts by consumers can initiate the emotional contagion
process, and service providers are susceptible to
“catching” consumer anger through emotional contagion.
Implications for complaint management and future
research are discussed