83 research outputs found
Copula Ordinal Regression for Joint Estimation of Facial Action Unit Intensity
Joint modeling of the intensity of facial action units (AUs) from face images is challenging due to the large number of AUs (30+) and their intensity levels (6). This is in part due to the lack of suitable models that can efficiently handle such a large number of outputs/classes simultaneously, but also due to the lack of labelled target data. For this reason, majority of the methods proposed so far resort to independent classifiers for the AU intensity. This is suboptimal for at least two reasons: the facial appearance of some AUs changes depending on the intensity of other AUs, and some AUs co-occur more often than others. Encoding this is expected to improve the estimation of target AU intensities, especially in the case of noisy image features, head-pose variations and imbalanced training data. To this end, we introduce a novel modeling framework, Copula Ordinal Regression (COR), that leverages the power of copula functions and CRFs, to detangle the probabilistic modeling of AU dependencies from the marginal modeling of the AU intensity. Consequently, the COR model achieves the joint learning and inference of intensities of multiple AUs, while being computationally tractable. We show on two challenging datasets of naturalistic facial expressions that the proposed approach consistently outperforms (i) independent modeling of AU intensities, and (ii) the state-ofthe-art approach for the target task
DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
Human face exhibits an inherent hierarchy in its representations (i.e.,
holistic facial expressions can be encoded via a set of facial action units
(AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown
great results in unsupervised extraction of hierarchical latent representations
from large amounts of image data, while being robust to noise and other
undesired artifacts. Potentially, this makes VAEs a suitable approach for
learning facial features for AU intensity estimation. Yet, most existing
VAE-based methods apply classifiers learned separately from the encoded
features. By contrast, the non-parametric (probabilistic) approaches, such as
Gaussian Processes (GPs), typically outperform their parametric counterparts,
but cannot deal easily with large amounts of data. To this end, we propose a
novel VAE semi-parametric modeling framework, named DeepCoder, which combines
the modeling power of parametric (convolutional) and nonparametric (ordinal
GPs) VAEs, for joint learning of (1) latent representations at multiple levels
in a task hierarchy1, and (2) classification of multiple ordinal outputs. We
show on benchmark datasets for AU intensity estimation that the proposed
DeepCoder outperforms the state-of-the-art approaches, and related VAEs and
deep learning models.Comment: ICCV 2017 - accepte
FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation
Facial expression analysis based on machine learning requires large number of
well-annotated data to reflect different changes in facial motion. Publicly
available datasets truly help to accelerate research in this area by providing
a benchmark resource, but all of these datasets, to the best of our knowledge,
are limited to rough annotations for action units, including only their
absence, presence, or a five-level intensity according to the Facial Action
Coding System. To meet the need for videos labeled in great detail, we present
a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D
Facial Animation. One hundred and twenty-two participants, including children,
young adults and elderly people, were recorded in real-world conditions. In
addition, 99,356 frames were manually labeled using Expression Quantitative
Tool developed by us to quantify 9 symmetrical FACS action units, 10
asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action
descriptors and 2 asymmetrical FACS action descriptors, and each action unit or
action descriptor is well-annotated with a floating point number between 0 and
1. To provide a baseline for use in future research, a benchmark for the
regression of action unit values based on Convolutional Neural Networks are
presented. We also demonstrate the potential of our FEAFA dataset for 3D facial
animation. Almost all state-of-the-art algorithms for facial animation are
achieved based on 3D face reconstruction. We hence propose a novel method that
drives virtual characters only based on action unit value regression of the 2D
video frames of source actors.Comment: 9 pages, 7 figure
Deep learning model for detection of pain intensity from facial expression
Many people who are suffering from a chronic pain face pe- riods of acute pain and resulting problems during their illness and ade- quate reporting of symptoms is necessary for treatment. Some patients have difficulties in adequately alerting caregivers to their pain or describ- ing the intensity which can impact on effective treatment. Pain and its intensity can be noticeable in ones face. Movements in facial muscles can depict ones current emotional state. Machine learning algorithms can detect pain intensity from facial expressions. The algorithm can ex- tract and classify facial expression of pain among patients. In this paper, we propose a new deep learning model for detection of pain intensity from facial expressions. This automatic pain detection system may help clinicians to detect pain and its intensity in patients and by doing this healthcare organizations may have access to more complete and more regular information of patients regarding their pain
Unsupervised Learning Facial Parameter Regressor for Action Unit Intensity Estimation via Differentiable Renderer
Facial action unit (AU) intensity is an index to describe all visually
discernible facial movements. Most existing methods learn intensity estimator
with limited AU data, while they lack generalization ability out of the
dataset. In this paper, we present a framework to predict the facial parameters
(including identity parameters and AU parameters) based on a bone-driven face
model (BDFM) under different views. The proposed framework consists of a
feature extractor, a generator, and a facial parameter regressor. The regressor
can fit the physical meaning parameters of the BDFM from a single face image
with the help of the generator, which maps the facial parameters to the
game-face images as a differentiable renderer. Besides, identity loss, loopback
loss, and adversarial loss can improve the regressive results. Quantitative
evaluations are performed on two public databases BP4D and DISFA, which
demonstrates that the proposed method can achieve comparable or better
performance than the state-of-the-art methods. What's more, the qualitative
results also demonstrate the validity of our method in the wild
- …