FEAFA: A Well-Annotated Dataset for Facial Expression Analysis and 3D Facial Animation
Facial expression analysis based on machine learning requires a large amount of
well-annotated data to reflect different changes in facial motion. Publicly
available datasets truly help to accelerate research in this area by providing
a benchmark resource, but all of these datasets, to the best of our knowledge,
are limited to rough annotations for action units, including only their
absence, presence, or a five-level intensity according to the Facial Action
Coding System. To meet the need for videos labeled in great detail, we present
a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D
Facial Animation. One hundred and twenty-two participants, including children,
young adults and elderly people, were recorded in real-world conditions. In
addition, 99,356 frames were manually labeled using the Expression Quantitative
Tool that we developed to quantify 9 symmetrical FACS action units, 10
asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action
descriptors and 2 asymmetrical FACS action descriptors, and each action unit or
action descriptor is well-annotated with a floating point number between 0 and
1. To provide a baseline for use in future research, a benchmark for the
regression of action unit values based on Convolutional Neural Networks is
presented. We also demonstrate the potential of our FEAFA dataset for 3D facial
animation. Almost all state-of-the-art algorithms for facial animation are
based on 3D face reconstruction. We hence propose a novel method that
drives virtual characters based only on action unit value regression of the 2D
video frames of source actors. Comment: 9 pages, 7 figures
Facial Asymmetry Analysis Based on 3-D Dynamic Scans
Facial dysfunction is a fundamental symptom that often accompanies neurological illnesses such as stroke, Bell’s palsy, and Parkinson’s disease. Current methods for detecting and assessing facial dysfunction rely mainly on trained practitioners, which is a significant limitation because their assessments are often subjective. This paper presents a computer-based methodology for facial asymmetry analysis that aims to detect facial dysfunction automatically. The method is based on dynamic 3-D scans of human faces. Preliminary evaluation on facial sequences from the Hi4D-ADSIP database suggests that the proposed method can assist in the quantification and diagnosis of facial dysfunction in neurological patients.
Geometry-Aware Face Completion and Editing
Face completion is a challenging generation task because it requires
generating visually pleasing new pixels that are semantically consistent with
the unmasked face region. This paper proposes a geometry-aware Face Completion
and Editing NETwork (FCENet) by systematically studying facial geometry from
the unmasked region. Firstly, a facial geometry estimator is learned to
estimate facial landmark heatmaps and parsing maps from the unmasked face
image. Then, an encoder-decoder structure generator serves to complete a face
image and disentangle its mask areas conditioned on both the masked face image
and the estimated facial geometry images. In addition, since manually labeled
masks exhibit a low-rank property, a low-rank regularization term is imposed on
the disentangled masks, forcing our completion network to handle occlusion
areas of various shapes and sizes. Furthermore, our network can generate diverse
results from the same masked input by modifying the estimated facial geometry,
which provides a flexible means to edit the completed face appearance. Extensive
experimental results qualitatively and quantitatively demonstrate that our
network is able to generate visually pleasing face completion results and edit
face attributes as well.
Mirror, mirror on the wall, tell me, is the error small?
Do object part localization methods produce bilaterally symmetric results on
mirror images? Surprisingly not, even though state of the art methods augment
the training set with mirrored images. In this paper we take a closer look into
this issue. We first introduce the concept of mirrorability as the ability of a
model to produce symmetric results in mirrored images and introduce a
corresponding measure, namely the mirror error, which is defined as the
difference between the detection result on an image and the mirror of the
detection result on its mirror image. We evaluate the mirrorability of several
state of the art algorithms in two of the most intensively studied problems,
namely human pose estimation and face alignment. Our experiments lead to
several interesting findings: 1) Surprisingly, most state-of-the-art methods
struggle to preserve mirror symmetry, despite the fact that they have
very similar overall performance on the original and mirror images; 2) the low
mirrorability is not caused by training or testing sample bias - all algorithms
are trained on both the original images and their mirrored versions; 3) the
mirror error is strongly correlated to the localization/alignment error (with
correlation coefficients around 0.7). Since the mirror error is calculated
without knowledge of the ground truth, we show two interesting applications -
in the first it is used to guide the selection of difficult samples and in the
second to give feedback in a popular Cascaded Pose Regression method for face
alignment. Comment: 8 pages, 9 figures
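The mirror error described in this abstract can be illustrated with a minimal sketch. It assumes 2D landmarks given as (x, y) pixel coordinates, a hypothetical `swap_pairs` list naming which landmark indices exchange roles under mirroring (for example left eye and right eye), and the mean Euclidean distance as the "difference". The abstract does not state the exact distance or normalization the authors use, so those choices are assumptions made here for illustration only.

```python
import math


def mirror_points(points, image_width, swap_pairs):
    """Reflect (x, y) points across the vertical image axis and swap
    left/right landmark indices so that labels stay consistent."""
    flipped = [(image_width - 1 - x, y) for x, y in points]
    for i, j in swap_pairs:
        flipped[i], flipped[j] = flipped[j], flipped[i]
    return flipped


def mirror_error(pred_original, pred_on_mirror, image_width, swap_pairs):
    """Mean Euclidean distance between the detection on an image and the
    mirrored-back detection obtained on its mirror image.
    No ground-truth annotation is needed."""
    mirrored_back = mirror_points(pred_on_mirror, image_width, swap_pairs)
    dists = [math.dist(p, q) for p, q in zip(pred_original, mirrored_back)]
    return sum(dists) / len(dists)


# Hypothetical 3-landmark example: index 0 = left eye, 1 = right eye, 2 = nose.
WIDTH = 100
SWAP = [(0, 1)]
on_original = [(30, 40), (70, 40), (50, 60)]

# A perfectly "mirrorable" detector returns the exact reflection.
consistent = [(29, 40), (69, 40), (49, 60)]
# An inconsistent detector places the nose 3 pixels lower on the mirror image.
inconsistent = [(29, 40), (69, 40), (49, 63)]

print(mirror_error(on_original, consistent, WIDTH, SWAP))
print(mirror_error(on_original, inconsistent, WIDTH, SWAP))
```

Because the measure compares two predictions with each other rather than with annotations, it can be computed on unlabeled images, which is what makes the sample-selection and feedback applications mentioned in the abstract possible.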
FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors
Face Super-Resolution (SR) is a domain-specific super-resolution problem.
Face-specific prior knowledge can be leveraged to better super-resolve
face images. We present a novel deep end-to-end trainable Face Super-Resolution
Network (FSRNet), which makes full use of the geometry prior, i.e., facial
landmark heatmaps and parsing maps, to super-resolve very low-resolution (LR)
face images without requiring them to be well aligned. Specifically, we first construct
a coarse SR network to recover a coarse high-resolution (HR) image. Then, the
coarse HR image is sent to two branches: a fine SR encoder and a prior
information estimation network, which extract image features and
estimate landmark heatmaps/parsing maps, respectively. Both image features and
prior information are sent to a fine SR decoder to recover the HR image. To
further generate realistic faces, we propose the Face Super-Resolution
Generative Adversarial Network (FSRGAN) to incorporate the adversarial loss
into FSRNet. Moreover, we introduce two related tasks, face alignment and
parsing, as the new evaluation metrics for face SR, which address the
inconsistency of classic metrics w.r.t. visual perception. Extensive benchmark
experiments show that FSRNet and FSRGAN significantly outperform the state of the
art for very LR face SR, both quantitatively and qualitatively. Code will be
made available upon publication. Comment: Chen and Tai contributed equally to this paper