Self-supervised CNN for Unconstrained 3D Facial Performance Capture from an RGB-D Camera
We present a novel method for real-time 3D facial performance capture with
consumer-level RGB-D sensors. Our capture system targets robust and stable 3D
face capture in the wild, where the RGB-D facial data contain noise,
imperfections, and occlusions, and often exhibit high variability in motion,
pose, expression, and lighting conditions, all of which pose great challenges.
The technical contribution is a self-supervised deep learning framework, which
is trained directly from raw RGB-D data. The key novelties include: (1)
learning both the core tensor and the parameters for refining our parametric
face model; (2) using vertex displacements and a UV map to learn surface
detail; (3) designing the loss function by incorporating temporal coherence and
same identity constraints based on pairs of RGB-D images and utilizing sparse
norms, in addition to the conventional terms for photo-consistency, feature
similarity, regularization as well as geometry consistency; and (4) augmenting
the training data set in new ways. The method is demonstrated in a live setup
that runs in real time on a smartphone with an RGB-D sensor. Extensive
experiments show that our method is robust to severe occlusion, fast motion,
large rotation, exaggerated facial expressions, and diverse lighting
conditions.
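A minimal PyTorch sketch of how such a composite self-supervised loss might be assembled. The dictionary keys, term weights, and the choice of an L1 penalty as the sparse norm are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def capture_loss(pred, data, w=(1.0, 0.5, 0.1, 0.05, 0.01)):
    """Illustrative composite loss for self-supervised face capture.

    `pred` and `data` are dicts of tensors; every term below is a
    stand-in for the paper's photo-consistency, feature-similarity,
    geometry, temporal-coherence, and regularization terms. The
    weights `w` are assumptions, not the paper's values.
    """
    # Photo-consistency: rendered face vs. captured RGB frame.
    l_photo = F.l1_loss(pred["render"], data["rgb"])
    # Feature similarity: projected landmarks vs. detected 2D landmarks.
    l_feat = F.mse_loss(pred["landmarks_2d"], data["landmarks_2d"])
    # Geometry consistency: predicted depth vs. sensor depth, sparse (L1) norm.
    l_geom = F.l1_loss(pred["depth"], data["depth"])
    # Temporal coherence / same-identity constraint: parameters estimated
    # from a pair of frames of the same subject should agree.
    l_temp = F.mse_loss(pred["params_t"], pred["params_t1"])
    # Regularization: keep parametric-model coefficients near the prior.
    l_reg = pred["params_t"].pow(2).mean()
    terms = (l_photo, l_feat, l_geom, l_temp, l_reg)
    return sum(wi * li for wi, li in zip(w, terms))
```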
On Face Segmentation, Face Swapping, and Face Perception
We show that even when face images are unconstrained and arbitrarily paired,
face swapping between them is actually quite simple. To this end, we make the
following contributions. (a) Instead of tailoring systems for face
segmentation, as others previously proposed, we show that a standard fully
convolutional network (FCN) can achieve remarkably fast and accurate
segmentations, provided that it is trained on a rich enough example set. For
this purpose, we describe novel data collection and generation routines which
provide challenging segmented face examples. (b) We use our segmentations to
enable robust face swapping under unprecedented conditions. (c) Unlike previous
work, our swapping is robust enough to allow for extensive quantitative tests.
To this end, we use the Labeled Faces in the Wild (LFW) benchmark and measure
the effect of intra- and inter-subject face swapping on recognition. We show
that our intra-subject swapped faces remain as recognizable as their sources,
testifying to the effectiveness of our method. In line with well-known
perceptual studies, we show that better face swapping produces less
recognizable inter-subject results. This is the first time this effect has been
quantitatively demonstrated for machine vision systems.
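A minimal sketch of the compositing step at the heart of segmentation-driven face swapping: given a segmentation mask (e.g., from an FCN) and a source face already warped into the target's pose, blend the two with a feathered alpha mask. This NumPy version is an illustrative simplification, not the paper's pipeline:

```python
import numpy as np

def swap_faces(target_img, source_aligned, face_mask, feather=11):
    """Composite an aligned source face into a target image.

    `face_mask` is a binary HxW mask of target-face pixels;
    `source_aligned` is the source face already warped into the
    target's pose. Feathered alpha blending stands in for whatever
    blending a full system would use.
    """
    mask = face_mask.astype(np.float32)
    # Feather the mask edge with a simple box blur so the seam is soft.
    k = feather
    pad = k // 2
    padded = np.pad(mask, pad, mode="edge")
    soft = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            soft += padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    soft = (soft / (k * k))[..., None]  # HxWx1 alpha in [0, 1]
    out = soft * source_aligned + (1.0 - soft) * target_img
    return out.astype(target_img.dtype)
```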
Facial Expressions Tracking and Recognition: Database Protocols for Systems Validation and Evaluation
Each human face is unique. It has its own shape, topology, and distinguishing
features. As such, developing and testing facial tracking systems are
challenging tasks. Existing face recognition and tracking algorithms in
computer vision are mainly designed for concrete situations dictated by
particular goals and applications, and so require validation methodologies with
data that fit their purposes. However, no database covers all possible
variations of external and internal factors, forcing researchers to acquire
their own data or to compile groups of databases.
To address this shortcoming, we propose a methodology for facial data
acquisition through definition of fundamental variables, such as subject
characteristics, acquisition hardware, and performance parameters. Following
this methodology, we also propose two protocols that allow the capture of
facial behaviors in uncontrolled, real-life situations. As validation, we
executed both protocols, which led to the creation of two sample databases: FdMiee
(Facial database with Multi input, expressions, and environments) and FACIA
(Facial Multimodal database driven by emotional induced acting).
Using different types of hardware, FdMiee captures facial information under
environmental and facial-behavior variations. FACIA is an extension of FdMiee,
introducing a pipeline to acquire additional facial behaviors and speech using
an emotion-acting method. Therefore, this work eases the creation of databases
adaptable to an algorithm's requirements and applications, leading to
simplified validation and testing processes.
Comment: 10 pages, 6 images, Computers & Graphics
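To make the idea of "fundamental variables" concrete, here is a small sketch of an acquisition descriptor as Python dataclasses. Every field name here is hypothetical, chosen only to mirror the subject, hardware, and behavior variables the methodology mentions; the paper's actual variable definitions may differ:

```python
from dataclasses import dataclass, field

@dataclass
class SubjectProfile:
    # Illustrative subject characteristics (assumed fields).
    age: int
    gender: str
    facial_hair: bool = False
    glasses: bool = False

@dataclass
class AcquisitionSetup:
    # Illustrative hardware/performance parameters (assumed fields).
    sensors: list = field(default_factory=lambda: ["rgb"])  # e.g. rgb, depth
    resolution: tuple = (1280, 720)
    fps: int = 30
    lighting: str = "uncontrolled"  # environment condition variable

@dataclass
class RecordingSession:
    subject: SubjectProfile
    setup: AcquisitionSetup
    behaviors: list  # e.g. ["neutral", "smile", "emotion-induced speech"]
```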
Learning Perspective Undistortion of Portraits
Near-range portrait photographs often contain perspective distortion
artifacts that bias human perception and challenge both facial recognition and
reconstruction techniques. We present the first deep learning based approach to
remove such artifacts from unconstrained portraits. In contrast to the previous
state-of-the-art approach, our method handles even portraits with extreme
perspective distortion, as we avoid the inaccurate and error-prone step of
first fitting a 3D face model. Instead, we predict a distortion correction flow
map that encodes a per-pixel displacement that removes distortion artifacts
when applied to the input image. Our method also automatically infers missing
facial features, e.g., ears occluded by strong perspective distortion, with
with coherent details. We demonstrate that our approach significantly
outperforms the previous state-of-the-art both qualitatively and
quantitatively, particularly for portraits with extreme perspective distortion
or facial expressions. We further show that our technique benefits a number of
fundamental tasks, significantly improving the accuracy of both face
recognition and 3D reconstruction, and that it enables a novel camera
calibration technique from a single portrait. Moreover, we build the first
perspective portrait database with a large diversity in identities,
expressions, and poses, which will benefit related research in this area.
Comment: 13 pages, 15 figures
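A minimal PyTorch sketch of the core mechanism of applying a predicted per-pixel correction flow to an image via grid sampling. The pixel-unit flow convention and the function name are assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def apply_correction_flow(image, flow):
    """Warp `image` (N,C,H,W) by a per-pixel displacement `flow` (N,2,H,W).

    `flow` is assumed to hold (x, y) displacements in pixels, as a
    network predicting a distortion-correction flow map might output.
    """
    n, _, h, w = image.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=image.device),
        torch.linspace(-1, 1, w, device=image.device),
        indexing="ij",
    )
    base = torch.stack((xs, ys), dim=-1).expand(n, h, w, 2)
    # Convert pixel displacements to the normalized coordinate range.
    disp = torch.stack(
        (flow[:, 0] * 2.0 / (w - 1), flow[:, 1] * 2.0 / (h - 1)), dim=-1
    )
    # Bilinear resampling of the input at the displaced coordinates.
    return F.grid_sample(image, base + disp, align_corners=True)
```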
EmotioNet Challenge: Recognition of facial expressions of emotion in the wild
This paper details the methodology and results of the EmotioNet challenge.
This challenge is the first to test the ability of computer vision algorithms
in the automatic analysis of a large number of images of facial expressions of
emotion in the wild. The challenge was divided into two tracks. The first track
tested the ability of current computer vision algorithms in the automatic
detection of action units (AUs). Specifically, we tested the detection of 11
AUs. The second track tested the algorithms' ability to recognize emotion
categories in images of facial expressions. Specifically, we tested the
recognition of 16 basic and compound emotion categories. The results of the
challenge suggest that current computer vision and machine learning algorithms
are unable to reliably solve these two tasks. The limitations of current
algorithms are more apparent when trying to recognize emotion. We also show
that current algorithms are not affected by mild resolution changes, small
occluders, gender or age, but that 3D pose is a major limiting factor on
performance. We provide an in-depth discussion of the points that need special
attention moving forward.
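For reference, a common way to score an AU-detection track like this is the mean F1 across action units. The sketch below assumes binary multi-label predictions; the challenge's exact scoring rules may combine additional terms:

```python
import numpy as np

def mean_f1_over_aus(y_true, y_pred):
    """Mean F1 across action units for multi-label AU detection.

    y_true, y_pred: (n_samples, n_aus) binary arrays.
    """
    f1s = []
    for j in range(y_true.shape[1]):
        t, p = y_true[:, j], y_pred[:, j]
        tp = np.sum((t == 1) & (p == 1))  # true positives for AU j
        fp = np.sum((t == 0) & (p == 1))  # false positives
        fn = np.sum((t == 1) & (p == 0))  # false negatives
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(f1s))
```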
How Robust is 3D Human Pose Estimation to Occlusion?
Occlusion is commonplace in realistic human-robot shared environments, yet
its effects are not considered in standard 3D human pose estimation benchmarks.
This leaves the question open: how robust are state-of-the-art 3D pose
estimation methods against partial occlusions? We study several types of
synthetic occlusions over the Human3.6M dataset and find a method with
state-of-the-art benchmark performance to be sensitive even to low amounts of
occlusion. Addressing this issue is key to progress in applications such as
collaborative and service robotics. We take a first step in this direction by
improving occlusion-robustness through training data augmentation with
synthetic occlusions. This also turns out to be an effective regularizer that
is beneficial even for non-occluded test cases.
Comment: Accepted for the IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS'18) - Workshop on Robotic Co-workers 4.0: Human Safety
and Comfort in Human-Robot Interactive Social Environments
Occlusion Coherence: Detecting and Localizing Occluded Faces
The presence of occluders significantly impacts object recognition accuracy.
However, occlusion is typically treated as an unstructured source of noise and
explicit models for occluders have lagged behind those for object appearance
and shape. In this paper we describe a hierarchical deformable part model for
face detection and landmark localization that explicitly models part occlusion.
The proposed model structure makes it possible to augment positive training
data with large numbers of synthetically occluded instances. This allows us to
easily incorporate the statistics of occlusion patterns in a discriminatively
trained model. We test the model on several benchmarks for landmark
localization and detection including challenging new data sets featuring
significant occlusion. We find that the addition of an explicit occlusion model
yields a detection system that outperforms existing approaches for occluded
instances while maintaining competitive accuracy in detection and landmark
localization for unoccluded instances.
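A simplified sketch of scoring a part-based detection in which every part may independently be declared occluded. This drops the model's coherence term that couples neighboring parts' occlusion states, so it illustrates only the per-part max over hypotheses, not the full hierarchical model:

```python
import numpy as np

def score_with_occlusion(appearance_scores, occlusion_bias):
    """Score a part-based detection where each part may be occluded.

    appearance_scores: (n_parts,) filter responses at part locations.
    occlusion_bias: (n_parts,) learned score for declaring a part
    occluded. Each part contributes whichever hypothesis (visible vs.
    occluded) scores higher.
    """
    per_part = np.maximum(appearance_scores, occlusion_bias)
    occluded = occlusion_bias > appearance_scores  # parts flagged occluded
    return per_part.sum(), occluded
```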
Bosphorus database for 3d face analysis
A new 3D face database that includes a rich set of expressions, systematic variation of poses, and different types of occlusions is presented in this paper. This database is unique in three respects: i) the facial expressions are composed of a judiciously selected subset of Action Units as well as the six basic emotions, and many actors/actresses are incorporated to obtain more realistic expression data; ii) a rich set of head pose variations is available; and iii) different types of face occlusions are included. Hence, this new database can be a valuable resource for the development and evaluation of algorithms for face recognition under adverse conditions and facial expression analysis, as well as for facial expression synthesis.
UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition
Recently proposed robust 3D face alignment methods establish either dense or
sparse correspondence between a 3D face model and a 2D facial image. The use of
these methods presents new challenges as well as opportunities for facial
texture analysis. In particular, by sampling the image using the fitted model,
a facial UV map can be created. Unfortunately, due to self-occlusion, such a
map is always incomplete. In this paper, we propose a framework for training a
Deep Convolutional Neural Network (DCNN) to complete the facial UV map
extracted
from in-the-wild images. To this end, we first gather complete UV maps by
fitting a 3D Morphable Model (3DMM) to various multiview image and video
datasets, as well as leveraging on a new 3D dataset with over 3,000 identities.
Second, we devise a meticulously designed architecture that combines local and
global adversarial DCNNs to learn an identity-preserving facial UV completion
model. We demonstrate that by attaching the completed UV to the fitted mesh and
generating instances of arbitrary poses, we can increase pose variations for
training deep face recognition/verification models, and minimise pose
discrepancy during testing, which leads to better performance. Experiments on
both controlled and in-the-wild UV datasets prove the effectiveness of our
adversarial UV completion model. We achieve state-of-the-art verification
accuracy under the CFP frontal-profile protocol by combining pose augmentation
during training with pose discrepancy reduction during testing. We will release
the first in-the-wild UV dataset (which we refer to as WildUV), comprising
complete facial UV maps from 1,892 identities, for research purposes.
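A minimal PyTorch sketch of a generator-side loss in the spirit of a local+global critic pair for UV completion. The weights, the non-saturating BCE formulation, and the crop convention are assumptions, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def uv_completion_g_loss(completed_uv, gt_uv, d_global, d_local, crop_box,
                         w_pix=1.0, w_adv=0.01):
    """Combine pixel reconstruction with global and local adversarial terms.

    d_global / d_local: discriminator networks returning logits.
    crop_box = (y, x, h, w): the filled-in (self-occluded) region that
    the local critic inspects.
    """
    y, x, h, w = crop_box
    l_pix = F.l1_loss(completed_uv, gt_uv)                    # reconstruction
    g_logits = d_global(completed_uv)                         # whole UV map
    l_logits = d_local(completed_uv[:, :, y:y + h, x:x + w])  # local patch
    adv = (F.binary_cross_entropy_with_logits(g_logits,
                                              torch.ones_like(g_logits))
           + F.binary_cross_entropy_with_logits(l_logits,
                                                torch.ones_like(l_logits)))
    return w_pix * l_pix + w_adv * adv
```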
When 3D-Aided 2D Face Recognition Meets Deep Learning: An extended UR2D for Pose-Invariant Face Recognition
Most face recognition works focus on specific modules or demonstrate a single
research idea. This paper presents a pose-invariant 3D-aided 2D face
recognition system (UR2D) that is robust to pose variations as large as 90° by
leveraging deep learning technology. The architecture and the interface of UR2D
are described, and each module is introduced in detail. Extensive experiments
are conducted on the UHDB31 and IJB-A datasets, demonstrating that UR2D
outperforms existing 2D face recognition systems such as VGG-Face, FaceNet, and
commercial off-the-shelf (COTS) software by at least 9% on the UHDB31 dataset
and 3% on the IJB-A dataset on average in face identification tasks. UR2D also
achieves state-of-the-art performance of 85% Rank-1 accuracy on the IJB-A
dataset using template matching. It fills a gap by providing a 3D-aided 2D face
recognition system whose results are comparable to those of 2D face recognition
systems based on deep learning techniques.
Comment: Submitted to Special Issue on Biometrics in the Wild, Image and Vision
Computing
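A minimal sketch of Rank-1 identification by template matching. Averaging each identity's enrolled embeddings and ranking by cosine similarity is a common choice; UR2D's actual matcher may differ:

```python
import numpy as np

def rank1_identify(probe, gallery_templates):
    """Return the gallery identity with the highest cosine similarity.

    probe: (d,) embedding of the query face.
    gallery_templates: dict mapping identity -> (n_i, d) array of
    enrolled embeddings for that identity.
    """
    p = probe / np.linalg.norm(probe)
    best_id, best_sim = None, -np.inf
    for gid, feats in gallery_templates.items():
        t = feats.mean(axis=0)          # pool the template's embeddings
        t = t / np.linalg.norm(t)
        sim = float(p @ t)              # cosine similarity
        if sim > best_sim:
            best_id, best_sim = gid, sim
    return best_id, best_sim
```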