Hand2Face: Automatic Synthesis and Recognition of Hand Over Face Occlusions
A person's face discloses important information about their affective state.
Although there has been extensive research on recognition of facial
expressions, the performance of existing approaches is challenged by facial
occlusions. Facial occlusions are often treated as noise and discarded in
recognition of affective states. However, hand over face occlusions can provide
additional information for recognition of some affective states such as
curiosity, frustration and boredom. One of the reasons that this problem has
not gained attention is the lack of naturalistic occluded faces that contain
hand over face occlusions as well as other types of occlusions. Traditional
approaches for obtaining affective data are time-consuming and expensive, which
limits researchers in affective computing to working on small datasets. This
limitation affects the generalizability of models and prevents researchers from
taking advantage of recent advances in deep learning that have shown great
success in many fields but require large volumes of data. In this paper, we
first introduce a novel framework for synthesizing naturalistic facial
occlusions from an initial dataset of non-occluded faces and separate images of
hands, reducing the costly process of data collection and annotation. We then
propose a model for facial occlusion type recognition to differentiate between
hand over face occlusions and other types of occlusions such as scarves, hair,
glasses and objects. Finally, we present a model to localize hand over face
occlusions and identify the occluded regions of the face.
Comment: Accepted to International Conference on Affective Computing and
Intelligent Interaction (ACII), 201
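The synthesis idea in this abstract (placing separate hand images over non-occluded faces) can be illustrated with a minimal sketch. This is plain alpha blending with NumPy, not the authors' framework, which the abstract does not specify in detail. The function name `composite_hand_on_face`, the RGBA hand patch, and the `top`/`left` placement parameters are all hypothetical, and the patch is assumed to fit entirely inside the face image.

```python
import numpy as np

def composite_hand_on_face(face, hand_rgba, top, left):
    """Alpha-blend an RGBA hand patch onto an RGB face image.

    Hypothetical helper for illustration only. Returns the composited
    uint8 image and a boolean mask of the occluded pixels.
    """
    out = face.astype(np.float64).copy()
    h, w = hand_rgba.shape[:2]
    # Alpha channel scaled to [0, 1], kept as (h, w, 1) so it broadcasts over RGB.
    alpha = hand_rgba[..., 3:4].astype(np.float64) / 255.0
    region = out[top:top + h, left:left + w]  # view into `out`
    region[:] = alpha * hand_rgba[..., :3] + (1.0 - alpha) * region
    # Occlusion mask: pixels where the hand is mostly opaque.
    mask = np.zeros(face.shape[:2], dtype=bool)
    mask[top:top + h, left:left + w] = alpha[..., 0] > 0.5
    return out.round().astype(np.uint8), mask
```

The returned mask is the kind of per-pixel label that a localization model could be trained against, since the occluded region is known by construction.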
Bottom-Up and Top-Down Reasoning with Hierarchical Rectified Gaussians
Convolutional neural nets (CNNs) have demonstrated remarkable performance in
recent history. Such approaches tend to work in a unidirectional bottom-up
feed-forward fashion. However, practical experience and biological evidence
tell us that feedback plays a crucial role, particularly for detailed spatial
understanding tasks. This work explores bidirectional architectures that also
reason with top-down feedback: neural units are influenced by both lower and
higher-level units.
We do so by treating units as rectified latent variables in a quadratic
energy function, which can be seen as a hierarchical Rectified Gaussian model
(RGs). We show that RGs can be optimized with a quadratic program (QP), which
can in turn be optimized with a recurrent neural network (with rectified linear
units). This allows RGs to be trained with GPU-optimized gradient descent. From
a theoretical perspective, RGs help establish a connection between CNNs and
hierarchical probabilistic models. From a practical perspective, RGs are well
suited for detailed spatial tasks that can benefit from top-down reasoning. We
illustrate them on the challenging task of keypoint localization under
occlusions, where local bottom-up evidence may be misleading. We demonstrate
state-of-the-art results on challenging benchmarks.
Comment: To appear in CVPR 201
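The link between a nonnegativity-constrained quadratic program and a recurrent network of rectified linear units can be seen in a small sketch. It assumes the energy is written as a minimization of 0.5 * x^T W x - b^T x over x >= 0 with W symmetric positive definite; the paper's own sign convention and solver may differ. Projected gradient descent on this problem has the update x <- max(0, (I - eta W) x + eta b), which is one ReLU layer applied repeatedly with shared weights. The function name `rectified_qp_by_recurrence` is hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def rectified_qp_by_recurrence(W, b, steps=500):
    """Minimize 0.5 * x^T W x - b^T x subject to x >= 0.

    Illustrative sketch: projected gradient descent written as a
    recurrent ReLU update. Assumes W is symmetric positive definite.
    """
    eta = 1.0 / np.linalg.norm(W, 2)   # step size 1/L, L = spectral norm of W
    A = np.eye(len(b)) - eta * W       # recurrent weight matrix
    c = eta * b                        # bias term
    x = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        x = relu(A @ x + c)            # gradient step followed by projection onto x >= 0
    return x
```

Because each iteration is a differentiable ReLU layer, unrolling a fixed number of steps gives a network that can be trained end to end with gradient descent, which matches the abstract's claim in spirit.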
Simultaneous Facial Landmark Detection, Pose and Deformation Estimation under Facial Occlusion
Facial landmark detection, head pose estimation, and facial deformation
analysis are typical facial behavior analysis tasks in computer vision. The
existing methods usually perform each task independently and sequentially,
ignoring their interactions. To tackle this problem, we propose a unified
framework for simultaneous facial landmark detection, head pose estimation, and
facial deformation analysis, and the proposed model is robust to facial
occlusion. Following a cascade procedure augmented with model-based head pose
estimation, we iteratively update the facial landmark locations, facial
occlusion, head pose and facial deformation until convergence. The
experimental results on benchmark databases demonstrate the effectiveness of
the proposed method for simultaneous facial landmark detection, head pose and
facial deformation estimation, even if the images are under facial occlusion.
Comment: International Conference on Computer Vision and Pattern Recognition,
201