SimFLE: Simple Facial Landmark Encoding for Self-Supervised Facial Expression Recognition in the Wild
One of the key issues in facial expression recognition in the wild (FER-W) is
that curating large-scale labeled facial images is challenging due to the
inherent complexity and ambiguity of facial images. Therefore, in this paper,
we propose a self-supervised simple facial landmark encoding (SimFLE) method
that can learn effective encoding of facial landmarks, which are important
features for improving the performance of FER-W, without expensive labels.
Specifically, we introduce a novel FaceMAE module for this purpose. FaceMAE
reconstructs masked facial images with elaborately designed semantic masking.
Unlike previous random masking, semantic masking is conducted based on channel
information processed in the backbone, so rich semantics of channels can be
explored. Additionally, the semantic masking process is fully trainable,
enabling FaceMAE to guide the backbone to learn spatial details and contextual
properties of fine-grained facial landmarks. Experimental results on several
FER-W benchmarks show that the proposed SimFLE is superior in facial landmark
localization and noticeably improves performance compared to the supervised
baseline and other self-supervised methods.
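The contrast between random masking and channel-driven semantic masking can be illustrated with a minimal sketch. It is not the FaceMAE module itself: the abstract describes a fully trainable masking process, whereas this example simply masks the patches with the highest activation in one chosen backbone channel. The function name `semantic_mask`, the `mask_ratio` parameter, and the top-activation selection rule are all assumptions made for illustration.

```python
def semantic_mask(feature_map, channel, mask_ratio):
    """Pick patch positions to mask from one channel's activations.

    feature_map: list of channels, each a 2D list (rows x cols) of floats.
    channel:     index of the channel that drives the mask (hypothetical choice).
    mask_ratio:  fraction of patches to mask, between 0 and 1.

    Assumption: patches with the highest activation in the chosen channel
    are masked. The actual method learns its masking; this rule is a stand-in.
    """
    grid = feature_map[channel]
    positions = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    num_masked = int(len(positions) * mask_ratio)
    # Rank by activation (highest first); break ties by position for determinism.
    ranked = sorted(positions, key=lambda p: (-grid[p[0]][p[1]], p))
    return set(ranked[:num_masked])


# Two channels over a 2x2 patch grid; each channel highlights different patches.
features = [
    [[0.1, 0.9], [0.8, 0.2]],
    [[0.7, 0.0], [0.1, 0.6]],
]
masked_by_channel_0 = semantic_mask(features, channel=0, mask_ratio=0.5)
masked_by_channel_1 = semantic_mask(features, channel=1, mask_ratio=0.5)
```

Because each channel responds to different regions, switching the driving channel changes which patches are hidden, which is the property that random masking lacks.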
AnchorFace: An Anchor-based Facial Landmark Detector Across Large Poses
Facial landmark localization aims to detect predefined points on human
faces, and the field has advanced rapidly with the recent development of
neural network based methods. However, it remains a challenging task when
dealing with faces in unconstrained scenarios, especially with large pose
variations. In this paper, we target the problem of facial landmark
localization across large poses and address this task based on a
split-and-aggregate strategy. To split the search space, we propose a set of
anchor templates as references for regression, which handles the large
variations in face pose well. We then aggregate the predictions from each
anchor template, which reduces the landmark uncertainty caused by large
poses. Overall, our proposed approach, named AnchorFace, obtains
state-of-the-art results with extremely efficient inference speed on four
challenging benchmarks, namely the AFLW, 300W, Menpo, and WFLW datasets. Code will be
available at https://github.com/nothingelse92/AnchorFace.
Comment: To appear in AAAI 202
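The split-and-aggregate idea can be sketched in a few lines. The abstract does not say how the per-anchor predictions are combined, so the confidence-weighted average below is an assumption, as are the function name `aggregate_landmarks` and the representation of each anchor template as a list of (x, y) reference points with predicted offsets.

```python
def aggregate_landmarks(anchors, offsets, confidences):
    """Combine per-anchor landmark predictions into one estimate.

    anchors:     list of templates, each a list of (x, y) reference points.
    offsets:     per-anchor regressed (dx, dy) for every landmark.
    confidences: one non-negative weight per anchor (hypothetical model output).

    Assumption: aggregation is a confidence-weighted mean of
    (anchor point + offset); the published method may differ.
    """
    total = sum(confidences)
    if total <= 0:
        raise ValueError("at least one anchor must have positive confidence")
    num_points = len(anchors[0])
    result = []
    for i in range(num_points):
        x = sum(w * (a[i][0] + o[i][0])
                for a, o, w in zip(anchors, offsets, confidences)) / total
        y = sum(w * (a[i][1] + o[i][1])
                for a, o, w in zip(anchors, offsets, confidences)) / total
        result.append((x, y))
    return result


# Two anchor templates, one landmark each; the second anchor is trusted more.
anchors = [[(0.0, 0.0)], [(2.0, 2.0)]]
offsets = [[(1.0, 1.0)], [(0.0, 0.0)]]
landmarks = aggregate_landmarks(anchors, offsets, confidences=[1.0, 3.0])
```

Each anchor only has to regress a small offset from a pose-specific starting shape (the split), and the weighting lets anchors that match the face pose dominate the final estimate (the aggregate).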