379 research outputs found
The Menpo benchmark for multi-pose 2D and 3D facial landmark localisation and tracking
In this article, we present the Menpo 2D and Menpo 3D benchmarks, two new datasets for multi-pose 2D and 3D facial landmark localisation and tracking. In contrast to previous benchmarks such as 300W and 300VW, the proposed benchmarks contain facial images in both semi-frontal and profile poses. We introduce an elaborate semi-automatic methodology for providing high-quality annotations for both the Menpo 2D and Menpo 3D benchmarks. In the Menpo 2D benchmark, different visible landmark configurations are designed for semi-frontal and profile faces, thus making the 2D face alignment full-pose. In the Menpo 3D benchmark, a unified landmark configuration is designed for both semi-frontal and profile faces based on the correspondence with a 3D face model, thus making face alignment not only full-pose but also corresponding to the real-world 3D space. Based on the considerable number of annotated images, we organised the Menpo 2D Challenge and the Menpo 3D Challenge for face alignment under large pose variations in conjunction with CVPR 2017 and ICCV 2017, respectively. The results of these challenges demonstrate that recent deep learning architectures, when trained with abundant data, lead to excellent results. We also provide a very simple yet effective solution, named the Cascade Multi-view Hourglass Model, to 2D and 3D face alignment. In our method, we take advantage of all 2D and 3D facial landmark annotations in a joint way. We not only capitalise on the correspondences between the semi-frontal and profile 2D facial landmarks but also employ joint supervision from both 2D and 3D facial landmarks. Finally, we discuss future directions on the topic of face alignment.
Fiducial Focus Augmentation for Facial Landmark Detection
Deep learning methods have led to significant improvements in the performance
on the facial landmark detection (FLD) task. However, detecting landmarks in
challenging settings, such as head pose changes, exaggerated expressions, or
uneven illumination, remains a challenge due to high variability and
insufficient samples. This inadequacy can be attributed to the model's
inability to effectively acquire appropriate facial structure information from
the input images. To address this, we propose a novel image augmentation
technique specifically designed for the FLD task to enhance the model's
understanding of facial structures. To effectively utilize the newly proposed
augmentation technique, we employ a Siamese architecture-based training
mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss to
achieve collective learning of high-level feature representations from two
different views of the input images. Furthermore, we employ a Transformer +
CNN-based network with a custom hourglass module as the robust backbone for the
Siamese framework. Extensive experiments show that our approach outperforms
multiple state-of-the-art approaches across various benchmark datasets.
Comment: Accepted to BMVC'2