Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on the handcrafted features, which are outdated and uninspiring
for the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary labels (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB cameras, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(TPAMI
PipeNet: Selective Modal Pipeline of Fusion Network for Multi-Modal Face Anti-Spoofing
Face anti-spoofing has become an increasingly important and critical security
feature for authentication systems, due to rampant and easily launchable
presentation attacks. Addressing the shortage of multi-modal face datasets,
CASIA recently released the largest up-to-date CASIA-SURF Cross-ethnicity Face
Anti-spoofing (CeFA) dataset, covering 3 ethnicities, 3 modalities, 1607
subjects, and 2D plus 3D attack types in four protocols, and focusing on the
challenge of improving the generalization capability of face anti-spoofing in
cross-ethnicity and multi-modal continuous data. In this paper, we propose a
novel pipeline-based multi-stream CNN architecture called PipeNet for
multi-modal face anti-spoofing. Unlike previous works, Selective Modal Pipeline
(SMP) is designed to enable a customized pipeline for each data modality to
take full advantage of multi-modal data. Limited Frame Vote (LFV) is designed
to ensure stable and accurate prediction for video classification. The proposed
method won third place in the final ranking of the Chalearn Multi-modal
Cross-ethnicity Face Anti-spoofing Recognition Challenge@CVPR2020. Our final
submission achieves the Average Classification Error Rate (ACER) of 2.21 with
Standard Deviation of 1.26 on the test set.
Comment: Accepted to appear in CVPR2020 WM
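The ACER figure quoted in the abstract above is conventionally the mean of two error rates: APCER (attacks accepted as bona fide) and BPCER (bona fide samples rejected as attacks). The following is a minimal sketch of that arithmetic, not the challenge's official scoring code. The function name `acer` and the sample data are hypothetical, the 0/1 label convention is assumed, and all attack types are pooled into one APCER, whereas stricter protocols take the worst case over attack types.

```python
def acer(labels, predictions):
    """Return (APCER, BPCER, ACER) for binary decisions.

    Assumed convention: 0 = bona fide, 1 = presentation attack.
    All attack types are pooled here, which is a simplification.
    """
    attack_preds = [p for y, p in zip(labels, predictions) if y == 1]
    bonafide_preds = [p for y, p in zip(labels, predictions) if y == 0]
    if not attack_preds or not bonafide_preds:
        raise ValueError("need at least one sample of each class")

    # APCER: fraction of attacks wrongly classified as bona fide.
    apcer = sum(1 for p in attack_preds if p == 0) / len(attack_preds)
    # BPCER: fraction of bona fide samples wrongly classified as attacks.
    bpcer = sum(1 for p in bonafide_preds if p == 1) / len(bonafide_preds)
    # ACER: average of the two error rates.
    return apcer, bpcer, (apcer + bpcer) / 2


# Hypothetical example: 4 bona fide samples followed by 4 attacks.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
predictions = [0, 0, 0, 1, 1, 1, 0, 0]
print(acer(labels, predictions))  # (0.5, 0.25, 0.375)
```

Reported values are usually stated as percentages, so a result of 0.375 from this sketch would be written as 37.5.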
A Review on Face Anti-Spoofing
A biometric system is a security technology that uses a living person's characteristics, such as facial features, to verify or recognize identity. Face recognition has numerous real-world applications, such as access control and surveillance, but it is vulnerable to spoofing. Face anti-spoofing, the task of preventing false authorization when someone tries to breach a face recognition system with a photo, video, mask, or other substitute for an authorized person's face, is used to overcome this challenge. There is also growing research on new datasets that provide new attack types or greater diversity to achieve better generalization. This paper reviews recent developments, covering a general understanding of face spoofing, anti-spoofing methods, and the latest developments in countering various spoof types.
Benchmarking Joint Face Spoofing and Forgery Detection with Visual and Physiological Cues
Face anti-spoofing (FAS) and face forgery detection play vital roles in
securing face biometric systems from presentation attacks (PAs) and vicious
digital manipulation (e.g., deepfakes). Despite promising performance upon
large-scale data and powerful deep models, the generalization problem of
existing approaches is still an open issue. Most recent approaches focus on
1) unimodal visual appearance or physiological (i.e., remote
photoplethysmography (rPPG)) cues; and 2) separated feature representation for
FAS or face forgery detection. On one side, unimodal appearance and rPPG
features are respectively vulnerable to high-fidelity face 3D mask and video
replay attacks, inspiring us to design reliable multi-modal fusion mechanisms
for generalized face attack detection. On the other side, there are rich common
features across FAS and face forgery detection tasks (e.g., periodic rPPG
rhythms and vanilla appearance for bonafides), providing solid evidence to
design a joint FAS and face forgery detection system in a multi-task learning
fashion. In this paper, we establish the first joint face spoofing and forgery
detection benchmark using both visual appearance and physiological rPPG cues.
To enhance the rPPG periodicity discrimination, we design a two-branch
physiological network using both facial spatio-temporal rPPG signal map and its
continuous wavelet transformed counterpart as inputs. To mitigate the modality
bias and improve the fusion efficacy, we conduct a weighted batch and layer
normalization for both appearance and rPPG features before multi-modal fusion.
We find that the generalization capacities of both unimodal (appearance or
rPPG) and multi-modal (appearance+rPPG) models can be clearly improved via
joint training on these two tasks. We hope this new benchmark will facilitate
the future research of both FAS and deepfake detection communities.
Comment: Accepted by IEEE Transactions on Dependable and Secure Computing
(TDSC). Corresponding authors: Zitong Yu and Wenhan Yan
Hyperbolic Face Anti-Spoofing
Learning generalized face anti-spoofing (FAS) models against presentation
attacks is essential for the security of face recognition systems. Previous FAS
methods usually encourage models to extract discriminative features, of which
the distances within the same class (bonafide or attack) are pushed close while
those between bonafide and attack are pulled away. However, these methods are
designed based on Euclidean distance, which lacks generalization ability for
unseen attack detection due to poor hierarchy embedding ability. According to
the evidence that different spoofing attacks are intrinsically hierarchical, we
propose to learn richer hierarchical and discriminative spoofing cues in
hyperbolic space. Specifically, for unimodal FAS learning, the feature
embeddings are projected into the Poincaré ball, and then the hyperbolic
binary logistic regression layer is cascaded for classification. To further
improve generalization, we conduct hyperbolic contrastive learning for the
bonafide only while relaxing the constraints on diverse spoofing attacks. To
alleviate the vanishing gradient problem in hyperbolic space, a new feature
clipping method is proposed to enhance the training stability of hyperbolic
models. Besides, we further design a multimodal FAS framework with Euclidean
multimodal feature decomposition and hyperbolic multimodal feature fusion &
classification. Extensive experiments on three benchmark datasets (i.e., WMCA,
PADISI-Face, and SiW-M) with diverse attack types demonstrate that the proposed
method can bring significant improvement compared to the Euclidean baselines on
unseen attack detection. In addition, the proposed framework also
generalizes well on four benchmark datasets (i.e., MSU-MFSD, IDIAP
REPLAY-ATTACK, CASIA-FASD, and OULU-NPU) with a limited number of attack types.
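To make the projection step in the abstract above concrete, here is a minimal sketch of mapping a Euclidean feature vector into the Poincaré ball with the standard exponential map at the origin. The norm clipping shown is a generic step applied before the map and is an assumption for illustration; the abstract does not specify the paper's own clipping method. The names `clip_norm` and `expmap0` and the parameters `max_norm` and `c` are hypothetical.

```python
import math


def clip_norm(v, max_norm=1.0):
    """Rescale v so its Euclidean norm is at most max_norm (generic clipping)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= max_norm:
        return list(v)
    scale = max_norm / norm
    return [x * scale for x in v]


def expmap0(v, c=1.0):
    """Exponential map at the origin of a Poincaré ball with curvature -c.

    Maps a Euclidean (tangent) vector v to a point of norm
    tanh(sqrt(c) * ||v||) / sqrt(c), which lies inside the ball.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [x * scale for x in v]


# Hypothetical 2-D feature with Euclidean norm 5.
feature = [3.0, 4.0]
embedded = expmap0(clip_norm(feature, max_norm=1.0))
radius = math.sqrt(sum(x * x for x in embedded))
print(radius < 1.0)  # True: the point lies strictly inside the unit ball
```

Without clipping, a large-norm feature is pushed so close to the ball's boundary that tanh saturates in floating point. That saturation is one common source of the vanishing gradients the abstract mentions, and bounding the norm before the map is one way to keep embeddings away from it.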
Unmasking the imposters: towards improving the generalisation of deep learning methods for face presentation attack detection.
Identity theft has had a detrimental impact on the reliability of face recognition, which has been extensively employed in security applications. The most prevalent threats are presentation attacks: by using a photo, video, or mask of an authorized user, attackers can bypass face recognition systems. Face presentation attack detection is used to detect such attacks at the camera sensors of face recognition systems. Presentation attacks can be detected using convolutional neural networks, which are commonly used in computer vision applications. This research presents an in-depth analysis of current deep learning methods to examine various aspects of detecting face presentation attacks. A number of new techniques are implemented and evaluated in this study, including pre-trained models, manual feature extraction, and data aggregation. The thesis explores the effectiveness of various machine learning and deep learning models in improving detection performance, using publicly available datasets with dataset partitions different from those specified in the official dataset protocols. Furthermore, the research investigates how deep models and data aggregation can be used to detect face presentation attacks, as well as a novel approach that combines manual features with deep features to improve detection accuracy. Moreover, task-specific features are extracted using pre-trained deep models to further enhance detection performance and generalisation. This work is motivated by the need to achieve generalisation against new and rapidly evolving attack variants. It is possible to extract identifiable features from presentation attack variants in order to detect them. However, new methods are needed to deal with emerging attacks and improve the generalisation capability. This thesis examines the necessary measures to detect face presentation attacks in a more robust and generalised manner.