
    A Review on Face Anti-Spoofing

    A biometric system is a security technology that uses information based on a living person's characteristics, such as the face, to verify or recognize identity. Face recognition has numerous real-world applications, such as access control and surveillance, but it is vulnerable to spoofing. Face anti-spoofing, the task of preventing fake authorization through a breach of a face recognition system with a photo, video, mask, or other substitute for an authorized person's face, is used to overcome this challenge. There is also increasing research on new datasets that provide new attack types or greater diversity to reach better generalization. This paper reviews recent developments, including a general understanding of face spoofing, anti-spoofing methods, and the latest approaches to defending against various spoof types.

    PipeNet: Selective Modal Pipeline of Fusion Network for Multi-Modal Face Anti-Spoofing

    Face anti-spoofing has become an increasingly important security feature for authentication systems, due to rampant and easily launched presentation attacks. Addressing the shortage of multi-modal face datasets, CASIA recently released the largest up-to-date dataset, CASIA-SURF Cross-ethnicity Face Anti-spoofing (CeFA), covering 3 ethnicities, 3 modalities, 1607 subjects, and 2D plus 3D attack types in four protocols, and focusing on the challenge of improving the generalization capability of face anti-spoofing in cross-ethnicity and multi-modal continuous data. In this paper, we propose a novel pipeline-based multi-stream CNN architecture called PipeNet for multi-modal face anti-spoofing. Unlike previous works, Selective Modal Pipeline (SMP) is designed to enable a customized pipeline for each data modality to take full advantage of multi-modal data. Limited Frame Vote (LFV) is designed to ensure stable and accurate prediction for video classification. The proposed method won third place in the final ranking of the Chalearn Multi-modal Cross-ethnicity Face Anti-spoofing Recognition Challenge@CVPR2020. Our final submission achieves an Average Classification Error Rate (ACER) of 2.21 with a standard deviation of 1.26 on the test set.
    Comment: Accepted to appear in CVPR2020 WM
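
    The ACER figure quoted in the PipeNet abstract is commonly defined as the mean of the attack and bona fide error rates (APCER and BPCER). The sketch below illustrates that arithmetic only; it is not the challenge's evaluation code. The function name `acer`, the label convention (1 = bona fide, 0 = attack), and the pooling of all attack types into a single APCER are assumptions made for illustration; a stricter protocol would take the worst APCER over individual attack types.

```python
def acer(labels, preds):
    """Return (APCER, BPCER, ACER) for binary decisions.

    Assumed convention: 1 = bona fide (live), 0 = presentation attack.
    All attack types are pooled into a single APCER (a simplification).
    """
    attack_preds = [p for l, p in zip(labels, preds) if l == 0]
    bona_preds = [p for l, p in zip(labels, preds) if l == 1]
    if not attack_preds or not bona_preds:
        raise ValueError("need at least one attack and one bona fide sample")

    # APCER: fraction of attacks wrongly accepted as bona fide.
    apcer = sum(1 for p in attack_preds if p == 1) / len(attack_preds)
    # BPCER: fraction of bona fide samples wrongly rejected as attacks.
    bpcer = sum(1 for p in bona_preds if p == 0) / len(bona_preds)
    return apcer, bpcer, (apcer + bpcer) / 2


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    preds = [0, 0, 0, 1, 1, 1, 0, 0]
    print(acer(labels, preds))
```

    Averaging the two rates keeps a detector from scoring well by simply rejecting (or accepting) everything.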

    Benchmarking Joint Face Spoofing and Forgery Detection with Visual and Physiological Cues

    Face anti-spoofing (FAS) and face forgery detection play vital roles in securing face biometric systems from presentation attacks (PAs) and malicious digital manipulation (e.g., deepfakes). Despite promising performance with large-scale data and powerful deep models, the generalization problem of existing approaches is still an open issue. Most recent approaches focus on 1) unimodal visual appearance or physiological (i.e., remote photoplethysmography (rPPG)) cues; and 2) separate feature representations for FAS or face forgery detection. On one hand, unimodal appearance and rPPG features are vulnerable to high-fidelity 3D face mask attacks and video replay attacks, respectively, inspiring us to design reliable multi-modal fusion mechanisms for generalized face attack detection. On the other hand, there are rich common features across FAS and face forgery detection tasks (e.g., periodic rPPG rhythms and vanilla appearance for bona fides), providing solid evidence for designing a joint FAS and face forgery detection system in a multi-task learning fashion. In this paper, we establish the first joint face spoofing and forgery detection benchmark using both visual appearance and physiological rPPG cues. To enhance rPPG periodicity discrimination, we design a two-branch physiological network that takes both the facial spatio-temporal rPPG signal map and its continuous wavelet transformed counterpart as inputs. To mitigate modality bias and improve fusion efficacy, we apply weighted batch and layer normalization to both appearance and rPPG features before multi-modal fusion. We find that the generalization capacities of both unimodal (appearance or rPPG) and multi-modal (appearance+rPPG) models can be clearly improved via joint training on these two tasks. We hope this new benchmark will facilitate future research in both the FAS and deepfake detection communities.
    Comment: Accepted by IEEE Transactions on Dependable and Secure Computing (TDSC). Corresponding authors: Zitong Yu and Wenhan Yan

    Taming Self-Supervised Learning for Presentation Attack Detection: De-Folding and De-Mixing

    Biometric systems are vulnerable to Presentation Attacks (PA) performed using various Presentation Attack Instruments (PAIs). Even though there are numerous Presentation Attack Detection (PAD) techniques based on both deep learning and hand-crafted features, the generalization of PAD to unknown PAIs is still a challenging problem. In this work, we empirically demonstrate that the initialization of the PAD model is a crucial factor for generalization, which is rarely discussed in the community. Based on this observation, we propose a self-supervised learning-based method, denoted as DF-DM. Specifically, DF-DM is based on a global-local view coupled with De-Folding and De-Mixing to derive the task-specific representation for PAD. During De-Folding, the proposed technique learns region-specific features to represent samples in a local pattern by explicitly minimizing a generative loss. De-Mixing, in turn, drives detectors to obtain instance-specific features with global information for a more comprehensive representation by minimizing interpolation-based consistency. Extensive experimental results show that the proposed method achieves significant improvements in both face and fingerprint PAD on more complicated and hybrid datasets when compared with state-of-the-art methods. When trained on CASIA-FASD and Idiap Replay-Attack, the proposed method achieves an 18.60% Equal Error Rate (EER) on OULU-NPU and MSU-MFSD, exceeding baseline performance by 9.54%. The source code of the proposed technique is available at https://github.com/kongzhecn/dfdm.
    Comment: Accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS
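
    The Equal Error Rate cited in the DF-DM abstract is the operating point at which the false acceptance rate and false rejection rate coincide. The following is a minimal sketch of how such a value can be estimated from detector scores by sweeping a threshold; it is not taken from the DF-DM repository. The function name `equal_error_rate`, the convention that higher scores mean "more likely bona fide", and the choice to average the two rates at the closest crossing are all assumptions for illustration.

```python
def equal_error_rate(bona_scores, attack_scores):
    """Estimate the EER by sweeping thresholds over the observed scores.

    Assumed convention: a sample is accepted as bona fide when
    score >= threshold. Returns (eer, threshold) at the threshold where
    the false acceptance and false rejection rates are closest.
    """
    if not bona_scores or not attack_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap, eer_estimate, threshold)
    for t in sorted(set(bona_scores) | set(attack_scores)):
        # False acceptance: attack samples scoring at or above the threshold.
        far = sum(s >= t for s in attack_scores) / len(attack_scores)
        # False rejection: bona fide samples scoring below the threshold.
        frr = sum(s < t for s in bona_scores) / len(bona_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    bona = [0.9, 0.8, 0.7, 0.4]
    attack = [0.1, 0.2, 0.3, 0.6]
    print(equal_error_rate(bona, attack))
```

    With finite score sets the two rates rarely match exactly, which is why the sketch reports the midpoint at the nearest crossing rather than an exact intersection.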

    STIDNet: Identity-Aware Face Forgery Detection with Spatiotemporal Knowledge Distillation

    The impressive development of facial manipulation techniques has raised severe public concerns. Identity-aware methods, which rely on an additional reference video and are especially suitable for protecting celebrities, are seen as one of the promising approaches to face forgery detection. However, without in-depth observation of the characteristics of fake videos, most existing identity-aware algorithms are naive imitations of face verification models and fail to exploit discriminative information. In this article, we argue that both spatial and temporal perspectives must be considered to obtain adequate inconsistency clues, and we propose a novel forgery detector named SpatioTemporal IDentity network (STIDNet). To effectively capture heterogeneous spatiotemporal information in a unified formulation, STIDNet follows a knowledge distillation architecture in which the student identity extractor receives supervision from a spatial information encoder (SIE) and a temporal information encoder (TIE) through multiteacher training. Specifically, a region-sensitive identity modelling paradigm is proposed in SIE by introducing facial blending augmentation with a uniform identity label, thus encouraging the model to focus on spatially discriminative regions such as the outer face. Meanwhile, considering the strong temporal correlation between audio and talking-face video, our TIE is devised in a cross-modal pattern in which audio information supervises the model in exploiting personalized temporal movements. Benefiting from knowledge transfer from SIE and TIE, STIDNet is able to capture an individual's essential spatiotemporal identity attributes and is sensitive to even subtle identity deviations caused by manipulation. Extensive experiments indicate the superiority of STIDNet over previous works. Moreover, we demonstrate that STIDNet is more suitable for real-world implementation in terms of model complexity and reference set size.

    Deep spatial gradient and temporal depth learning for face anti-spoofing

    Face anti-spoofing is critical to the security of face recognition systems. Depth-supervised learning has been proven to be one of the most effective methods for face anti-spoofing. Despite this success, most previous works still formulate the problem as a single-frame multi-task one by simply augmenting the loss with depth, while neglecting detailed fine-grained information and the interplay between facial depths and moving patterns. In contrast, we design a new approach to detect presentation attacks from multiple frames based on two insights: 1) detailed discriminative clues (e.g., spatial gradient magnitude) between living and spoofing faces may be discarded through stacked vanilla convolutions, and 2) the dynamics of 3D moving faces provide important clues for detecting spoofing faces. The proposed method captures discriminative details via a Residual Spatial Gradient Block (RSGB) and efficiently encodes spatio-temporal information with a Spatio-Temporal Propagation Module (STPM). Moreover, a novel Contrastive Depth Loss is presented for more accurate depth supervision. To assess the efficacy of our method, we also collect a Double-modal Anti-spoofing Dataset (DMAD), which provides actual depth for each sample. The experiments demonstrate that the proposed approach achieves state-of-the-art results on five benchmark datasets: OULU-NPU, SiW, CASIA-MFSD, Replay-Attack, and the new DMAD. Codes will be available at https://github.com/clks-wzz/FAS-SGTD