Learning Meta Model for Zero- and Few-shot Face Anti-spoofing
Face anti-spoofing is crucial to the security of face recognition systems.
Most previous methods formulate face anti-spoofing as a supervised learning
problem of detecting various predefined presentation attacks, which requires
large-scale training data to cover as many attacks as possible. However, the
trained model easily overfits to several common attacks and remains vulnerable to
unseen attacks. To overcome this challenge, the detector should: 1) learn
discriminative features that can generalize to unseen spoofing types from
predefined presentation attacks; 2) quickly adapt to new spoofing types by
learning from both the predefined attacks and a few examples of the new
spoofing types. Therefore, we define face anti-spoofing as a zero- and few-shot
learning problem. In this paper, we propose a novel Adaptive Inner-update Meta
Face Anti-Spoofing (AIM-FAS) method to tackle this problem through
meta-learning. Specifically, AIM-FAS trains a meta-learner focusing on the task
of detecting unseen spoofing types by learning from predefined living and
spoofing faces and a few examples of new attacks. To assess the proposed
approach, we propose several benchmarks for zero- and few-shot FAS. Experiments
show its superior performance over existing methods on the presented benchmarks
and in existing zero-shot FAS protocols.
Comment: Accepted by AAAI202
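The zero- and few-shot formulation above can be illustrated with a minimal sketch of an adaptive inner-update step: a meta-initialised detector takes one gradient step on a few support examples of a new spoof type, MAML-style. The logistic-regression learner, feature dimensions, and learning rate here are illustrative stand-ins, not the paper's actual architecture.

```python
import numpy as np

def inner_update(w, X_s, y_s, lr=0.1):
    """One adaptive inner-update step on a few support examples
    (live vs. spoof), in the spirit of MAML-style meta-learning."""
    # logistic-regression scores and gradient on the support set
    p = 1.0 / (1.0 + np.exp(-X_s @ w))
    grad = X_s.T @ (p - y_s) / len(y_s)
    return w - lr * grad  # task-adapted weights for the new spoof type

rng = np.random.default_rng(0)
w = np.zeros(4)                       # hypothetical meta-initialised weights
X_s = rng.normal(size=(6, 4))         # 6 support faces, 4-dim features
y_s = np.array([0, 0, 0, 1, 1, 1.0])  # 0 = live, 1 = new spoof type
w_adapted = inner_update(w, X_s, y_s)
```

In the zero-shot case the support set of the new attack is empty and the meta-initialised weights are used directly; in the few-shot case one or more such inner updates adapt them.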
S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face
recognition system by presenting spoofed faces. State-of-the-art FAS techniques
predominantly rely on deep learning models but their cross-domain
generalization capabilities are often hindered by the domain shift problem,
which arises due to different distributions between training and testing data.
In this study, we develop a generalized FAS method under the Efficient
Parameter Transfer Learning (EPTL) paradigm, where we adapt the pre-trained
Vision Transformer models for the FAS task. During training, the adapter
modules are inserted into the pre-trained ViT model, and the adapters are
updated while the other pre-trained parameters remain fixed. We identify a
limitation of previous vanilla adapters: they are based on linear layers, which
lack a spoofing-aware inductive bias and thus restrict
cross-domain generalization. To address this limitation and achieve
cross-domain generalized FAS, we propose a novel Statistical Adapter
(S-Adapter) that gathers local discriminative and statistical information from
localized token histograms. To further improve the generalization of the
statistical tokens, we propose a novel Token Style Regularization (TSR), which
aims to reduce domain style variance by regularizing Gram matrices extracted
from tokens across different domains. Our experimental results demonstrate that
our proposed S-Adapter and TSR provide significant benefits in both zero-shot
and few-shot cross-domain testing, outperforming state-of-the-art methods on
several benchmark tests. We will release the source code upon acceptance.
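The Token Style Regularization idea above can be sketched with plain arrays: compute a Gram matrix over token channels for each domain and penalise the distance between the two, so style statistics are pulled together. The shapes and the simple mean-squared penalty are assumptions for illustration, not the paper's exact loss.

```python
import numpy as np

def gram(tokens):
    """Gram matrix of a token batch: channel-wise second-order statistics
    that act as a proxy for domain 'style'."""
    t = tokens.reshape(-1, tokens.shape[-1])  # (batch * tokens, channels)
    return t.T @ t / t.shape[0]

def tsr_loss(tokens_a, tokens_b):
    """TSR-style sketch: penalise the gap between Gram matrices of tokens
    from two domains to reduce cross-domain style variance."""
    return np.mean((gram(tokens_a) - gram(tokens_b)) ** 2)

rng = np.random.default_rng(1)
dom_a = rng.normal(size=(8, 16, 32))        # 8 images, 16 tokens, 32 channels
dom_b = rng.normal(size=(8, 16, 32)) * 2.0  # same content, shifted 'style'
loss = tsr_loss(dom_a, dom_b)               # positive: styles differ
```

Identical token distributions yield a zero penalty, so the regulariser only acts on genuine style gaps between domains.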
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on handcrafted features, which are outdated and uninspiring
for the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary label (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB camera, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
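The pixel-wise supervision the survey highlights (e.g., pseudo depth maps) can be sketched as follows: live faces are regressed toward a depth-like target while flat spoof media (prints, screens) are regressed toward zero. The smooth bump standing in for live-face geometry is purely illustrative, not a real pseudo-depth estimator.

```python
import numpy as np

def pixel_wise_loss(pred_depth, is_live):
    """Pixel-wise supervision sketch: live faces get a pseudo depth map as
    target, spoof faces an all-zero map (a common deep-FAS recipe)."""
    H, W = pred_depth.shape
    if is_live:
        # hypothetical pseudo depth: a smooth bump standing in for face shape
        yy, xx = np.mgrid[0:H, 0:W]
        target = np.exp(-((yy - H / 2) ** 2 + (xx - W / 2) ** 2) / (H * W / 4))
    else:
        target = np.zeros((H, W))  # flat: printed photos/screens have no depth
    return np.mean((pred_depth - target) ** 2)

pred = np.full((32, 32), 0.5)  # a toy constant depth prediction
live_loss = pixel_wise_loss(pred, True)
spoof_loss = pixel_wise_loss(pred, False)
```

Compared with a single binary label per image, this supervision gives the network a dense, geometry-aware training signal.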
Unmasking the imposters: towards improving the generalisation of deep learning methods for face presentation attack detection.
Identity theft has had a detrimental impact on the reliability of face recognition, which has been extensively employed in security applications. The most prevalent such threats are presentation attacks: by using a photo, video, or mask of an authorized user, attackers can bypass face recognition systems. Face presentation attack detection identifies these fake presentations from the imagery captured by the system's camera sensors, and convolutional neural networks, widely used in computer vision, are a common means of detection. This research provides an in-depth analysis of current deep learning methods for detecting face presentation attacks. A number of new techniques are implemented and evaluated, including pre-trained models, manual feature extraction, and data aggregation. The thesis explores the effectiveness of various machine learning and deep learning models in improving detection performance, using publicly available datasets with different partitions than those specified in the official dataset protocols. Furthermore, the research investigates how deep models and data aggregation can be used to detect face presentation attacks, and introduces a novel approach that combines manual features with deep features to improve detection accuracy. Moreover, task-specific features are extracted using pre-trained deep models to further enhance detection performance and generalisation. This work is motivated by the need to generalise against new and rapidly evolving attack variants: identifiable features can be extracted from known presentation attack variants, but new methods are needed to handle emerging attacks and improve generalisation capability. This thesis examines the measures necessary to detect face presentation attacks in a more robust and generalised manner.
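The manual-plus-deep feature combination described above can be sketched as a simple late fusion: normalise each feature vector so neither branch dominates, then concatenate before classification. The dimensions (a 128-dim deep embedding, a 59-bin uniform-LBP-style histogram) are hypothetical examples, not the thesis's actual configuration.

```python
import numpy as np

def fuse_features(deep_feat, manual_feat):
    """Sketch of manual+deep fusion: L2-normalise each feature vector and
    concatenate, so the downstream classifier sees both on equal footing."""
    d = deep_feat / (np.linalg.norm(deep_feat) + 1e-8)
    m = manual_feat / (np.linalg.norm(manual_feat) + 1e-8)
    return np.concatenate([d, m])

deep = np.ones(128) * 3.0  # e.g. an embedding from a pre-trained CNN
manual = np.ones(59)       # e.g. a handcrafted texture histogram
fused = fuse_features(deep, manual)  # 187-dim input for a classifier
```

Without the normalisation step, the branch with the larger raw magnitude would dominate any distance- or margin-based classifier trained on the fused vector.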
Dual Teacher Knowledge Distillation with Domain Alignment for Face Anti-spoofing
Face recognition systems have raised concerns due to their vulnerability to
different presentation attacks, and system security has become an increasingly
critical concern. Although many face anti-spoofing (FAS) methods perform well
in intra-dataset scenarios, their generalization remains a challenge. To
address this issue, some methods adopt domain adversarial training (DAT) to
extract domain-invariant features. However, the competition between the encoder
and the domain discriminator can make the network difficult to train and to
converge. In this paper, we propose a domain adversarial attack (DAA) method to
mitigate the training instability problem by adding perturbations to the input
images, which makes them indistinguishable across domains and enables domain
alignment. Moreover, since models trained on limited data and types of attacks
cannot generalize well to unknown attacks, we propose a dual perceptual and
generative knowledge distillation framework for face anti-spoofing that
utilizes pre-trained face-related models containing rich face priors.
Specifically, we adopt two different face-related models as teachers to
transfer knowledge to the target student model. The pre-trained teacher models
are not from the task of face anti-spoofing but from perceptual and generative
tasks, respectively, which implicitly augment the data. By combining both DAA
and dual-teacher knowledge distillation, we develop a dual teacher knowledge
distillation with domain alignment framework (DTDA) for face anti-spoofing. The
advantage of our proposed method has been verified through extensive ablation
studies and comparison with state-of-the-art methods on public datasets across
multiple protocols.
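The domain adversarial attack idea, perturbing inputs so that domains become harder to distinguish, can be sketched as an FGSM-like step against a domain classifier. The linear discriminator, step size, and gradient form below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def domain_adversarial_attack(x, w_dom, eps=0.03):
    """DAA-style sketch: nudge the input against the gradient of a (toy,
    linear) domain discriminator's score magnitude, making the sample
    harder to assign to a domain and enabling domain alignment."""
    score = x @ w_dom                        # linear domain score
    grad = w_dom * (1 if score > 0 else -1)  # gradient of |score| w.r.t. x
    return x - eps * np.sign(grad)           # small step that shrinks |score|

rng = np.random.default_rng(2)
w_dom = rng.normal(size=16)  # hypothetical domain-discriminator weights
x = rng.normal(size=16)      # one input feature vector
x_adv = domain_adversarial_attack(x, w_dom)
```

Because the perturbation is bounded per dimension by eps, the image content is nearly unchanged while its domain signature is suppressed, avoiding the unstable encoder-versus-discriminator competition of standard DAT.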
Instance-Aware Domain Generalization for Face Anti-Spoofing
Face anti-spoofing (FAS) based on domain generalization (DG) has been
recently studied to improve the generalization on unseen scenarios. Previous
methods typically rely on domain labels to align the distribution of each
domain for learning domain-invariant representations. However, artificial
domain labels are coarse-grained and subjective, which cannot reflect real
domain distributions accurately. Besides, such domain-aware methods focus on
domain-level alignment, which is not fine-grained enough to ensure that learned
representations are insensitive to domain styles. To address these issues, we
propose a novel perspective for DG FAS that aligns features on the instance
level without the need for domain labels. Specifically, Instance-Aware Domain
Generalization framework is proposed to learn the generalizable feature by
weakening the features' sensitivity to instance-specific styles. Concretely, we
propose Asymmetric Instance Adaptive Whitening to adaptively eliminate the
style-sensitive feature correlation, boosting the generalization. Moreover,
Dynamic Kernel Generator and Categorical Style Assembly are proposed to first
extract the instance-specific features and then generate the style-diversified
features with large style shifts, respectively, further facilitating the
learning of style-insensitive features. Extensive experiments and analysis
demonstrate the superiority of our method over state-of-the-art competitors.
Code will be publicly available at https://github.com/qianyuzqy/IADG.
Comment: Accepted to IEEE/CVF Conference on Computer Vision and Pattern
Recognition (CVPR), 202
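The instance-level whitening idea above can be sketched with classical ZCA whitening applied per sample: decorrelating a single instance's channel covariance suppresses the style-carrying feature correlations. This plain eigendecomposition stands in for the paper's learned, asymmetric variant; shapes and the regulariser are illustrative.

```python
import numpy as np

def instance_whiten(feat, eps=1e-6):
    """Instance-level whitening sketch: decorrelate one sample's feature
    map (tokens x channels), suppressing style-specific channel
    correlations in the spirit of instance adaptive whitening."""
    f = feat - feat.mean(axis=0)              # centre each channel
    cov = f.T @ f / (f.shape[0] - 1)          # per-instance covariance
    vals, vecs = np.linalg.eigh(cov + eps * np.eye(cov.shape[0]))
    W = vecs @ np.diag(vals ** -0.5) @ vecs.T  # ZCA whitening matrix
    return f @ W

rng = np.random.default_rng(3)
feat = rng.normal(size=(64, 8)) @ rng.normal(size=(8, 8))  # correlated channels
white = instance_whiten(feat)  # channel covariance is now ~identity
```

After whitening, the sample's channel covariance is close to the identity, so downstream layers cannot rely on instance-specific style correlations.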
FLIP: Cross-domain Face Anti-spoofing with Language Guidance
Face anti-spoofing (FAS) or presentation attack detection is an essential
component of face recognition systems deployed in security-critical
applications. Existing FAS methods have poor generalizability to unseen spoof
types, camera sensors, and environmental conditions. Recently, vision
transformer (ViT) models have been shown to be effective for the FAS task due
to their ability to capture long-range dependencies among image patches.
However, adaptive modules or auxiliary loss functions are often required to
adapt pre-trained ViT weights learned on large-scale datasets such as ImageNet.
In this work, we first show that initializing ViTs with multimodal (e.g., CLIP)
pre-trained weights improves generalizability for the FAS task, which is in
line with the zero-shot transfer capabilities of vision-language pre-trained
(VLP) models. We then propose a novel approach for robust cross-domain FAS by
grounding visual representations with the help of natural language.
Specifically, we show that aligning the image representation with an ensemble
of class descriptions (based on natural language semantics) improves FAS
generalizability in low-data regimes. Finally, we propose a multimodal
contrastive learning strategy to boost feature generalization further and
bridge the gap between source and target domains. Extensive experiments on
three standard protocols demonstrate that our method significantly outperforms
the state-of-the-art methods, achieving better zero-shot transfer performance
than five-shot transfer of adaptive ViTs. Code:
https://github.com/koushiksrivats/FLIP
Comment: Accepted to ICCV-2023. Project Page:
https://koushiksrivats.github.io/FLIP
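The language-guided classification described above can be sketched as scoring an image embedding against ensembles of class-description embeddings and averaging the cosine similarities. The embeddings here are toy stand-ins for a CLIP-style text and image encoder; no real pre-trained weights are involved.

```python
import numpy as np

def classify_with_prompts(img_emb, prompts_live, prompts_spoof):
    """FLIP-style sketch: compare an image embedding against ensembles of
    natural-language class descriptions and pick the closer class."""
    def score(prompts):
        p = prompts / np.linalg.norm(prompts, axis=1, keepdims=True)
        i = img_emb / np.linalg.norm(img_emb)
        return np.mean(p @ i)  # average cosine similarity over the ensemble
    return "live" if score(prompts_live) > score(prompts_spoof) else "spoof"

# toy 3-dim embeddings standing in for encoder outputs
img = np.array([1.0, 0.0, 0.0])
live_prompts = np.array([[0.9, 0.1, 0.0], [1.0, 0.2, 0.1]])    # near img
spoof_prompts = np.array([[-1.0, 0.0, 0.0], [-0.8, 0.3, 0.0]])  # far from img
label = classify_with_prompts(img, live_prompts, spoof_prompts)
```

Averaging over an ensemble of descriptions, rather than matching a single prompt, is what smooths out phrasing-specific noise and helps generalisation in low-data regimes.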