Learning Meta Model for Zero- and Few-shot Face Anti-spoofing
Face anti-spoofing is crucial to the security of face recognition systems.
Most previous methods formulate face anti-spoofing as a supervised learning
problem to detect various predefined presentation attacks, which requires
large-scale training data to cover as many attacks as possible. However, the
trained model easily overfits to several common attacks and remains vulnerable
to unseen attacks. To overcome this challenge, the detector should: 1) learn
discriminative features that can generalize to unseen spoofing types from
predefined presentation attacks; 2) quickly adapt to new spoofing types by
learning from both the predefined attacks and a few examples of the new
spoofing types. Therefore, we define face anti-spoofing as a zero- and few-shot
learning problem. In this paper, we propose a novel Adaptive Inner-update Meta
Face Anti-Spoofing (AIM-FAS) method to tackle this problem through
meta-learning. Specifically, AIM-FAS trains a meta-learner focusing on the task
of detecting unseen spoofing types by learning from predefined living and
spoofing faces and a few examples of new attacks. To assess the proposed
approach, we propose several benchmarks for zero- and few-shot FAS. Experiments
show that it outperforms existing methods on the presented benchmarks and on
existing zero-shot FAS protocols.
Comment: Accepted by AAAI202
S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face
recognition system by presenting spoofed faces. State-of-the-art FAS techniques
predominantly rely on deep learning models but their cross-domain
generalization capabilities are often hindered by the domain shift problem,
which arises due to different distributions between training and testing data.
In this study, we develop a generalized FAS method under the Efficient
Parameter Transfer Learning (EPTL) paradigm, where we adapt the pre-trained
Vision Transformer models for the FAS task. During training, the adapter
modules are inserted into the pre-trained ViT model, and the adapters are
updated while other pre-trained parameters remain fixed. We find that previous
vanilla adapters are limited because they are based on linear layers, which
lack a spoofing-aware inductive bias and thus restrict cross-domain
generalization. To address this limitation and achieve
cross-domain generalized FAS, we propose a novel Statistical Adapter
(S-Adapter) that gathers local discriminative and statistical information from
localized token histograms. To further improve the generalization of the
statistical tokens, we propose a novel Token Style Regularization (TSR), which
aims to reduce domain style variance by regularizing Gram matrices extracted
from tokens across different domains. Our experimental results demonstrate that
our proposed S-Adapter and TSR provide significant benefits in both zero-shot
and few-shot cross-domain testing, outperforming state-of-the-art methods on
several benchmark tests. We will release the source code upon acceptance.
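The Gram-matrix idea behind Token Style Regularization can be illustrated with a minimal sketch. It assumes tokens are plain lists of channel values and that the "style gap" between two domains is the mean squared difference of their Gram matrices; the function names `gram_matrix` and `style_gap` and the toy data are hypothetical, and the paper's actual loss may differ.

```python
def gram_matrix(tokens):
    """Return the C x C Gram matrix of N token vectors (each of length C),
    normalized by the number of tokens."""
    n = len(tokens)
    c = len(tokens[0])
    return [[sum(t[i] * t[j] for t in tokens) / n for j in range(c)]
            for i in range(c)]


def style_gap(tokens_a, tokens_b):
    """Mean squared difference between the Gram matrices of two token sets.
    This is an assumed stand-in for a style regularizer, not the paper's loss."""
    ga, gb = gram_matrix(tokens_a), gram_matrix(tokens_b)
    c = len(ga)
    return sum((ga[i][j] - gb[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)


# Toy data: 3 tokens with 2 channels from each of two hypothetical domains.
domain_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
domain_b = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]

gap_same = style_gap(domain_a, domain_a)  # identical statistics, no gap
gap_diff = style_gap(domain_a, domain_b)  # scaled tokens, non-zero gap
```

Minimizing a quantity like `gap_diff` during training would push token statistics from different domains toward each other, which is the intuition the abstract describes.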
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on handcrafted features, which are outdated and uninspiring
for the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary labels (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB cameras, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
FLIP: Cross-domain Face Anti-spoofing with Language Guidance
Face anti-spoofing (FAS) or presentation attack detection is an essential
component of face recognition systems deployed in security-critical
applications. Existing FAS methods have poor generalizability to unseen spoof
types, camera sensors, and environmental conditions. Recently, vision
transformer (ViT) models have been shown to be effective for the FAS task due
to their ability to capture long-range dependencies among image patches.
However, adaptive modules or auxiliary loss functions are often required to
adapt pre-trained ViT weights learned on large-scale datasets such as ImageNet.
In this work, we first show that initializing ViTs with multimodal (e.g., CLIP)
pre-trained weights improves generalizability for the FAS task, which is in
line with the zero-shot transfer capabilities of vision-language pre-trained
(VLP) models. We then propose a novel approach for robust cross-domain FAS by
grounding visual representations with the help of natural language.
Specifically, we show that aligning the image representation with an ensemble
of class descriptions (based on natural language semantics) improves FAS
generalizability in low-data regimes. Finally, we propose a multimodal
contrastive learning strategy to boost feature generalization further and
bridge the gap between source and target domains. Extensive experiments on
three standard protocols demonstrate that our method significantly outperforms
the state-of-the-art methods, achieving better zero-shot transfer performance
than five-shot transfer of adaptive ViTs. Code:
https://github.com/koushiksrivats/FLIP
Comment: Accepted to ICCV-2023. Project Page:
https://koushiksrivats.github.io/FLIP
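Aligning an image representation with an ensemble of class descriptions can be sketched roughly as follows. The sketch assumes each description has already been turned into an embedding by some text encoder, averages the normalized embeddings into one prototype per class, and picks the class with the highest cosine similarity. The 3-D vectors and the names `class_prototype` and `classify` are hypothetical and are not taken from the FLIP code.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def class_prototype(text_embeddings):
    """Average the normalized embeddings of several descriptions of one class,
    then renormalize the mean."""
    unit = [normalize(e) for e in text_embeddings]
    dim = len(unit[0])
    mean = [sum(e[k] for e in unit) / len(unit) for k in range(dim)]
    return normalize(mean)


def classify(image_embedding, prototypes):
    """Return the best-matching class name and all cosine similarity scores."""
    img = normalize(image_embedding)
    scores = {name: sum(a * b for a, b in zip(img, proto))
              for name, proto in prototypes.items()}
    return max(scores, key=scores.get), scores


# Hypothetical embeddings standing in for text-encoder outputs
# of two descriptions per class.
prototypes = {
    "live": class_prototype([[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]]),
    "spoof": class_prototype([[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]),
}
label, scores = classify([0.8, 0.2, 0.0], prototypes)
```

Averaging several descriptions per class is one common way to make the text-side prototype less sensitive to the wording of any single prompt.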
Domain-Generalized Face Anti-Spoofing with Unknown Attacks
Although face anti-spoofing (FAS) methods have achieved remarkable
performance on specific domains or attack types, few studies have focused on
the simultaneous presence of domain changes and unknown attacks, which is
closer to real application scenarios. To handle domain-generalized unknown
attacks, we introduce a new method, DGUA-FAS, which consists of a
Transformer-based feature extractor and a synthetic unknown attack sample
generator (SUASG). The SUASG network simulates unknown attack samples to assist
the training of the feature extractor. Experimental results show that our
method achieves superior performance on domain generalization FAS with known or
unknown attacks.
Comment: IEEE International Conference on Image Processing (ICIP 2023)
EnfoMax: Domain Entropy and Mutual Information Maximization for Domain Generalized Face Anti-spoofing
Face anti-spoofing (FAS) methods perform well under intra-domain setups.
However, their cross-domain performance is unsatisfactory. As a result,
domain generalization (DG) methods have gained more attention in FAS. Existing
methods treat FAS as a simple binary classification task and propose a
heuristic training objective to learn domain-invariant features. However, there
is no theoretical explanation of what a domain-invariant feature is.
Additionally, without theoretical support, domain generalization
techniques such as adversarial training suffer from unstable training. To address
these issues, this paper proposes the EnfoMax framework, which uses information
theory to analyze cross-domain FAS tasks. This framework provides theoretical
guarantees and optimization objectives for domain-generalized FAS tasks.
EnfoMax maximizes the domain entropy and mutual information of live samples in
source domains without using adversarial learning. Experimental results
demonstrate that our approach performs well on extensive public datasets and
outperforms state-of-the-art methods.
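One plausible reading of "domain entropy" is the Shannon entropy of a predicted distribution over source domains: it is highest when a feature gives no hint about which domain it came from. The sketch below only illustrates that quantity on made-up probabilities; it is an assumption about the abstract's terminology, not the EnfoMax objective itself.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical domain posteriors over three source domains for one live sample.
confident = [0.98, 0.01, 0.01]   # the feature reveals its source domain
uniform = [1 / 3, 1 / 3, 1 / 3]  # the feature carries no domain cue

h_confident = entropy(confident)
h_uniform = entropy(uniform)     # upper bound for three domains: log(3)
```

Under this reading, pushing the entropy toward its maximum of log(K) for K source domains encourages features that do not encode domain identity, without needing an adversarial discriminator.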