Learning Meta Model for Zero- and Few-shot Face Anti-spoofing
Face anti-spoofing is crucial to the security of face recognition systems.
Most previous methods formulate face anti-spoofing as a supervised learning
problem to detect various predefined presentation attacks, which need large
scale training data to cover as many attacks as possible. However, the trained
model easily overfits to several common attacks and remains vulnerable to
unseen attacks. To overcome this challenge, the detector should: 1) learn
discriminative features that can generalize to unseen spoofing types from
predefined presentation attacks; 2) quickly adapt to new spoofing types by
learning from both the predefined attacks and a few examples of the new
spoofing types. Therefore, we define face anti-spoofing as a zero- and few-shot
learning problem. In this paper, we propose a novel Adaptive Inner-update Meta
Face Anti-Spoofing (AIM-FAS) method to tackle this problem through
meta-learning. Specifically, AIM-FAS trains a meta-learner focusing on the task
of detecting unseen spoofing types by learning from predefined living and
spoofing faces and a few examples of new attacks. To assess the proposed
approach, we propose several benchmarks for zero- and few-shot FAS. Experiments
show that it outperforms existing methods on the presented benchmarks and under
existing zero-shot FAS protocols.
Comment: Accepted by AAAI 2020.
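As a rough illustration of the optimization-based meta-learning that AIM-FAS builds on, the sketch below implements a generic first-order MAML-style inner/outer update for a binary FAS classifier in PyTorch. The function names, the fixed inner learning rate, and the single-logit output are illustrative assumptions; this is not the authors' adaptive inner-update implementation.

import copy
import torch
import torch.nn.functional as F

def inner_adapt(model, support_x, support_y, inner_lr=0.01, steps=3):
    # Clone the meta-learner and adapt it on the support set (predefined
    # living/spoofing faces plus a few examples of the new spoof type).
    learner = copy.deepcopy(model)
    opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
    for _ in range(steps):
        loss = F.binary_cross_entropy_with_logits(
            learner(support_x).squeeze(1), support_y.float())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return learner

def meta_step(model, meta_opt, tasks):
    # First-order meta-update: evaluate each adapted learner on its query
    # set (the unseen spoof type) and accumulate gradients on the shared
    # initialization.
    meta_opt.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        learner = inner_adapt(model, support_x, support_y)
        query_loss = F.binary_cross_entropy_with_logits(
            learner(query_x).squeeze(1), query_y.float())
        grads = torch.autograd.grad(query_loss, learner.parameters())
        for p, g in zip(model.parameters(), grads):
            p.grad = g if p.grad is None else p.grad + g
    meta_opt.step()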
Regularized Fine-grained Meta Face Anti-spoofing
Face presentation attacks have become an increasingly critical concern when
face recognition is widely applied. Many face anti-spoofing methods have been
proposed, but most of them ignore the generalization ability to unseen attacks.
To overcome the limitation, this work casts face anti-spoofing as a domain
generalization (DG) problem, and attempts to address this problem by developing
a new meta-learning framework called Regularized Fine-grained Meta-learning. To
let our face anti-spoofing model generalize well to unseen attacks, the
proposed framework trains our model to perform well in the simulated domain
shift scenarios, which is achieved by finding generalized learning directions
in the meta-learning process. Specifically, the proposed framework incorporates
the domain knowledge of face anti-spoofing as the regularization so that
meta-learning is conducted in the feature space regularized by the supervision
of domain knowledge. This makes our model more likely to find generalized
learning directions with the regularized meta-learning for the face
anti-spoofing task. Besides, to further enhance the generalization ability of
our model, the
proposed framework adopts a fine-grained learning strategy that simultaneously
conducts meta-learning in a variety of domain shift scenarios in each
iteration. Extensive experiments on four public datasets validate the
effectiveness of the proposed method.
Comment: Accepted by AAAI 2020. Codes are available at
https://github.com/rshaojimmy/AAAI2020-RFMetaFA
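A minimal sketch of the fine-grained meta-learning loop described above, assuming a PyTorch 2.x two-head model (classification logits plus a pseudo-depth map) and using a depth-regression term as a stand-in for the domain-knowledge regularization. Every iteration cycles over all meta-train/meta-test domain pairs to simulate domain shift. This is an assumption-laden illustration, not the authors' released code (see the repository above).

import itertools
import torch
import torch.nn.functional as F

def dg_meta_iteration(model, domains, inner_lr=1e-3, depth_weight=0.1):
    # domains: dict mapping a source-domain name to a batch of
    # (images, binary labels, pseudo depth maps); names are illustrative.
    meta_loss = 0.0
    for tr, te in itertools.permutations(list(domains), 2):
        x_tr, y_tr, d_tr = domains[tr]
        x_te, y_te, d_te = domains[te]
        logits, depth = model(x_tr)
        loss_tr = (F.cross_entropy(logits, y_tr)
                   + depth_weight * F.mse_loss(depth, d_tr))
        # One virtual gradient step on the meta-train domain ...
        grads = torch.autograd.grad(loss_tr, model.parameters(),
                                    create_graph=True)
        fast = {n: p - inner_lr * g
                for (n, p), g in zip(model.named_parameters(), grads)}
        # ... evaluated on the held-out meta-test domain (simulated shift).
        logits_te, depth_te = torch.func.functional_call(model, fast, (x_te,))
        loss_te = (F.cross_entropy(logits_te, y_te)
                   + depth_weight * F.mse_loss(depth_te, d_te))
        meta_loss = meta_loss + loss_tr + loss_te
    return meta_loss  # caller backpropagates this and takes an optimizer step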
Hyperbolic Face Anti-Spoofing
Learning generalized face anti-spoofing (FAS) models against presentation
attacks is essential for the security of face recognition systems. Previous FAS
methods usually encourage models to extract discriminative features, of which
the distances within the same class (bonafide or attack) are pushed close while
those between bonafide and attack are pulled away. However, these methods are
designed based on Euclidean distance, which generalizes poorly to unseen
attacks because of its poor ability to embed hierarchies. Motivated by the
evidence that different spoofing attacks are intrinsically hierarchical, we
propose to learn richer hierarchical and discriminative spoofing cues in
hyperbolic space. Specifically, for unimodal FAS learning, the feature
embeddings are projected into the Poincar\'e ball, and then the hyperbolic
binary logistic regression layer is cascaded for classification. To further
improve generalization, we conduct hyperbolic contrastive learning for the
bonafide only while relaxing the constraints on diverse spoofing attacks. To
alleviate the vanishing gradient problem in hyperbolic space, a new feature
clipping method is proposed to enhance the training stability of hyperbolic
models. Besides, we further design a multimodal FAS framework with Euclidean
multimodal feature decomposition and hyperbolic multimodal feature fusion &
classification. Extensive experiments on three benchmark datasets (i.e., WMCA,
PADISI-Face, and SiW-M) with diverse attack types demonstrate that the proposed
method can bring significant improvement compared to the Euclidean baselines on
unseen attack detection. In addition, the proposed framework also generalizes
well on four benchmark datasets (i.e., MSU-MFSD, IDIAP REPLAY-ATTACK,
CASIA-FASD, and OULU-NPU) with a limited number of attack types.
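For concreteness, the snippet below sketches two of the ingredients named in this abstract for a unit-curvature Poincaré ball: clipping the Euclidean feature norm before the exponential map (to ease the vanishing-gradient issue near the boundary) and the Poincaré geodesic distance. The clipping radius and tensor shapes are illustrative assumptions, not values from the paper.

import torch

def clip_features(x, r=2.0, eps=1e-5):
    # Clip Euclidean feature norms to at most r before projection.
    norm = x.norm(dim=-1, keepdim=True).clamp_min(eps)
    return x * torch.clamp(r / norm, max=1.0)

def expmap0(v, eps=1e-5):
    # Exponential map at the origin of the Poincaré ball (c = 1):
    # exp_0(v) = tanh(||v||) * v / ||v||, which always lands inside the ball.
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(norm) * v / norm

def poincare_distance(x, y, eps=1e-5):
    # Geodesic distance in the Poincaré ball (c = 1).
    sq = ((x - y) ** 2).sum(dim=-1)
    denom = (1 - (x ** 2).sum(dim=-1)) * (1 - (y ** 2).sum(dim=-1))
    return torch.acosh(1 + 2 * sq / denom.clamp_min(eps))

# Usage: embed backbone features, then compare bonafide and attack embeddings.
feats = torch.randn(4, 128)
z = expmap0(clip_features(feats))
d = poincare_distance(z[:1], z[1:])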
Domain Generalization via Ensemble Stacking for Face Presentation Attack Detection
Face Presentation Attack Detection (PAD) plays a pivotal role in securing
face recognition systems against spoofing attacks. Although great progress has
been made in designing face PAD methods, developing a model that can generalize
well to unseen test domains remains a significant challenge. Moreover, due to
different types of spoofing attacks, creating a dataset with a sufficient
number of samples for training deep neural networks is a laborious task. This
work proposes a comprehensive solution that combines synthetic data generation
and deep ensemble learning to enhance the generalization capabilities of face
PAD. Specifically, synthetic data is generated by blending a static image with
spatiotemporal encoded images using alpha composition and video distillation.
This way, we simulate motion blur with varying alpha values, thereby generating
diverse subsets of synthetic data that contribute to a more enriched training
set. Furthermore, multiple base models are trained on each subset of synthetic
data using stacked ensemble learning. This allows the models to learn
complementary features and representations from different synthetic subsets.
The meta-features generated by the base models are used as input to a new model
called the meta-model. The latter combines the predictions from the base
models, leveraging their complementary information to better handle unseen
target domains and enhance the overall performance. Experimental results on
four datasets demonstrate low half total error rates (HTERs) on three benchmark
datasets: CASIA-MFSD (8.92%), MSU-MFSD (4.81%), and OULU-NPU (6.70%). The
approach shows potential for advancing presentation attack detection by
utilizing large-scale synthetic data and the meta-model.
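A hedged sketch of the two components described above: alpha composition of a static frame with a spatiotemporal encoded image (varying alpha simulates different amounts of motion blur), and a stacked ensemble whose base-model scores become meta-features for a meta-model. The sklearn logistic regression and the predict_proba interface of the base models are stand-in assumptions, not details taken from the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression

def alpha_composite(static_img, temporal_img, alpha):
    # Blend a static frame with a spatiotemporal encoded image.
    return (alpha * static_img.astype(np.float32)
            + (1.0 - alpha) * temporal_img.astype(np.float32))

def stack_predictions(base_models, samples):
    # Meta-features: one spoof-probability score per base model and sample.
    return np.column_stack([m.predict_proba(samples)[:, 1]
                            for m in base_models])

def fit_meta_model(base_models, val_samples, val_labels):
    # The meta-model learns to combine the complementary base-model scores.
    meta_model = LogisticRegression()
    meta_model.fit(stack_predictions(base_models, val_samples), val_labels)
    return meta_model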
FedSIS: Federated Split Learning with Intermediate Representation Sampling for Privacy-preserving Generalized Face Presentation Attack Detection
Lack of generalization to unseen domains/attacks is the Achilles heel of most
face presentation attack detection (FacePAD) algorithms. Existing attempts to
enhance the generalizability of FacePAD solutions assume that data from
multiple source domains are available with a single entity to enable
centralized training. In practice, data from different source domains may be
collected by diverse entities, who are often unable to share their data due to
legal and privacy constraints. While collaborative learning paradigms such as
federated learning (FL) can overcome this problem, standard FL methods are
ill-suited for domain generalization because they struggle to surmount the twin
challenges of handling non-iid client data distributions during training and
generalizing to unseen domains during inference. In this work, a novel
framework called Federated Split learning with Intermediate representation
Sampling (FedSIS) is introduced for privacy-preserving domain generalization.
In FedSIS, a hybrid Vision Transformer (ViT) architecture is learned using a
combination of FL and split learning to achieve robustness against statistical
heterogeneity in the client data distributions without any sharing of raw data
(thereby preserving privacy). To further improve generalization to unseen
domains, a novel feature augmentation strategy called intermediate
representation sampling is employed, and discriminative information from
intermediate blocks of a ViT is distilled using a shared adapter network. The
FedSIS approach has been evaluated on two well-known benchmarks for
cross-domain FacePAD to demonstrate that it is possible to achieve
state-of-the-art generalization performance without data sharing. Code:
https://github.com/Naiftt/FedSIS
Comment: Accepted to the IEEE International Joint Conference on Biometrics
(IJCB), 2023
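A rough sketch of the intermediate representation sampling idea, assuming a ViT-like stack of token-level blocks, a shared adapter network, and mean pooling over tokens. Class and parameter names are hypothetical; this is not the FedSIS release linked above, and the federated/split-learning orchestration is omitted.

import random
import torch
import torch.nn as nn

class SampledViTHead(nn.Module):
    def __init__(self, blocks, dim=384, num_classes=2):
        super().__init__()
        self.blocks = nn.ModuleList(blocks)      # shared ViT-style blocks
        self.adapter = nn.Sequential(            # shared adapter network
            nn.Linear(dim, dim // 2), nn.GELU(), nn.Linear(dim // 2, dim))
        self.head = nn.Linear(dim, num_classes)

    def forward(self, tokens):
        # Sample an intermediate depth at random (feature augmentation),
        # then distil that block's representation through the adapter.
        depth = random.randint(1, len(self.blocks))
        x = tokens
        for blk in self.blocks[:depth]:
            x = blk(x)
        return self.head(self.adapter(x.mean(dim=1)))

# Example stand-in blocks (not the hybrid ViT used in the paper):
# blocks = [nn.TransformerEncoderLayer(384, 6, batch_first=True)
#           for _ in range(8)]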
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on handcrafted features, which are outdated and uninspiring for
the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary label (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB camera, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
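To make the two supervision styles listed in point 1) concrete, here is a minimal sketch contrasting binary-label supervision with pixel-wise pseudo-depth supervision; the 32x32 map size and the MSE loss are illustrative choices, not prescriptions from the survey.

import torch.nn.functional as F

def binary_supervision(logits, labels):
    # labels: 0 for bonafide, 1 for presentation attack; logits: (B, 1).
    return F.binary_cross_entropy_with_logits(logits.squeeze(1), labels.float())

def pixelwise_supervision(pred_depth, pseudo_depth):
    # pred_depth, pseudo_depth: (B, 1, 32, 32) maps; spoof targets are all-zero,
    # bonafide targets are the estimated face depth.
    return F.mse_loss(pred_depth, pseudo_depth)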
Explainable and Interpretable Face Presentation Attack Detection Methods
Decision support systems based on machine learning (ML) techniques are excelling in most artificial intelligence (AI) fields, outperforming other AI methods as well as humans. However, challenges still exist that do not favour the dominance of AI in some applications. This proposal focuses on a critical one: the lack of transparency and explainability, which reduces the trust in and accountability of an AI system. The fact that most AI methods still operate as complex black boxes keeps the inner processes that sustain their predictions unattainable. Awareness of these observations fosters the need to regulate many sensitive domains where AI has been applied, in order to interpret, explain and audit the reliability of ML-based systems.
Although modern-day biometric recognition (BR) systems are already benefiting from the performance gains achieved with AI (which can account for and learn subtle changes in the person to be authenticated or statistical mismatches between samples), they are still in the dark ages of black-box models, without reaping the benefits of the XAI field. This work will focus on studying AI explainability in the field of biometrics, focusing on particular use cases in BR, such as verification/identification of individuals and liveness detection (LD) (aka anti-spoofing).
The main goals of this work are: i) to become acquainted with the state of the art in explainability, biometric recognition and PAD methods; ii) to develop an experimental work xxxxx
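As one hypothetical starting point for the interpretability methodology (a sketch under assumptions, not a method this proposal commits to), the snippet below produces a Grad-CAM heatmap for a convolutional liveness classifier; the model, the chosen layer, and the two-class output are placeholders.

import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image, class_idx=1):
    # Heatmap of the regions that drive the 'attack' score (class_idx=1).
    activations, gradients = [], []
    h1 = target_layer.register_forward_hook(
        lambda m, i, o: activations.append(o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: gradients.append(go[0]))
    try:
        score = model(image.unsqueeze(0))[0, class_idx]
        model.zero_grad()
        score.backward()
        act, grad = activations[0], gradients[0]         # (1, C, H, W)
        weights = grad.mean(dim=(2, 3), keepdim=True)    # channel importance
        cam = F.relu((weights * act).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                            align_corners=False)
        return (cam / cam.max().clamp_min(1e-8)).squeeze()
    finally:
        h1.remove()
        h2.remove()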
Tasks 1st semester
(1) Study of the state of the art: bibliographic review on presentation attack detection
(2) Get acquainted with the previous work of the group on the topic
(3) Data preparation and data pre-processing
(4) Define the experimental protocol, including performance metrics
(5) Perform baseline experiments
(6) Write monograph
Tasks 2nd semester
(1) Update on the state of the art
(2) Data preparation and data pre-processing
(3) Propose and implement a methodology for interpretability in biometrics
(4) Evaluation of the performance and comparison with baseline and state-of-the-art approaches
(5) Dissertation writing
Main bibliographic references: (*)
[Doshi17] B. Kim and F. Doshi-Velez, "Interpretable machine learning: The fuss, the concrete and the questions," 2017.
[Mol19] C. Molnar, Interpretable Machine Learning, 2019.
[Sei18] C. Seibold, W. Samek, A. Hilsmann, and P. Eisert, "Accurate and robust neural networks for security related applications exampled by face morphing attacks," arXiv preprint arXiv:1806.04265, 2018.
[Seq20] A. F. Sequeira, J. T. Pinto, W. Silva, T. Gonçalves, and J. S. Cardoso, "Interpretable biometrics: Should we rethink how presentation attack detection is evaluated?," in 8th IWBF, 2020.
[Wilson18] W. Silva, K. Fernandes, M. J. Cardoso, and J. S. Cardoso, "Towards complementary explanations using deep neural networks," in Understanding and Interpreting Machine Learning in MICA, Springer, 2018.
[Wilson19] W. Silva, K. Fernandes, and J. S. Cardoso, "How to produce complementary explanations using an ensemble model," in IJCNN, 2019.
[Wilson19A] W. Silva, M. J. Cardoso, and J. S. Cardoso, "Image captioning as a proxy for explainable decisions," in Understanding and Interpreting Machine Learning in MICA, 2019 (submitted).