Semi-Supervised Learning for Face Anti-Spoofing Using Apex Frame
Conventional feature extraction techniques in the face anti-spoofing domain
either analyze the entire video sequence or focus on a specific segment to
improve model performance. However, identifying the optimal frames that provide
the most valuable input for face anti-spoofing remains a challenging task.
In this paper, we address this challenge by employing Gaussian weighting to
create apex frames for videos. Specifically, an apex frame is derived from a
video by computing a weighted sum of its frames, where the weights are
determined using a Gaussian distribution centered around the video's central
frame. Furthermore, we explore various temporal lengths to produce multiple
unlabeled apex frames using a Gaussian function, without the need for
convolution. By doing so, we leverage the benefits of semi-supervised learning,
which considers both labeled and unlabeled apex frames to effectively
discriminate between live and spoof classes. Our key contribution emphasizes
the apex frame's capacity to represent the most significant moments in the
video, while unlabeled apex frames facilitate efficient semi-supervised
learning, as they enable the model to learn from videos of varying temporal
lengths. Experimental results using four face anti-spoofing databases: CASIA,
REPLAY-ATTACK, OULU-NPU, and MSU-MFSD demonstrate the apex frame's efficacy in
advancing face anti-spoofing techniques.
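The apex-frame construction described above, a Gaussian-weighted sum of frames centred on the middle of the video, can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation; the function name `apex_frame` and the choice of `sigma` are assumptions.

```python
import numpy as np

def apex_frame(frames, sigma=None):
    """Compute an apex frame as a Gaussian-weighted sum of video frames.

    frames: array of shape (T, H, W, C). Weights follow a Gaussian
    centred on the video's central frame; the default spread is an
    illustrative assumption, not the paper's value.
    """
    n = len(frames)
    t = np.arange(n, dtype=np.float64)
    center = (n - 1) / 2.0
    if sigma is None:
        sigma = n / 6.0  # assumed spread over the temporal axis
    w = np.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))
    w /= w.sum()  # normalise so the apex frame stays in the input range
    # Contract the temporal axis: weighted sum over the T frames
    return np.tensordot(w, frames.astype(np.float64), axes=1)

video = np.random.rand(30, 64, 64, 3)  # a dummy 30-frame clip
apex = apex_frame(video)
print(apex.shape)  # (64, 64, 3)
```

Varying the temporal length (the slice of frames passed in) yields the multiple unlabeled apex frames the paper feeds to the semi-supervised learner.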
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on the handcrafted features, which are outdated and uninspiring
for the progress of FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary label (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB camera, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Domain Generalization in Vision: A Survey
Generalization to out-of-distribution (OOD) data is a capability natural to
humans yet challenging for machines to reproduce. This is because most learning
algorithms strongly rely on the i.i.d.~assumption on source/target data, which
is often violated in practice due to domain shift. Domain generalization (DG)
aims to achieve OOD generalization by using only source data for model
learning. Since it was first introduced in 2011, research in DG has made great
progress. In particular, intensive research in this topic has led to a broad
spectrum of methodologies, e.g., those based on domain alignment,
meta-learning, data augmentation, or ensemble learning, just to name a few; and
has covered various vision applications such as object recognition,
segmentation, action recognition, and person re-identification. In this paper,
for the first time a comprehensive literature review is provided to summarize
the developments in DG for computer vision over the past decade. Specifically,
we first cover the background by formally defining DG and relating it to other
research fields like domain adaptation and transfer learning. Second, we
conduct a thorough review into existing methods and present a categorization
based on their methodologies and motivations. Finally, we conclude this survey
with insights and discussions on future research directions.
Comment: v4: includes the word "vision" in the title; improves the
organization and clarity in Sections 2-3; adds future directions; and more
The DKU-OPPO System for the 2022 Spoofing-Aware Speaker Verification Challenge
This paper describes our DKU-OPPO system for the 2022 Spoofing-Aware Speaker
Verification (SASV) Challenge. First, we split the joint task into speaker
verification (SV) and spoofing countermeasure (CM), and optimize these two
tasks separately. For SV systems, four state-of-the-art methods are
employed. For CM systems, we propose two methods on top of the challenge
baseline to further improve the performance, namely Embedding Random Sampling
Augmentation (ERSA) and One-Class Confusion Loss (OCCL). Second, we also explore
whether SV embedding could help improve CM system performance. We observe a
dramatic performance degradation of existing CM systems on the
domain-mismatched Voxceleb2 dataset. Third, we compare different fusion
strategies, including parallel score fusion and sequential cascaded systems.
Compared to the 1.71% SASV-EER baseline, our submitted cascaded system obtains
a 0.21% SASV-EER on the challenge official evaluation set.
Comment: Accepted by Interspeech 2022
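The two fusion strategies the abstract compares can be sketched as below. This is an illustrative contrast between the approaches, not the DKU-OPPO system itself; the function names, the fusion weight `alpha`, and the CM threshold are all assumed values.

```python
def parallel_fusion(sv_score: float, cm_score: float, alpha: float = 0.5) -> float:
    """Parallel score fusion: a weighted sum of the SV and CM scores.

    Assumes both scores are already calibrated to a comparable range;
    alpha is an illustrative weight, not the system's tuned value.
    """
    return alpha * sv_score + (1.0 - alpha) * cm_score

def cascaded_decision(sv_score: float, cm_score: float,
                      cm_threshold: float = 0.5) -> float:
    """Sequential cascade: the CM gates the trial first.

    If the CM score falls below the (assumed) threshold, the trial is
    rejected outright as a spoof; otherwise the SV score decides.
    """
    if cm_score < cm_threshold:
        return float("-inf")  # hard reject: treated as a presentation attack
    return sv_score
```

The cascade lets a strong countermeasure filter spoofs before verification, which matches the abstract's finding that the cascaded submission performed best.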
Explainable and Interpretable Face Presentation Attack Detection Methods
Decision support systems based on machine learning (ML) techniques are excelling in most artificial intelligence (AI) fields, outperforming other AI methods as well as humans. However, challenges still exist that do not favour the dominance of AI in some applications. This proposal focuses on a critical one: the lack of transparency and explainability, which reduces the trust in and accountability of an AI system. The fact that most AI methods still operate as complex black boxes makes the inner processes that sustain their predictions unattainable. The awareness around these observations fosters the need to regulate many sensitive domains where AI has been applied, in order to interpret, explain and audit the reliability of ML based systems.
Although modern-day biometric recognition (BR) systems are already benefiting from the performance gains achieved with AI (which can account for and learn subtle changes in the person to be authenticated or statistical mismatches between samples), they are still in the dark ages of black box models, without reaping the benefits of the XAI field. This work will focus on studying AI explainability in the field of biometrics, focusing on particular use cases in BR, such as verification/identification of individuals and liveness detection (LD) (aka anti-spoofing).
The main goals of this work are: i) to become acquainted with the state of the art in explainability, biometric recognition and PAD methods; ii) to develop experimental work xxxxx
Tasks 1st semester
(1) Study of the state of the art: bibliography review on presentation attack detection
(2) Get acquainted with the previous work of the group in the topic
(3) Data preparation and data pre-processing
(4) Define the experimental protocol, including performance metrics
(5) Perform baseline experiments
(6) Write monograph
Tasks 2nd semester
(1) Update on the state of the art
(2) Data preparation and data pre-processing
(3) Propose and implement a methodology for interpretability in biometrics
(4) Evaluation of the performance and comparison with baseline and state of the art approaches
(5) Dissertation writing
Main bibliographic references: (*)
[Doshi17] B. Kim and F. Doshi-Velez, "Interpretable machine learning: The fuss, the concrete and the questions," 2017
[Mol19] C. Molnar, Interpretable Machine Learning, 2019
[Sei18] C. Seibold, W. Samek, A. Hilsmann, and P. Eisert, "Accurate and robust neural networks for security related applications exampled by face morphing attacks," arXiv preprint arXiv:1806.04265, 2018
[Seq20] A. F. Sequeira, J. T. Pinto, W. Silva, T. Gonçalves, and J. S. Cardoso, "Interpretable biometrics: Should we rethink how presentation attack detection is evaluated?," in 8th IWBF, 2020
[Wilson18] W. Silva, K. Fernandes, M. J. Cardoso, and J. S. Cardoso, "Towards complementary explanations using deep neural networks," in Understanding and Interpreting Machine Learning in MICA, Springer, 2018
[Wilson19] W. Silva, K. Fernandes, and J. S. Cardoso, "How to produce complementary explanations using an ensemble model," in IJCNN, 2019
[Wilson19A] W. Silva, M. J. Cardoso, and J. S. Cardoso, "Image captioning as a proxy for explainable decisions," in Understanding and Interpreting Machine Learning in MICA, 2019 (submitted)