Assessing Bias in Face Image Quality Assessment
Face image quality assessment (FIQA) attempts to improve face recognition
(FR) performance by providing additional information about sample quality.
Because FIQA methods attempt to estimate the utility of a sample for face
recognition, it is reasonable to assume that these methods are heavily
influenced by the underlying face recognition system. Although modern face
recognition systems are known to perform well, several studies have found that
such systems often exhibit problems with demographic bias. It is therefore
likely that such problems are also present with FIQA techniques. To investigate
the demographic biases associated with FIQA approaches, this paper presents a
comprehensive study involving a variety of quality assessment methods
(general-purpose image quality assessment, supervised face quality assessment,
and unsupervised face quality assessment methods) and three diverse
state-of-the-art FR models. Our analysis on the Balanced Faces in the Wild (BFW)
dataset shows that all techniques considered are affected more by variations in
race than sex. While the general-purpose image quality assessment methods
appear to be less biased with respect to the two demographic factors
considered, the supervised and unsupervised face image quality assessment
methods both show strong bias with a tendency to favor white individuals (of
either sex). In addition, we found that methods that are less racially biased
perform worse overall. This suggests that the observed bias in FIQA methods is
to a significant extent related to the underlying face recognition system.
Comment: The content of this paper was published in EUSIPCO 202
Face Morphing Attack Detection with Denoising Diffusion Probabilistic Models
Morphed face images have recently become a growing concern for existing face
verification systems, as they are relatively easy to generate and can be used
to impersonate someone's identity for various malicious purposes. Efficient
Morphing Attack Detection (MAD) that generalizes well across different morphing
techniques is, therefore, of paramount importance. Existing MAD techniques
predominantly rely on discriminative models that learn from examples of bona
fide and morphed images and, as a result, often exhibit sub-optimal
generalization performance when confronted with unknown types of morphing
attacks. To address this problem, we propose a novel, diffusion-based MAD
method in this paper that learns only from the characteristics of bona fide
images. Various forms of morphing attacks are then detected by our model as
out-of-distribution samples. We perform rigorous experiments over four
different datasets (CASIA-WebFace, FRLL-Morphs, FERET-Morphs and FRGC-Morphs)
and compare the proposed solution to both discriminatively-trained and
one-class MAD models. The experimental results show that our MAD model
achieves highly competitive results on all considered datasets.
Comment: Published at IWBF 202
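The one-class idea in the abstract above (learn only from bona fide images, then flag morphs as out-of-distribution samples) can be illustrated with a minimal sketch. The abstract does not say how the anomaly score is computed or thresholded, so everything here is an assumption: the scores stand in for something like a reconstruction error, and the helper names `fit_threshold` and `is_attack` and the mean-plus-k-standard-deviations rule are hypothetical.

```python
import statistics


def fit_threshold(bona_fide_scores, k=3.0):
    """Derive a decision threshold from bona fide scores only.

    Assumption: a higher score means "less like bona fide data", and the
    threshold is set k sample standard deviations above the bona fide mean.
    """
    mean = statistics.mean(bona_fide_scores)
    std = statistics.stdev(bona_fide_scores)
    return mean + k * std


def is_attack(score, threshold):
    """Flag a sample as out-of-distribution when its score exceeds the threshold."""
    return score > threshold


# Hypothetical anomaly scores measured on bona fide images only.
bona_fide = [0.10, 0.12, 0.09, 0.11, 0.13, 0.08, 0.10, 0.14, 0.12, 0.11]
threshold = fit_threshold(bona_fide)

print(is_attack(0.40, threshold))  # score far above the bona fide range
print(is_attack(0.11, threshold))  # score typical of bona fide data
```

No morphed examples are needed to set the threshold, which is what lets a detector of this kind remain agnostic to the morphing technique.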
Influence of segmentation on deep iris recognition performance
Despite the rise of deep learning in numerous areas of computer vision and
image processing, iris recognition has not benefited considerably from these
trends so far. Most of the existing research on deep iris recognition is
focused on new models for generating discriminative and robust iris
representations and relies on methodologies akin to traditional iris
recognition pipelines. Hence, the proposed models do not approach iris
recognition in an end-to-end manner, but rather use standard heuristic iris
segmentation (and unwrapping) techniques to produce normalized inputs for the
deep learning models. However, because deep learning is able to model very
complex data distributions and nonlinear data changes, an obvious question
arises. How important is the use of traditional segmentation methods in a deep
learning setting? To answer this question, we present in this paper an
empirical analysis of the impact of iris segmentation on the performance of
deep learning models using a simple two stage pipeline consisting of a
segmentation and a recognition step. We evaluate how the accuracy of
segmentation influences recognition performance but also examine if
segmentation is needed at all. We use the CASIA Thousand and SBVPI datasets for
the experiments and report several interesting findings.
Comment: 6 pages, 3 figures, 3 tables, submitted to IWBF 201
On the Vulnerability of DeepFake Detectors to Attacks Generated by Denoising Diffusion Models
The detection of malicious Deepfakes is a constantly evolving problem that
requires continuous monitoring of detectors to ensure they are able to detect
image manipulations generated by the latest emerging models. In this paper, we
present a preliminary study that investigates the vulnerability of single-image
Deepfake detectors to attacks created by a representative of the newest
generation of generative methods, i.e. Denoising Diffusion Models (DDMs). Our
experiments are run on FaceForensics++, a commonly used benchmark dataset,
consisting of Deepfakes generated with various techniques for face swapping and
face reenactment. The analysis shows that reconstructing existing Deepfakes
with only one denoising diffusion step significantly decreases the accuracy of
all tested detectors, without introducing visually perceptible image changes.
Comment: Submitted for revie
High Resolution Face Editing with Masked GAN Latent Code Optimization
Face editing represents a popular research topic within the computer vision
and image processing communities. While significant progress has been made
recently in this area, existing solutions: (i) are still largely focused on
low-resolution images, (ii) often generate editing results with visual
artefacts, or (iii) lack fine-grained control and alter multiple (entangled)
attributes at once, when trying to generate the desired facial semantics. In
this paper, we aim to address these issues through a novel attribute editing
approach called MaskFaceGAN. The proposed approach is based on an optimization
procedure that directly optimizes the latent code of a pre-trained
(state-of-the-art) Generative Adversarial Network (i.e., StyleGAN2) with
respect to several constraints that ensure: (i) preservation of relevant image
content, (ii) generation of the targeted facial attributes, and (iii)
spatially selective treatment of local image areas. The constraints are
enforced with the help of a (differentiable) attribute classifier and face
parser that provide the necessary reference information for the optimization
procedure. MaskFaceGAN is evaluated in extensive experiments on the CelebA-HQ,
Helen and SiblingsDB-HQf datasets and in comparison with several
state-of-the-art techniques from the literature, i.e., StarGAN, AttGAN, STGAN,
and two versions of InterFaceGAN. Our experimental results show that the
proposed approach is able to edit face images with respect to several facial
attributes with unprecedented image quality and at high resolutions
(1024x1024), while exhibiting considerably fewer problems with attribute
entanglement than competing solutions. The source code is made freely available
from: https://github.com/MartinPernus/MaskFaceGAN.
Comment: The updated paper will be submitted to IEEE Transactions on Image
Processing. Added more qualitative and quantitative results to the main part
of the paper. This version now also includes the supplementary materia
Optimization-Based Improvement of Face Image Quality Assessment Techniques
Contemporary face recognition (FR) models achieve near-ideal recognition
performance in constrained settings, yet do not fully translate the performance
to unconstrained (real-world) scenarios. To help improve the performance and
stability of FR systems in such unconstrained settings, face image quality
assessment (FIQA) techniques try to infer sample-quality information from the
input face images that can aid with the recognition process. While existing
FIQA techniques are able to efficiently capture the differences between high
and low quality images, they typically cannot fully distinguish between images
of similar quality, leading to lower performance in many scenarios. To address
this issue, we present in this paper a supervised quality-label optimization
approach, aimed at improving the performance of existing FIQA techniques. The
developed optimization procedure infuses additional information (computed with
a selected FR model) into the initial quality scores generated with a given
FIQA technique to produce better estimates of the "actual" image quality. We
evaluate the proposed approach in comprehensive experiments with six
state-of-the-art FIQA approaches (CR-FIQA, FaceQAN, SER-FIQ, PCNet, MagFace,
SDD-FIQA) on five commonly used benchmarks (LFW, CFPFP, CPLFW, CALFW, XQLFW)
using three targeted FR models (ArcFace, ElasticFace, CurricularFace) with
highly encouraging results.
Comment: In proceedings of the International Workshop on Biometrics and
Forensics (IWBF) 202
BiOcularGAN: Bimodal Synthesis and Annotation of Ocular Images
Current state-of-the-art segmentation techniques for ocular images are
critically dependent on large-scale annotated datasets, which are
labor-intensive to gather and often raise privacy concerns. In this paper, we
present a novel framework, called BiOcularGAN, capable of generating synthetic
large-scale datasets of photorealistic (visible light and near-infrared) ocular
images, together with corresponding segmentation labels to address these
issues. At its core, the framework relies on a novel Dual-Branch StyleGAN2
(DB-StyleGAN2) model that facilitates bimodal image generation, and a Semantic
Mask Generator (SMG) component that produces semantic annotations by exploiting
latent features of the DB-StyleGAN2 model. We evaluate BiOcularGAN through
extensive experiments across five diverse ocular datasets and analyze the
effects of bimodal data generation on image quality and the produced
annotations. Our experimental results show that BiOcularGAN is able to produce
high-quality matching bimodal images and annotations (with minimal manual
intervention) that can be used to train highly competitive (deep) segmentation
models (in a privacy-aware manner) that perform well across multiple real-world
datasets. The source code for the BiOcularGAN framework is publicly available
at https://github.com/dariant/BiOcularGAN.
Comment: 13 pages, 14 figure
Beyond Detection: Visual Realism Assessment of Deepfakes
In the era of rapid digitalization and artificial intelligence advancements,
the development of DeepFake technology has posed significant security and
privacy concerns. This paper presents an effective measure to assess the visual
realism of DeepFake videos. We utilize an ensemble of two Convolutional Neural
Network (CNN) models: Eva and ConvNext. These models have been trained on the
DeepFake Game Competition (DFGC) 2022 dataset and aim to predict Mean Opinion
Scores (MOS) from DeepFake videos based on features extracted from sequences of
frames. Our method secured third place in the recent DFGC on Visual Realism
Assessment held in conjunction with the 2023 International Joint Conference on
Biometrics (IJCB 2023). We provide an overview of the models, data
preprocessing, and training procedures. We also report the performance of our
models against the competition's baseline model and discuss the implications of
our findings.
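As a rough illustration of how two models' Mean Opinion Score predictions might be combined, the sketch below averages per-video outputs. The abstract does not state how the ensemble is formed, so the weighted average, the function name `ensemble_mos`, and the example values are all assumptions made for illustration.

```python
def ensemble_mos(pred_a, pred_b, weight_a=0.5):
    """Combine per-video MOS predictions from two models by weighted averaging.

    Assumption: both lists are aligned so that index i refers to the same video.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical MOS predictions (1-5 scale) from two models for two videos.
model_a = [2.0, 4.0]
model_b = [4.0, 4.0]
print(ensemble_mos(model_a, model_b))
```

With equal weights this reduces to a plain mean; an unequal `weight_a` could favour whichever model validates better.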
Body Segmentation Using Multi-task Learning
Body segmentation is an important step in many computer vision problems
involving human images and one of the key components that affects the
performance of all downstream tasks. Several prior works have approached this
problem using a multi-task model that exploits correlations between different
tasks to improve segmentation performance. Based on the success of such
solutions, we present in this paper a novel multi-task model for human
segmentation/parsing that involves three tasks, i.e., (i) keypoint-based
skeleton estimation, (ii) dense pose prediction, and (iii) human-body
segmentation. The main idea behind the proposed Segmentation-Pose-DensePose
model (or SPD for short) is to learn a better segmentation model by sharing
knowledge across different, yet related tasks. SPD is based on a shared deep
neural network backbone that branches off into three task-specific model heads
and is learned using a multi-task optimization objective. The performance of
the model is analysed through rigorous experiments on the LIP and ATR datasets
and in comparison to a recent (state-of-the-art) multi-task body-segmentation
model. Comprehensive ablation studies are also presented. Our experimental
results show that the proposed multi-task (segmentation) model is highly
competitive and that the introduction of additional tasks contributes towards a
higher overall segmentation performance.
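A multi-task optimization objective of the kind mentioned above is commonly written as a weighted sum of per-task losses. The sketch below shows that generic form only; the abstract does not give the actual objective used for SPD, and the task names, weights, and loss values here are hypothetical.

```python
def multi_task_loss(losses, weights):
    """Combine per-task losses into a single scalar via a weighted sum.

    Assumption: `losses` and `weights` are dicts keyed by the same task names.
    """
    if losses.keys() != weights.keys():
        raise ValueError("losses and weights must cover the same tasks")
    return sum(weights[name] * value for name, value in losses.items())


# Hypothetical per-task loss values for one training step.
losses = {"segmentation": 1.0, "pose": 2.0, "densepose": 4.0}
weights = {"segmentation": 1.0, "pose": 0.5, "densepose": 0.25}
total = multi_task_loss(losses, weights)
print(total)
```

Because all three heads share one backbone, gradients from the combined loss update the shared features, which is the mechanism by which auxiliary tasks can help segmentation.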