67 research outputs found
Periocular in the Wild Embedding Learning with Cross-Modal Consistent Knowledge Distillation
The periocular region, the area surrounding the eye, is a complementary
alternative to the face, especially when the face is occluded or masked. In
practice, the periocular region alone captures the least salient facial
features and therefore suffers from poor intra-class compactness and inter-class
dispersion, particularly in the wild. To address these problems, we transfer
useful information from face to support periocular modality by means of
knowledge distillation (KD) for embedding learning. However, applying typical
KD techniques to heterogeneous modalities directly is suboptimal. In this paper,
we put forward a deep face-to-periocular distillation network, termed
cross-modal consistent knowledge distillation (CM-CKD). The
three key ingredients of CM-CKD are (1) shared-weight networks, (2) consistent
batch normalization, and (3) a bidirectional consistency distillation for face
and periocular through an effectual CKD loss. To be more specific, we leverage
face modality for periocular embedding learning, but only periocular images are
targeted for identification or verification tasks. Extensive experiments on six
constrained and unconstrained periocular datasets show that the
CM-CKD-learned periocular embeddings improve identification and verification
performance by 50% in terms of relative performance gain computed with respect
to face and periocular baselines. The experiments also reveal that the
CM-CKD-learned periocular features enjoy better subject-wise cluster
separation, thereby improving overall accuracy.
Comment: 30 pages
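To make the bidirectional consistency idea in the abstract above concrete, here is a minimal sketch. It is not the authors' CKD loss: it assumes, hypothetically, that the face and periocular branches each emit class logits and that consistency is measured with a temperature-softened symmetric KL divergence. The function names and the temperature value are illustrative only.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature scaling (numerically stabilised)."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over the batch; eps guards against log(0)."""
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1))


def bidirectional_consistency_loss(face_logits, peri_logits, temperature=4.0):
    """Symmetric KL between softened face and periocular predictions.

    Assumption: both branches classify the same set of identities, so the
    two logit matrices have the same shape (batch, n_classes).
    """
    p_face = softmax(face_logits, temperature)
    p_peri = softmax(peri_logits, temperature)
    # Face -> periocular and periocular -> face terms, hence "bidirectional".
    return kl_divergence(p_face, p_peri) + kl_divergence(p_peri, p_face)
```

In this sketch the loss is zero when both branches agree and grows as their softened predictions diverge, so minimising it pulls the two modalities towards consistent outputs.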
A Reminiscence of “Mastermind”: Iris/Periocular Biometrics by “In-Set” CNN Iterative Analysis
Convolutional neural networks (CNNs) have
emerged as the most popular classification models in biometrics
research. Under the discriminative paradigm of pattern
recognition, CNNs are used typically in one of two ways: 1)
verification mode (“are samples from the same person?”), where
pairs of images are provided to the network to distinguish
between genuine and impostor instances; and 2) identification
mode (“whom is this sample from?”), where appropriate feature
representations that map images to identities are found. This
paper postulates a novel mode for using CNNs in biometric
identification, by learning models that answer the question “is
the query’s identity among this set?”. The insight is reminiscent
of the classical Mastermind game: by iteratively analysing the
network responses when multiple random samples of k gallery
elements are compared to the query, we obtain weakly correlated
matching scores that - altogether - provide solid cues to infer
the most likely identity. In this setting, identification is regarded
as a variable selection and regularization problem, with sparse
linear regression techniques being used to infer the matching
probability with respect to each gallery identity. As its main strength,
this strategy is highly robust to outlier matching scores, which
are known to be a primary error source in biometric recognition.
Our experiments were carried out on the full versions of two
well-known datasets, one of near-infrared iris images
(CASIA-IrisV4-Thousand) and one of visible-wavelength periocular
images (UBIRIS.v2), and confirm that the proposed algorithm
solidly improves recognition performance compared to the
traditional working modes of CNNs in biometrics.
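The "in-set" identification idea in the abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the CNN that answers "is the query's identity among this set?" is replaced by a simulated noisy oracle, and the sparse linear regression is a plain Lasso solved with ISTA (iterative soft-thresholding). All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_in_set_responses(n_gallery, true_id, k, n_trials, noise, rng):
    """Draw random k-element gallery subsets and simulate noisy in-set scores.

    X[t, j] = 1 if gallery identity j was part of subset t, else 0.
    y[t] stands in for the network response (about 1 if the query identity
    is in the subset, about 0 otherwise, plus Gaussian noise).
    """
    X = np.zeros((n_trials, n_gallery))
    for t in range(n_trials):
        subset = rng.choice(n_gallery, size=k, replace=False)
        X[t, subset] = 1.0
    y = X[:, true_id] + noise * rng.standard_normal(n_trials)
    return X, y


def lasso_ista(X, y, lam=0.05, n_iter=500):
    """Minimise (1/2n)||Xw - y||^2 + lam * ||w||_1 by ISTA."""
    n, d = X.shape
    w = np.zeros(d)
    # Step size 1/L, where L is the Lipschitz constant of the smooth term.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft-thresholding drives most coefficients exactly to zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = simulate_in_set_responses(50, true_id=17, k=10,
                                     n_trials=400, noise=0.2, rng=rng)
    weights = lasso_ista(X, y)
    print("most likely identity:", int(np.argmax(weights)))
```

The largest regression coefficient marks the gallery identity that best explains the in-set responses. Because each score contributes to only one row of the regression, a few outlier responses have limited influence on the result.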
UFPR-Periocular: A Periocular Dataset Collected by Mobile Devices in Unconstrained Scenarios
Recently, ocular biometrics in unconstrained environments using images
obtained at visible wavelength has gained researchers' attention,
especially with images captured by mobile devices. Periocular recognition has
been demonstrated to be an alternative when the iris trait is not available due
to occlusions or low image resolution. However, the periocular trait is not
as unique as the iris trait. Thus, the use of datasets
containing many subjects is essential to assess biometric systems' capacity to
extract discriminating information from the periocular region. Also, to address
the within-class variability caused by lighting and attributes in the
periocular region, it is of paramount importance to use datasets with images of
the same subject captured in distinct sessions. As the datasets available in
the literature do not present all these factors, in this work, we present a new
periocular dataset containing samples from 1,122 subjects, acquired in 3
sessions by 196 different mobile devices. The images were captured under
unconstrained environments with just a single instruction to the participants:
to place their eyes on a region of interest. We also performed an extensive
benchmark with several Convolutional Neural Network (CNN) architectures and
models that have been employed in state-of-the-art approaches based on
Multi-class Classification, Multitask Learning, Pairwise Filters Network, and
Siamese Network. The results achieved in the closed- and open-world protocols,
considering the identification and verification tasks, show that this area
still needs research and development.
Deep Adversarial Frameworks for Visually Explainable Periocular Recognition
Machine Learning (ML) models have pushed state-of-the-art performance closer to (and
even beyond) human level. However, the core of such algorithms is usually latent and
hardly understandable. Thus, the field of Explainability focuses on researching and
adopting techniques that can explain the reasons that support a model’s predictions.
Such explanations of the decision-making process would help to build trust between said
model and the human(s) using it. An explainable system also allows for better debugging,
during the training phase, and fixing, upon deployment. But why should a developer devote
time and effort into refactoring or rethinking Artificial Intelligence (AI) systems, to
make them more transparent? Don’t they work just fine?
Despite the temptation to answer “yes”, are we really considering the cases where these
systems fail? Are we assuming that “almost perfect” accuracy is good enough? What if
some of the cases where these systems get it right were just a small margin away from
a complete miss? Does that even matter? Considering the ever-growing presence of ML
models in crucial areas like forensics, security and healthcare services, it clearly does.
Motivating these concerns is the fact that powerful systems often operate as black boxes,
hiding the core reasoning underneath layers of abstraction [Gue]. In this scenario, there
could be some seriously negative outcomes if opaque algorithms gamble on the presence
of tumours in X-ray images or the way autonomous vehicles behave in traffic.
It becomes clear, then, that incorporating explainability with AI is imperative. More
recently, politicians have addressed this urgency through the General Data Protection
Regulation (GDPR) [Com18]. With this document, the European Union (EU) brings forward
several important concepts, amongst which is the “right to an explanation”. The definition
and scope are still subject to debate [MF17], but these are definite strides to formally
regulate the explainable depth of autonomous systems.
Based on the preface above, this work describes a periocular recognition framework that
not only performs biometric recognition but also provides clear representations of the
features/regions that support a prediction. Being particularly designed to explain non-match
(“impostor”) decisions, our solution uses adversarial generative techniques to synthesise
a large set of “genuine” image pairs, from which the elements most similar to
a query are retrieved. Then, assuming alignment between the query/retrieved pairs,
the element-wise differences between the query and a weighted average of the retrieved
elements yield a visual explanation of the regions in the query pair that would have to
be different to transform it into a “genuine” pair. Our quantitative and qualitative
experiments validate the proposed solution, yielding recognition rates that are similar to
the state-of-the-art, while adding visually pleasing explanations.
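The retrieval-and-difference step described in the abstract above can be illustrated with a short sketch. It makes several assumptions that the abstract does not confirm: each image pair is represented as a single aligned array, similarity is plain Euclidean distance, and the averaging weights are inverse distances. The adversarial generator that synthesises the "genuine" pairs is not modelled. The function name `explain_non_match` is hypothetical.

```python
import numpy as np


def explain_non_match(query, genuine_set, k=5, eps=1e-8):
    """Return a per-element map of where the query differs from its
    nearest "genuine" neighbours.

    query:       array of shape (H, W), an aligned query pair (assumed
                 to be encoded as one array).
    genuine_set: array of shape (N, H, W), synthetic "genuine" pairs.
    """
    flat_q = query.reshape(-1)
    flat_g = genuine_set.reshape(len(genuine_set), -1)

    # Retrieve the k synthetic pairs closest to the query.
    dists = np.linalg.norm(flat_g - flat_q, axis=1)
    nearest = np.argsort(dists)[:k]

    # Inverse-distance weights (an assumption), normalised to sum to one.
    weights = 1.0 / (dists[nearest] + eps)
    weights /= weights.sum()

    # Weighted average of the retrieved pairs, shape (H, W).
    reference = np.tensordot(weights, genuine_set[nearest], axes=1)

    # Large values mark regions that would have to change for the query
    # to resemble a "genuine" pair.
    return np.abs(query - reference)
```

The returned map can be rendered as a heat map over the query, which is one simple way to visualise the regions supporting a non-match decision.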
Advanced Biometrics with Deep Learning
Biometrics, such as fingerprint, iris, face, hand print, hand vein, speech, and gait recognition, have become commonplace as a means of identity management in various applications. Biometric systems follow a typical pipeline composed of separate preprocessing, feature extraction, and classification stages. Deep learning, as a data-driven representation learning approach, has been shown to be a promising alternative to conventional data-agnostic and handcrafted preprocessing and feature extraction for biometric systems. Furthermore, deep learning offers an end-to-end learning paradigm that unifies preprocessing, feature extraction, and recognition, based solely on biometric data. This Special Issue has collected 12 high-quality, state-of-the-art research papers that deal with challenging issues in advanced biometric systems based on deep learning. The 12 papers can be divided into 4 categories according to biometric modality, namely face biometrics, medical electronic signals (EEG and ECG), voice print, and others.
High-Fidelity Eye Animatable Neural Radiance Fields for Human Face
Face rendering using neural radiance fields (NeRF) is a rapidly developing
research area in computer vision. While recent methods primarily focus on
controlling facial attributes such as identity and expression, they often
overlook the crucial aspect of modeling eyeball rotation, which holds
importance for various downstream tasks. In this paper, we aim to learn a face
NeRF model that is sensitive to eye movements from multi-view images. We
address two key challenges in eye-aware face NeRF learning: how to effectively
capture eyeball rotation for training and how to construct a manifold for
representing eyeball rotation. To accomplish this, we first fit FLAME, a
well-established parametric face model, to the multi-view images considering
multi-view consistency. Subsequently, we introduce a new Dynamic Eye-aware NeRF
(DeNeRF). DeNeRF transforms 3D points from different views into a canonical
space to learn a unified face NeRF model. We design an eye deformation field
for the transformation, including rigid transformation, e.g., eyeball rotation,
and non-rigid transformation. Through experiments conducted on the ETH-XGaze
dataset, we demonstrate that our model is capable of generating high-fidelity
images with accurate eyeball rotation and non-rigid periocular deformation,
even under novel viewing angles. Furthermore, we show that utilizing the
rendered images can effectively enhance gaze estimation performance.
Comment: Under review