101 research outputs found
Log-Likelihood Score Level Fusion for Improved Cross-Sensor Smartphone Periocular Recognition
The proliferation of cameras and personal devices results in a wide
variability of imaging conditions, producing large intra-class variations and a
significant performance drop when images from heterogeneous environments are
compared. However, many applications require to deal with data from different
sources regularly, thus needing to overcome these interoperability problems.
Here, we employ fusion of several comparators to improve periocular performance
when images from different smartphones are compared. We use a probabilistic
fusion framework based on linear logistic regression, in which fused scores
tend to be log-likelihood ratios, obtaining a reduction in cross-sensor EER of
up to 40% due to the fusion. Our framework also provides an elegant and simple
solution to handle signals from different devices, since same-sensor and
cross-sensor score distributions are aligned and mapped to a common
probabilistic domain. This allows the use of Bayes thresholds for optimal
decision-making, eliminating the need of sensor-specific thresholds, which is
essential in operational conditions because the threshold setting critically
determines the accuracy of the authentication process in many applications.Comment: Published at Proc. 25th European Signal Processing Conference,
EUSIPCO 2017. arXiv admin note: text overlap with arXiv:1902.0812
UFPR-Periocular: A Periocular Dataset Collected by Mobile Devices in Unconstrained Scenarios
Recently, ocular biometrics in unconstrained environments using images
obtained at visible wavelength have gained the researchers' attention,
especially with images captured by mobile devices. Periocular recognition has
been demonstrated to be an alternative when the iris trait is not available due
to occlusions or low image resolution. However, the periocular trait does not
have the high uniqueness presented in the iris trait. Thus, the use of datasets
containing many subjects is essential to assess biometric systems' capacity to
extract discriminating information from the periocular region. Also, to address
the within-class variability caused by lighting and attributes in the
periocular region, it is of paramount importance to use datasets with images of
the same subject captured in distinct sessions. As the datasets available in
the literature do not present all these factors, in this work, we present a new
periocular dataset containing samples from 1,122 subjects, acquired in 3
sessions by 196 different mobile devices. The images were captured under
unconstrained environments with just a single instruction to the participants:
to place their eyes on a region of interest. We also performed an extensive
benchmark with several Convolutional Neural Network (CNN) architectures and
models that have been employed in state-of-the-art approaches based on
Multi-class Classification, Multitask Learning, Pairwise Filters Network, and
Siamese Network. The results achieved in the closed- and open-world protocol,
considering the identification and verification tasks, show that this area
still needs research and development
Learning Multimodal Structures in Computer Vision
A phenomenon or event can be received from various kinds of detectors or under different conditions. Each such acquisition framework is a modality of the phenomenon. Due to the relation between the modalities of multimodal phenomena, a single modality cannot fully describe the event of interest. Since several modalities report on the same event introduces new challenges comparing to the case of exploiting each modality separately.
We are interested in designing new algorithmic tools to apply sensor fusion techniques in the particular signal representation of sparse coding which is a favorite methodology in signal processing, machine learning and statistics to represent data. This coding scheme is based on a machine learning technique and has been demonstrated to be capable of representing many modalities like natural images. We will consider situations where we are not only interested in support of the model to be sparse, but also to reflect a-priorily known knowledge about the application in hand.
Our goal is to extract a discriminative representation of the multimodal data that leads to easily finding its essential characteristics in the subsequent analysis step, e.g., regression and classification. To be more precise, sparse coding is about representing signals as linear combinations of a small number of bases from a dictionary. The idea is to learn a dictionary that encodes intrinsic properties of the multimodal data in a decomposition coefficient vector that is favorable towards the maximal discriminatory power.
We carefully design a multimodal representation framework to learn discriminative feature representations by fully exploiting, the modality-shared which is the information shared by various modalities, and modality-specific which is the information content of each modality individually. Plus, it automatically learns the weights for various feature components in a data-driven scheme. In other words, the physical interpretation of our learning framework is to fully exploit the correlated characteristics of the available modalities, while at the same time leverage the modality-specific character of each modality and change their corresponding weights for different parts of the feature in recognition
Handbook of Vascular Biometrics
This open access handbook provides the first comprehensive overview of biometrics exploiting the shape of human blood vessels for biometric recognition, i.e. vascular biometrics, including finger vein recognition, hand/palm vein recognition, retina recognition, and sclera recognition. After an introductory chapter summarizing the state of the art in and availability of commercial systems and open datasets/open source software, individual chapters focus on specific aspects of one of the biometric modalities, including questions of usability, security, and privacy. The book features contributions from both academia and major industrial manufacturers
Multimodal Biometric Systems for Personal Identification and Authentication using Machine and Deep Learning Classifiers
Multimodal biometrics, using machine and deep learning, has recently gained interest over single biometric modalities. This interest stems from the fact that this technique improves recognition and, thus, provides more security. In fact, by combining the abilities of single biometrics, the fusion of two or more biometric modalities creates a robust recognition system that is resistant to the flaws of individual modalities. However, the excellent recognition of multimodal systems depends on multiple factors, such as the fusion scheme, fusion technique, feature extraction techniques, and classification method.
In machine learning, existing works generally use different algorithms for feature extraction of modalities, which makes the system more complex. On the other hand, deep learning, with its ability to extract features automatically, has made recognition more efficient and accurate. Studies deploying deep learning algorithms in multimodal biometric systems tried to find a good compromise between the false acceptance and the false rejection rates (FAR and FRR) to choose the threshold in the matching step. This manual choice is not optimal and depends on the expertise of the solution designer, hence the need to automatize this step. From this perspective, the second part of this thesis details an end-to-end CNN algorithm with an automatic matching mechanism.
This thesis has conducted two studies on face and iris multimodal biometric recognition. The first study proposes a new feature extraction technique for biometric systems based on machine learning. The iris and facial features extraction is performed using the Discrete Wavelet Transform (DWT) combined
with the Singular Value Decomposition (SVD). Merging the relevant characteristics of the two modalities is used to create a pattern for an individual in the dataset. The experimental results show the robustness of our proposed technique and the efficiency when using the same feature extraction technique for both modalities. The proposed method outperformed the state-of-the-art and gave an accuracy of 98.90%.
The second study proposes a deep learning approach using DensNet121 and FaceNet for iris and faces multimodal recognition using feature-level fusion and a new automatic matching technique. The proposed automatic matching approach does not use the threshold to ensure a better compromise between performance and FAR and FRR errors. However, it uses a trained multilayer perceptron (MLP) model that allows people’s automatic classification into two classes: recognized and unrecognized. This platform ensures an accurate and fully automatic process of multimodal recognition. The results obtained by the DenseNet121-FaceNet model by adopting feature-level fusion and automatic matching are very satisfactory. The proposed deep learning models give 99.78% of accuracy, and 99.56% of precision, with 0.22% of FRR and without FAR errors.
The proposed and developed platform solutions in this thesis were tested and vali- dated in two different case studies, the central pharmacy of Al-Asria Eye Clinic in Dubai and the Abu Dhabi Police General Headquarters (Police GHQ). The solution allows fast identification of the persons authorized to access the different rooms. It thus protects the pharmacy against any medication abuse and the red zone in the military zone against the unauthorized use of weapons
- …