    Palmprint Gender Classification Using Deep Learning Methods

    Gender identification is an important technique that can improve the performance of authentication systems by reducing the search space and speeding up the matching process. Several biometric traits have been used to ascertain human gender. Among them, the human palmprint possesses several discriminating features, such as principal lines, wrinkles, ridges, and minutiae, that offer cues for gender identification. The goal of this work is to develop novel deep-learning techniques to determine gender from palmprint images. The PolyU and CASIA palmprint databases, with 90,000 and 5,502 images respectively, were used for training and testing in this research. After ROI extraction and data augmentation were performed, various convolutional and deep learning-based classification approaches were empirically designed, optimized, and tested. Gender classification accuracy reached 94.87% on the PolyU palmprint database and 90.70% on the CASIA palmprint database. Optimal performance was achieved by combining two different pre-trained and fine-tuned deep CNNs (VGGNet and DenseNet) through score-level average fusion. In addition, Gradient-weighted Class Activation Mapping (Grad-CAM) was implemented to ascertain which specific regions of the palmprint are most discriminative for gender classification.
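
The score-level average fusion mentioned in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `average_score_fusion` is hypothetical, the score arrays are made-up numbers rather than results from the paper, and each network is assumed to output one probability per class (two classes here).

```python
import numpy as np

def average_score_fusion(scores_a, scores_b):
    """Average two (n_samples, n_classes) score arrays and pick the top class."""
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    if scores_a.shape != scores_b.shape:
        raise ValueError("both models must score the same samples and classes")
    fused = (scores_a + scores_b) / 2.0   # unweighted mean of the two score sets
    labels = fused.argmax(axis=1)         # class with the highest fused score
    return fused, labels

# Made-up per-class scores standing in for the outputs of two fine-tuned CNNs.
vgg_scores = np.array([[0.60, 0.40], [0.45, 0.55], [0.30, 0.70]])
densenet_scores = np.array([[0.70, 0.30], [0.65, 0.35], [0.20, 0.80]])

fused, labels = average_score_fusion(vgg_scores, densenet_scores)
print(fused)
print(labels)
```

In the second sample the two models disagree, and the averaged score decides the label; that is the situation in which fusion can change the outcome relative to either model alone.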

    Motion Segmentation from Clustering of Sparse Point Features Using Spatially Constrained Mixture Models

    Motion is one of the strongest cues available for segmentation. While motion segmentation finds wide-ranging applications in object detection, tracking, surveillance, robotics, image and video compression, scene reconstruction, video editing, and so on, it faces various challenges such as accurate motion recovery from noisy data, varying complexity of the models required to describe the computed image motion, the dynamic nature of the scene that may include a large number of independently moving objects undergoing occlusions, and the need to make high-level decisions while dealing with long image sequences. Taking sparse point features as the central element, this thesis presents three distinct approaches that address some of the above-mentioned motion segmentation challenges. The first part deals with the detection and tracking of sparse point features in image sequences. A framework is proposed where point features can be tracked jointly. Traditionally, sparse features have been tracked independently of one another. Combining the ideas from Lucas-Kanade and Horn-Schunck, this thesis presents a technique in which the estimated motion of a feature is influenced by the motion of the neighboring features. The joint feature tracking algorithm leads to improved tracking performance over the standard Lucas-Kanade-based tracking approach, especially while tracking features in untextured regions. The second part is related to motion segmentation using sparse point feature trajectories. The approach utilizes a spatially constrained mixture model framework and a greedy EM algorithm to group point features. In contrast to previous work, the algorithm is incremental in nature and allows for an arbitrary number of objects traveling at different relative speeds to be segmented, thus eliminating the need for an explicit initialization of the number of groups. 
The primary parameter used by the algorithm is the amount of evidence that must be accumulated before the features are grouped. A statistical goodness-of-fit test monitors the change in the motion parameters of a group over time in order to automatically update the reference frame. The approach works in real time and is able to segment various challenging sequences captured from still and moving cameras that contain multiple independently moving objects and motion blur. The third part of this thesis deals with the use of specialized models for motion segmentation. Articulated human motion is chosen as a representative example that requires a complex model to be accurately described. A motion-based approach for segmentation, tracking, and pose estimation of articulated bodies is presented. The human body is represented using the trajectories of a number of sparse points. A novel motion descriptor encodes the spatial relationships of the motion vectors representing various parts of the person and can discriminate between articulated and non-articulated motions, as well as between various poses and view angles. Furthermore, a nearest neighbor search for the closest motion descriptor from the labeled training data consisting of the human gait cycle in multiple views is performed, and this distance is fed to a Hidden Markov Model defined over multiple poses and viewpoints to obtain temporally consistent pose estimates. Experimental results on various sequences of walking subjects with multiple viewpoints and scales demonstrate the effectiveness of the approach. In particular, the purely motion-based approach is able to track people in night-time sequences, even when appearance-based cues are not available. Finally, an application of image segmentation is presented in the context of iris segmentation. The iris is a widely used biometric for recognition and is known to be highly accurate if the segmentation of the iris region is near perfect. 
Non-ideal situations arise when the iris is occluded by eyelashes or eyelids, or when the overall quality of the segmented iris is degraded by illumination changes or by out-of-plane rotation of the eye. The proposed iris segmentation approach combines the appearance and the geometry of the eye to segment iris regions from non-ideal images. The image is modeled as a Markov random field, and a graph-cuts-based energy minimization algorithm is applied to label each pixel as eyelash, pupil, iris, or background using texture and image intensity information. The iris shape is modeled as an ellipse and is used to refine the pixel-based segmentation. The results indicate the effectiveness of the segmentation algorithm in handling non-ideal iris images.

    Biometric security: A novel ear recognition approach using a 3D morphable ear model

    Biometrics is a critical component of cybersecurity that identifies persons by verifying their behavioral and physical traits. In biometric-based authentication, each individual can be correctly recognized based on their intrinsic behavioral or physical features, such as face, fingerprint, iris, and ears. This work proposes a novel approach for human identification using 3D ear images. In conventional methods, the probe image is usually registered with each gallery image using computationally heavy registration algorithms, which makes recognition so time-consuming that it is practically infeasible. Therefore, this work proposes a recognition pipeline that reduces the one-to-one registration between probe and gallery. First, a deep learning-based algorithm is used for ear detection in 3D side face images. Second, a statistical ear model, known as a 3D morphable ear model (3DMEM), is constructed and used as a feature extractor for the detected ear images. Finally, a novel recognition algorithm named you morph once (YMO) is proposed for human recognition. It reduces the computational time by eliminating one-to-one registration between probe and gallery and only calculates the distance between the parameters stored in the gallery and those of the probe. The experimental results show the significance of the proposed method for real-time applications.
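
The matching step described above, where the probe's model parameters are compared with parameters stored in the gallery, can be sketched as a nearest-neighbor search. This is a minimal illustration and not the YMO algorithm itself: the Euclidean distance, the function name `identify`, the three-dimensional parameter vectors, and the subject identifiers are all assumptions, since the abstract does not state the metric or the parameter layout.

```python
import numpy as np

def identify(probe_params, gallery_params, gallery_ids):
    """Return the gallery identity whose stored parameters are closest to the probe."""
    gallery = np.asarray(gallery_params, dtype=float)   # shape (n_subjects, n_params)
    probe = np.asarray(probe_params, dtype=float)       # shape (n_params,)
    # Euclidean distance is an assumed metric, chosen only for illustration.
    distances = np.linalg.norm(gallery - probe, axis=1)
    best = int(distances.argmin())
    return gallery_ids[best], float(distances[best])

# Hypothetical parameter vectors, one per enrolled subject.
gallery_params = [[0.0, 1.0, 2.0], [3.0, 3.0, 3.0], [-1.0, 0.5, 0.0]]
gallery_ids = ["subject_a", "subject_b", "subject_c"]

match_id, distance = identify([2.9, 3.2, 3.1], gallery_params, gallery_ids)
print(match_id, distance)
```

Because the gallery parameters are computed once at enrollment, each query costs one model fit for the probe plus a vector comparison per enrolled subject, rather than a full registration against every gallery image.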

    Multimodal Biometric Systems for Personal Identification and Authentication using Machine and Deep Learning Classifiers

    Multimodal biometrics, using machine and deep learning, has recently gained interest over single biometric modalities. This interest stems from the fact that this technique improves recognition and, thus, provides more security. In fact, by combining the abilities of single biometrics, the fusion of two or more biometric modalities creates a robust recognition system that is resistant to the flaws of individual modalities. However, the recognition performance of multimodal systems depends on multiple factors, such as the fusion scheme, fusion technique, feature extraction techniques, and classification method. In machine learning, existing works generally use different algorithms for feature extraction of modalities, which makes the system more complex. On the other hand, deep learning, with its ability to extract features automatically, has made recognition more efficient and accurate. Studies deploying deep learning algorithms in multimodal biometric systems have tried to find a good compromise between the false acceptance and the false rejection rates (FAR and FRR) to choose the threshold in the matching step. This manual choice is not optimal and depends on the expertise of the solution designer, hence the need to automate this step. From this perspective, the second part of this thesis details an end-to-end CNN algorithm with an automatic matching mechanism. This thesis has conducted two studies on face and iris multimodal biometric recognition. The first study proposes a new feature extraction technique for biometric systems based on machine learning. Iris and facial feature extraction is performed using the Discrete Wavelet Transform (DWT) combined with the Singular Value Decomposition (SVD). The relevant characteristics of the two modalities are merged to create a pattern for each individual in the dataset. 
The experimental results show the robustness of our proposed technique and the efficiency of using the same feature extraction technique for both modalities. The proposed method outperformed the state-of-the-art and gave an accuracy of 98.90%. The second study proposes a deep learning approach using DenseNet121 and FaceNet for iris and face multimodal recognition using feature-level fusion and a new automatic matching technique. The proposed automatic matching approach does not rely on a threshold to reach a compromise between performance and FAR and FRR errors. Instead, it uses a trained multilayer perceptron (MLP) model that automatically classifies people into two classes: recognized and unrecognized. This platform ensures an accurate and fully automatic process of multimodal recognition. The results obtained by the DenseNet121-FaceNet model with feature-level fusion and automatic matching are very satisfactory. The proposed deep learning models give 99.78% accuracy and 99.56% precision, with an FRR of 0.22% and no FAR errors. The platform solutions proposed and developed in this thesis were tested and validated in two different case studies, the central pharmacy of Al-Asria Eye Clinic in Dubai and the Abu Dhabi Police General Headquarters (Police GHQ). The solution allows fast identification of the persons authorized to access the different rooms. It thus protects the pharmacy against any medication abuse and the red zone in the military zone against the unauthorized use of weapons.
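
The threshold-dependent trade-off between FAR and FRR that the abstract above refers to can be illustrated with a small sketch. The scores below are made up, the function name `far_frr` is hypothetical, and the convention that a comparison is accepted when its similarity score is at or above the threshold is an assumption; none of this reproduces the thesis's MLP-based matching.

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """Compute (FAR, FRR) for similarity scores accepted when score >= threshold."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    far = float(np.mean(impostor >= threshold))  # impostors wrongly accepted
    frr = float(np.mean(genuine < threshold))    # genuine users wrongly rejected
    return far, frr

# Made-up similarity scores for genuine and impostor comparisons.
genuine = [0.91, 0.85, 0.40, 0.77]
impostor = [0.10, 0.35, 0.62, 0.20]

for threshold in (0.3, 0.5, 0.7):
    far, frr = far_frr(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers FAR and can raise FRR, which is why a hand-picked threshold reflects a designer's judgment rather than a fixed property of the system.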

    Infrared face recognition: a comprehensive review of methodologies and databases

    Automatic face recognition is an area with immense practical potential which includes a wide range of commercial and law enforcement applications. Hence it is unsurprising that it continues to be one of the most active research areas of computer vision. Even after over three decades of intense research, the state-of-the-art in face recognition continues to improve, benefitting from advances in a range of different research fields such as image processing, pattern recognition, computer graphics, and physiology. Systems based on visible spectrum images, the most researched face recognition modality, have reached a significant level of maturity with some practical success. However, they continue to face challenges in the presence of illumination, pose and expression changes, as well as facial disguises, all of which can significantly decrease recognition accuracy. Amongst various approaches which have been proposed in an attempt to overcome these limitations, the use of infrared (IR) imaging has emerged as a particularly promising research direction. This paper presents a comprehensive and timely review of the literature on this subject. Our key contributions are: (i) a summary of the inherent properties of infrared imaging which make this modality promising in the context of face recognition, (ii) a systematic review of the most influential approaches, with a focus on emerging common trends as well as key differences between alternative methodologies, (iii) a description of the main databases of infrared facial images available to the researcher, and lastly (iv) a discussion of the most promising avenues for future research. Comment: Pattern Recognition, 2014. arXiv admin note: substantial text overlap with arXiv:1306.160

    Deep Adversarial Frameworks for Visually Explainable Periocular Recognition

    Machine Learning (ML) models have pushed state-of-the-art performance closer to (and even beyond) human level. However, the core of such algorithms is usually latent and hardly understandable. Thus, the field of Explainability focuses on researching and adopting techniques that can explain the reasons that support a model’s predictions. Such explanations of the decision-making process would help to build trust between said model and the human(s) using it. An explainable system also allows for better debugging during the training phase and fixing upon deployment. But why should a developer devote time and effort to refactoring or rethinking Artificial Intelligence (AI) systems to make them more transparent? Don’t they work just fine? Despite the temptation to answer “yes”, are we really considering the cases where these systems fail? Are we assuming that “almost perfect” accuracy is good enough? What if some of the cases where these systems get it right were just a small margin away from a complete miss? Does that even matter? Considering the ever-growing presence of ML models in crucial areas like forensics, security and healthcare services, it clearly does. Motivating these concerns is the fact that powerful systems often operate as black-boxes, hiding the core reasoning underneath layers of abstraction [Gue]. In this scenario, there could be some seriously negative outcomes if opaque algorithms gamble on the presence of tumours in X-ray images or the way autonomous vehicles behave in traffic. It becomes clear, then, that incorporating explainability with AI is imperative. More recently, politicians have addressed this urgency through the General Data Protection Regulation (GDPR) [Com18]. With this document, the European Union (EU) brings forward several important concepts, amongst which is the “right to an explanation”. 
The definition and scope are still subject to debate [MF17], but these are definite strides to formally regulate the explainable depth of autonomous systems. Based on the preface above, this work describes a periocular recognition framework that not only performs biometric recognition but also provides clear representations of the features/regions that support a prediction. Being particularly designed to explain non-match (“impostor”) decisions, our solution uses adversarial generative techniques to synthesise a large set of “genuine” image pairs, from which the elements most similar to a query are retrieved. Then, assuming alignment between the query and retrieved pairs, the element-wise differences between the query and a weighted average of the retrieved elements yield a visual explanation of the regions in the query pair that would have to be different to transform it into a “genuine” pair. Our quantitative and qualitative experiments validate the proposed solution, yielding recognition rates that are similar to the state-of-the-art, while adding visually pleasing explanations.