
    Advanced Biometrics with Deep Learning

    Biometrics, such as fingerprint, iris, face, handprint, hand-vein, speech, and gait recognition, have become commonplace as a means of identity management across many applications. Biometric systems typically follow a pipeline composed of separate preprocessing, feature extraction, and classification stages. Deep learning, as a data-driven representation learning approach, has proven a promising alternative to conventional, data-agnostic, handcrafted preprocessing and feature extraction for biometric systems. Furthermore, deep learning offers an end-to-end learning paradigm that unifies preprocessing, feature extraction, and recognition based solely on biometric data. This Special Issue collects 12 high-quality, state-of-the-art research papers that address challenging issues in advanced deep-learning-based biometric systems. The 12 papers fall into 4 categories according to biometric modality: face biometrics, medical electronic signals (EEG and ECG), voice print, and others.
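    The end-to-end paradigm described above can be illustrated with a minimal PyTorch sketch in which a single network maps raw biometric samples directly to identity logits, replacing the separate preprocessing, feature extraction, and classification stages. The architecture, input size, and number of identities are illustrative assumptions, not taken from any of the collected papers.

        # Minimal sketch of end-to-end biometric recognition: one trainable
        # network stands in for the separate preprocessing, feature
        # extraction, and classification stages of the classical pipeline.
        # Architecture and sizes are illustrative assumptions only.
        import torch
        import torch.nn as nn

        class EndToEndBiometricNet(nn.Module):
            def __init__(self, num_identities: int):
                super().__init__()
                self.features = nn.Sequential(   # learned, data-driven features
                    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                )
                self.classifier = nn.Linear(64 * 16 * 16, num_identities)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                # x: raw biometric samples, e.g. 64x64 grayscale fingerprint crops
                h = self.features(x).flatten(1)
                return self.classifier(h)        # identity logits

        model = EndToEndBiometricNet(num_identities=100)
        logits = model(torch.randn(8, 1, 64, 64))  # batch of 8 raw images
        print(logits.shape)                        # torch.Size([8, 100])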

    Homogeneous and Heterogeneous Face Recognition: Enhancing, Encoding and Matching for Practical Applications

    Face recognition is the automatic processing of face images for the purpose of recognizing individuals. The recognition task becomes especially challenging in surveillance applications, where images are acquired at long range in difficult environments. Short Wave Infrared (SWIR) is an emerging imaging modality able to produce clear long-range images in difficult environments or at night. Despite the benefits of SWIR technology, matching SWIR images against a gallery of visible images presents a challenge, since the photometric properties of images in the two spectral bands are highly distinct.

    In this dissertation, we describe a cross-spectral matching method that encodes the magnitude and phase of multi-spectral face images filtered with a bank of Gabor filters. The magnitude of the filtered images is encoded with Simplified Weber Local Descriptor (SWLD) and Local Binary Pattern (LBP) operators; the phase is encoded with the Generalized Local Binary Pattern (GLBP) operator. Encoded multi-spectral images are mapped into a histogram representation and cross-matched by applying the symmetric Kullback-Leibler distance. Performance of the developed algorithm is demonstrated on the TINDERS database, which contains long-range SWIR and color images acquired at distances of 2, 50, and 106 meters.

    Apart from long acquisition range, other variations and distortions, such as pose variation, motion and out-of-focus blur, and uneven illumination, may be observed in multispectral face images. Recognition performance of the face matcher can be greatly affected by these distortions. It is therefore important to ensure that matching is performed on high-quality images: poor-quality images have to be either enhanced or discarded. This dissertation addresses the problem of selecting good-quality samples.

    The last chapters of the dissertation suggest a number of modifications to the cross-spectral matching algorithm for matching low-resolution color images in near-real time. We show that the method that encodes the magnitude of Gabor-filtered images with the SWLD operator guarantees high recognition rates. The modified method (Gabor-SWLD) is adopted in a camera network setup where cameras acquire several views of the same individual. The designed algorithm and software are fully automated and optimized to perform recognition in near-real time. We evaluate the recognition performance and the processing time of the method on a small dataset collected at WVU.
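    As a rough illustration of the magnitude-encoding path described above, the following hedged Python sketch filters an image with a small Gabor bank, encodes the magnitude responses with the standard LBP operator (standing in for the dissertation's SWLD and GLBP operators, which are not reproduced here), builds histogram descriptors, and compares them with the symmetric Kullback-Leibler distance. All filter parameters and bin counts are illustrative assumptions.

        # Sketch of Gabor-magnitude encoding, histogram representation, and
        # symmetric Kullback-Leibler matching. Parameters are assumptions.
        import numpy as np
        from skimage.filters import gabor
        from skimage.feature import local_binary_pattern

        def encode(image, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
            """Gabor-magnitude + LBP histogram descriptor for one face image."""
            hists = []
            for theta in thetas:
                real, imag = gabor(image, frequency=0.2, theta=theta)
                magnitude = np.hypot(real, imag)   # Gabor magnitude response
                codes = local_binary_pattern(magnitude, P=8, R=1, method="uniform")
                h, _ = np.histogram(codes, bins=10, range=(0, 10), density=True)
                hists.append(h)
            return np.concatenate(hists)

        def symmetric_kl(p, q, eps=1e-10):
            """Symmetric Kullback-Leibler distance between two histograms."""
            p, q = p + eps, q + eps
            return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

        # score = symmetric_kl(encode(swir_probe), encode(visible_gallery_image))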

    Face Image and Video Analysis in Biometrics and Health Applications

    Computer Vision (CV) enables computers and systems to derive meaningful information from acquired visual inputs, such as images and videos, and to make decisions based on the extracted information. Its goal is to acquire, process, analyze, and understand that information by developing theoretical and algorithmic models. Biometrics are distinctive and measurable human characteristics used to label or describe individuals, combining computer vision with knowledge of human physiology (e.g., face, iris, fingerprint) and behavior (e.g., gait, gaze, voice). The face is one of the most informative biometric traits, and many studies have investigated it from the perspectives of disciplines ranging from computer vision and deep learning to neuroscience and biometrics. In this work, we analyze face characteristics from digital images and videos in the areas of morphing attack and defense and autism diagnosis. For face morphing attack generation, we propose a transformer-based generative adversarial network that produces more visually realistic morphing attacks by combining different losses: face matching distance, a facial-landmark-based loss, perceptual loss, and pixel-wise mean squared error. For face morphing attack detection, we design a fusion-based few-shot learning (FSL) method to learn discriminative features from face images for few-shot morphing attack detection (FS-MAD), and extend the current binary detection into multiclass classification, namely few-shot morphing attack fingerprinting (FS-MAF). For autism diagnosis, we develop a discriminative few-shot learning method to analyze hour-long video data and explore the fusion of facial dynamics for facial trait classification of autism spectrum disorder (ASD) at three severity levels. The results show outstanding performance of the proposed fusion-based few-shot framework on the dataset. In addition, we explore the possibility of performing facial micro-expression spotting and feature analysis on autism video data to classify ASD and control groups; the results indicate the effectiveness of subtle facial expression changes for autism diagnosis.
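    The combined generator objective described above can be sketched as a weighted sum of its four components. In this hedged PyTorch sketch the face-embedding, landmark, and perceptual networks are passed in as placeholder callables, and the weights are illustrative assumptions rather than the paper's actual architectures or coefficients.

        # Hedged sketch of a combined morph-generation loss: face-matching
        # distance, landmark loss, perceptual loss, and pixel-wise MSE.
        # face_embed, landmarks, and perceptual are placeholder callables.
        import torch
        import torch.nn.functional as F

        def morph_generator_loss(morph, subject_a, subject_b,
                                 face_embed, landmarks, perceptual,
                                 w_id=1.0, w_lmk=1.0, w_perc=1.0, w_pix=1.0):
            # Face-matching distance: the morph should stay close to both
            # subjects in a face-recognition embedding space.
            id_loss = sum(1 - F.cosine_similarity(face_embed(morph),
                                                  face_embed(s)).mean()
                          for s in (subject_a, subject_b))
            # Landmark loss: geometric consistency with averaged landmarks.
            target_lmk = 0.5 * (landmarks(subject_a) + landmarks(subject_b))
            lmk_loss = F.l1_loss(landmarks(morph), target_lmk)
            # Perceptual loss: feature-space distance to both subjects.
            perc_loss = sum(F.l1_loss(perceptual(morph), perceptual(s))
                            for s in (subject_a, subject_b))
            # Pixel-wise MSE against the naive average of the two subjects.
            pix_loss = F.mse_loss(morph, 0.5 * (subject_a + subject_b))
            return (w_id * id_loss + w_lmk * lmk_loss
                    + w_perc * perc_loss + w_pix * pix_loss)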

    Machine Learning Approaches to Human Body Shape Analysis

    Soft biometrics, the biomedical sciences, and many other fields pay particular attention to the geometric description of the human body and its variations. Despite numerous contributions, interest remains high given the non-rigid nature of the human body, which can assume different poses and numerous shapes due to variable body composition. Unfortunately, a well-known and costly requirement in data-driven machine learning, and particularly in human-based analysis, is the availability of data in the form of geometric information (body measurements) paired with vision information (natural images, 3D meshes, etc.). We introduce a computer graphics framework able to generate thousands of synthetic human body meshes, representing a population of individuals with stratified information: gender, Body Fat Percentage (BFP), anthropometric measurements, and pose. This contribution permits an extensive analysis of different bodies in different poses while avoiding a demanding and expensive acquisition process. We design a virtual environment that takes advantage of the generated bodies to infer body surface area (BSA) from a single view. The framework can simulate the acquisition process of newly introduced RGB-D devices, disentangling different noise components (sensor noise, optical distortion, body part occlusions). Common geometric descriptors in soft biometrics, as well as in the biomedical sciences, are based on body measurements. Unfortunately, as we prove, these descriptors are not pose invariant, which constrains their usability to controlled scenarios. We introduce a differential geometry approach that treats body pose variations as isometric transformations of the body surface and body composition changes as covariant with the body surface area. This setting permits the use of the Laplace-Beltrami operator on the 2D body manifold, describing the body with a compact, efficient, and pose-invariant representation. We design a neural network architecture able to infer important body semantics from spectral descriptors, closing the gap between abstract spectral features and traditional measurement-based indices. Studying the manifold of body shapes, we propose an innovative generative adversarial model able to learn body shapes; the method generates new bodies with unseen geometries as a walk on the latent space, a significant advantage over traditional generative methods.
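    The spectral representation described above can be sketched as follows: build the cotangent Laplacian of a triangle body mesh and take its smallest eigenvalues as a compact, isometry-invariant (hence pose-invariant) shape code. This is a minimal ShapeDNA-style sketch under stated assumptions; the area-based mass matrix and the network mapping descriptors to body semantics are omitted, and the mesh arrays are assumed given.

        # Minimal spectral shape descriptor: eigenvalues of the cotangent
        # Laplacian of a triangle mesh are invariant to isometric deformations
        # such as pose changes. verts: (n, 3) float array; faces: (m, 3) ints.
        import numpy as np
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        def cotangent_laplacian(verts, faces):
            """Sparse cotangent Laplacian L = D - W of a triangle mesh."""
            n = len(verts)
            rows, cols, vals = [], [], []
            for (i, j, k) in faces:
                for (a, b, c) in ((i, j, k), (j, k, i), (k, i, j)):
                    # cotangent of the angle at vertex c, opposite edge (a, b)
                    u, v = verts[a] - verts[c], verts[b] - verts[c]
                    w = 0.5 * np.dot(u, v) / np.linalg.norm(np.cross(u, v))
                    rows += [a, b]; cols += [b, a]; vals += [w, w]
            W = sp.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()
            return sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W

        def spectral_descriptor(verts, faces, k=20):
            """Smallest k Laplacian eigenvalues: a compact, pose-invariant code."""
            L = cotangent_laplacian(verts, faces)
            # shift-invert near zero to retrieve the smallest eigenvalues
            eigvals = spla.eigsh(L, k=k, sigma=-1e-8, which="LM")[0]
            return eigvals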

    Towards Unsupervised Domain Adaptation for Diabetic Retinopathy Detection in the Tromsø Eye Study

    Diabetic retinopathy (DR) is an eye disease which affects a third of the diabetic population. It is preventable, but requires early detection for efficient treatment. While there has been increasing interest in applying deep learning techniques for DR detection to help practitioners make more accurate diagnoses, these efforts mainly focus on datasets that were collected or created with machine learning in mind. In this thesis, we instead consider two datasets collected at the University Hospital of North Norway (UNN). These datasets have inherent problems, such as a variable number of input images per patient and domain shift, that motivate the methodological choices in this work. As a remedy, we contribute a multi-stream deep learning architecture for DR classification that is uniquely tailored to these datasets: it can model dependencies across different images, accepts a variable number of input images, processes every image identically no matter which stream it enters, and is compatible with the domain adaptation method ADDA (and, we argue, with many other methods). We illustrate how domain adaptation can be utilized within the framework to learn efficiently in the presence of domain shift. Our experiments demonstrate the model's properties empirically and show that it can deal with each of the presented problems. The model this thesis contributes is a first step towards DR detection from these local datasets and, in the bigger picture, from similar datasets worldwide.
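    A minimal PyTorch sketch of the shared-weight multi-stream idea: every image of a patient passes through the same encoder, and per-stream features are fused before classification, so the model accepts any number of input images. Mean pooling is used here for simplicity and does not reproduce the thesis's dependency modeling across streams; the encoder and all sizes are illustrative assumptions.

        # Hedged sketch of a shared-weight multi-stream classifier for a
        # variable number of fundus images per patient. Sizes are assumptions.
        import torch
        import torch.nn as nn

        class MultiStreamDRClassifier(nn.Module):
            def __init__(self, feat_dim=64, num_classes=2):
                super().__init__()
                self.encoder = nn.Sequential(    # shared across all streams
                    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                    nn.Linear(16 * 4 * 4, feat_dim), nn.ReLU(),
                )
                self.head = nn.Linear(feat_dim, num_classes)

            def forward(self, images: torch.Tensor) -> torch.Tensor:
                # images: (num_streams, 3, H, W) -- any number of views
                feats = self.encoder(images)     # (num_streams, feat_dim)
                pooled = feats.mean(dim=0)       # order-invariant fusion
                return self.head(pooled)

        model = MultiStreamDRClassifier()
        print(model(torch.randn(5, 3, 128, 128)).shape)  # 5 images per patient
        print(model(torch.randn(2, 3, 128, 128)).shape)  # works equally for 2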

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Scaling full seismic waveform inversions

    The main goal of this research is to scale full seismic waveform inversions using the adjoint-state method to the data volumes that are nowadays available in seismology. Practical issues hinder the routine application of this, to a certain extent theoretically well understood, method; to a large part this comes down to outdated or missing tools and to the lack of ways to automate the highly iterative procedure reliably. This thesis tackles these issues in three successive stages. It first introduces a modern and properly designed data processing framework sitting at the very core of all the subsequent developments: the ObsPy toolkit, a Python library providing a bridge for seismology into the scientific Python ecosystem and giving seismologists effortless I/O and a powerful signal processing library, amongst other things. The following chapter deals with a framework designed to handle the specific data management and organization issues arising in full seismic waveform inversions, the Large-scale Seismic Inversion Framework, created to orchestrate the various pieces of data accruing in the course of an iterative waveform inversion. Then the Adaptable Seismic Data Format, a new, self-describing, and scalable data format for seismology, is introduced, along with the rationale for why it is needed for full waveform inversions in particular and for seismology in general. Finally, these developments are put into service to construct a novel full seismic waveform inversion model of the elastic subsurface structure beneath the North American continent and the Northern Atlantic, extending well into Europe. The spectral element method is used for the forward and adjoint simulations, coupled with windowed time-frequency phase misfit measurements. Later iterations use 72 events, all occurring after the USArray project commenced, resulting in approximately 150,000 three-component recordings that are inverted for. 20 L-BFGS iterations yield a model that can produce complete seismograms at periods between 30 and 120 seconds while comparing favorably to observed data.
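    As a small, hedged example of the kind of ObsPy-based processing this work builds on, the following snippet reads a waveform file (the filename is a placeholder) and band-passes it to the 30-120 second period band used in the inversion.

        # Read a waveform and filter it to the inversion's 30-120 s period band.
        from obspy import read

        st = read("example_recording.mseed")        # any ObsPy-supported format
        st.detrend("linear")                        # remove linear trend
        st.taper(max_percentage=0.05)               # cosine taper before filtering
        st.filter("bandpass", freqmin=1.0 / 120.0,  # 120 s period -> 1/120 Hz
                  freqmax=1.0 / 30.0)               # 30 s period  -> 1/30 Hz
        print(st)                                   # stream summary after processing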