Illumination tolerance in facial recognition
In this research work, five preprocessing techniques were tested with two classifiers to find the best preprocessor + classifier combination for building an illumination-tolerant face recognition system. A face recognition system is therefore proposed, based on illumination normalization techniques and linear subspace models with two distance metrics, and evaluated on three challenging databases: the CAS-PEAL database, the Extended Yale B database, and the AT&T database. The research takes the form of experimentation and analysis in which five illumination normalization techniques were compared using two distance metrics. The recognition performance and execution time of each technique were recorded to assess accuracy and efficiency. The illumination normalization techniques were Gamma Intensity Correction (GIC), Discrete Cosine Transform (DCT), Histogram Remapping using Normal distribution (HRN), Histogram Remapping using Log-normal distribution (HRL), and the Anisotropic Smoothing technique (AS). The linear subspace models were Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). The two distance metrics were the Euclidean and cosine distances. The results showed that for databases with both illumination (shadows) and lighting (over-exposure) variations, like the CAS-PEAL database, histogram remapping with a normal distribution produced excellent results when the cosine distance was used as the classifier: a 65% recognition rate at 15.8 ms/img. For databases consisting of pure illumination variation, like the Extended Yale B database, Gamma Intensity Correction (GIC) combined with the Euclidean distance metric gave the most accurate result, with 95.4% recognition accuracy at 1 ms/img. The experiments further indicated that the cosine distance produces more accurate results than the Euclidean distance metric. However, the Euclidean distance was faster than the cosine distance in all the experiments conducted.
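The pipeline this abstract describes (illumination normalization, projection onto a linear subspace, nearest-neighbour matching under a distance metric) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the gamma grid, and the use of a single reference image for GIC are assumptions made for illustration. Only GIC and PCA are shown, and LDA is omitted.

```python
import numpy as np

def gamma_intensity_correction(img, reference, gammas=np.linspace(0.2, 3.0, 29)):
    """Pick the gamma whose power-law mapping of img best matches a reference.

    Assumes both images are scaled to [0, 1]; the grid of candidate gammas
    is an illustrative choice, not a value taken from the abstract.
    """
    img = np.clip(np.asarray(img, dtype=np.float64), 0.0, 1.0)
    errors = [np.sum((img ** g - reference) ** 2) for g in gammas]
    best = float(gammas[int(np.argmin(errors))])
    return img ** best, best

def fit_pca(X, n_components):
    """Return the mean and the top principal axes of row-vector data X."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, axes):
    """Project row vectors onto the PCA subspace."""
    return (X - mean) @ axes.T

def euclidean_distance(a, b):
    return np.linalg.norm(a - b, axis=-1)

def cosine_distance(a, b):
    num = np.sum(a * b, axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return 1.0 - num / np.maximum(den, 1e-12)

def nearest_neighbor(gallery, labels, probe, metric):
    """Label of the gallery feature vector closest to the probe."""
    return labels[int(np.argmin(metric(gallery, probe)))]
```

Both metrics plug into the same `nearest_neighbor` call, so swapping `euclidean_distance` for `cosine_distance` is the only change needed to compare them on the same projected features. The cosine distance needs two extra norms per comparison, which makes it the slightly more expensive of the two in this sketch.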
Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
Two approaches are proposed for cross-pose face recognition, one is based on
the 3D reconstruction of facial components and the other is based on the deep
Convolutional Neural Network (CNN). Unlike most 3D approaches that consider
holistic faces, the proposed approach considers 3D facial components. It
segments a 2D gallery face into components, reconstructs the 3D surface for
each component, and recognizes a probe face by component features. The
segmentation is based on the landmarks located by a hierarchical algorithm that
combines the Faster R-CNN for face detection and the Reduced Tree Structured
Model for landmark localization. The core part of the CNN-based approach is a
revised VGG network. We study the performances with different settings on the
training set, including the synthesized data from 3D reconstruction, the
real-life data from an in-the-wild database, and both types of data combined.
We investigate the performances of the network when it is employed as a
classifier or designed as a feature extractor. The two recognition approaches
and the fast landmark localization are evaluated in extensive experiments, and
compared to state-of-the-art methods to demonstrate their efficacy. Comment: 14 pages, 12 figures, 4 tables
On Designing Tattoo Registration and Matching Approaches in the Visible and SWIR Bands
Face, iris, and fingerprint-based biometric systems are well-explored areas of research. However, there are law enforcement and military applications where none of these modalities may be available for human identification. In such applications, soft biometrics may be the only clue available for identification or verification purposes. A tattoo is an example of such a soft biometric trait. Unlike face-based biometric systems, which are used in both same-spectral and cross-spectral matching scenarios, tattoo-based human identification is still not a fully explored area of research. At this point in time there are no pre-processing, feature extraction, and matching algorithms that use tattoo images captured at multiple bands. This thesis focuses on exploring solutions to two main challenging problems. The first is cross-spectral tattoo matching. The proposed algorithmic approach takes raw Short-Wave Infrared (SWIR) band tattoo images as input and matches them successfully against their visible band counterparts. The SWIR tattoo images are captured at 1100 nm, 1200 nm, 1300 nm, 1400 nm, and 1500 nm. After an empirical study in which multiple photometric normalization techniques were used to pre-process the original multi-band tattoo images, only one was found to significantly improve cross-spectral tattoo matching performance. The second challenging problem was to develop a fully automatic visible-based tattoo image registration system based on SIFT descriptors and the RANSAC algorithm with a homography model. The proposed automated registration approach significantly reduces the operational cost of a tattoo image identification system (using large-scale tattoo image datasets), in which the alignment of a pair of tattoo images otherwise needs to be performed manually by system operators. At the same time, tattoo matching accuracy is also improved (before vs. after automated alignment) by 45.87% for the NIST-Tatt-C database and 12.65% for the WVU-Tatt database.
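The registration step named in this abstract, RANSAC with a homography model, can be sketched in NumPy as follows. The sketch assumes that point correspondences between the two tattoo images are already available (for example, from matching SIFT descriptors, which is not shown). The function names, iteration count, and pixel threshold are hypothetical choices, not values from the thesis.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: 3x3 H with dst ~ H @ src (needs >= 4 pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The null-space direction of the constraint matrix is the last row of vt.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    return vt[-1].reshape(3, 3)

def apply_homography(h, pts):
    """Map 2D points through H, dividing out the homogeneous coordinate."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, n_iters=500, threshold=2.0, seed=0):
    """Estimate H from correspondences that may contain outliers.

    threshold is the reprojection error (in pixels) below which a
    correspondence counts as an inlier; it is an illustrative default.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # A homography has 8 degrees of freedom, so 4 pairs form a minimal sample.
        idx = rng.choice(len(src), size=4, replace=False)
        h = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(apply_homography(h, src) - dst, axis=1)
        inliers = err < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("not enough inliers to fit a homography")
    # Refit on all inliers of the best candidate model.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

The returned matrix is defined only up to scale; dividing it by its bottom-right entry gives the conventional normalized form. In practice the source coordinates are often normalized before the DLT step for better conditioning, which this sketch leaves out for brevity.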
Learning to Personalize in Appearance-Based Gaze Tracking
Personal variations severely limit the performance of appearance-based gaze
tracking. Adapting to these variations using standard neural network model
adaptation methods is difficult. The problems range from overfitting, due to
small amounts of training data, to underfitting, due to restrictive model
architectures. We tackle these problems by introducing the SPatial Adaptive
GaZe Estimator (SPAZE). By modeling personal variations as a low-dimensional
latent parameter space, SPAZE provides just enough adaptability to capture the
range of personal variations without being prone to overfitting. Calibrating
SPAZE for a new person reduces to solving a small optimization problem. SPAZE
achieves an error of 2.70 degrees with 9 calibration samples on MPIIGaze,
improving on the state of the art by 14%. We contribute to gaze tracking
research by empirically showing that personal variations are well-modeled as a
3-dimensional latent parameter space for each eye. We show that this
low-dimensionality is expected by examining model-based approaches to gaze
tracking. We also show that accurate head pose-free gaze tracking is possible
- …