Cyclic Style Generative Adversarial Network for Near Infrared and Visible Light Face Recognition
Face recognition in the visible light (VIS) spectrum is widely used in practical applications. With the development of deep learning, recognition accuracy and speed have reached a level at which face recognition can be applied in a wide range of circumstances. However, in some extreme situations face recognition still cannot guarantee performance. One of the most significant cases is poor illumination: without adequate light sources, images cannot reveal the true identities of the people detected. To address this problem, the near infrared (NIR) spectrum offers an alternative in which face images can be captured clearly. This approach has been studied in recent years, and current near infrared and visible light (NIR-VIS) face recognition methods have achieved strong performance.
In this thesis, I review current NIR-VIS face recognition methods and public NIR-VIS face datasets. I first list the public NIR-VIS face datasets used in most research. For each dataset, I present its characteristics, including the number of subjects, the collection environment, the image resolution, and whether the images are paired. I also summarize the evaluation protocols for each dataset to support further performance analysis. Then, I classify current NIR-VIS face recognition methods into three categories: image synthesis-based methods, subspace learning-based methods, and invariant feature-based methods. The contribution of each method is concisely explained. Additionally, I compare current NIR-VIS face recognition methods and give my own view of their advantages and disadvantages.
To address the shortcomings of current methods, this thesis proposes a new model, the Cyclic Style Generative Adversarial Network (CS-GAN), which combines the image synthesis-based and subspace learning-based approaches. The proposed CS-GAN improves both the visual quality of image synthesis between the NIR and VIS domains and the recognition accuracy. CS-GAN is based on the Style-GAN 3 network, which was proposed in 2021. The proposed model has two generators, taken from pre-trained Style-GAN 3, which generate images in the NIR domain and the VIS domain, respectively. Each generator consists of a mapping network and a synthesis network: the mapping network disentangles the latent code to reduce correlation between features, and the synthesis network synthesizes face images through progressive growing training. The generators have different final layers: a to-RGB layer for the VIS domain and a to-grayscale layer for the NIR domain. The generators are embedded in a cyclic structure, in which latent codes are sent into the synthesis network of the other generator to recreate images, and the recreated images are compared with real images in the same domain to ensure domain consistency. In addition, I apply the proposed cyclic subspace learning, which is composed of two parts. The first part introduces the proposed latent loss, which gives better control over the learning of the latent subspace. The latent codes influence both the details and the locations of features because they are continuously fed into the synthesis network, so control over the latent subspace strengthens the feature consistency between synthesized images. The second part improves the style-transfer process by controlling high-level features with a perceptual loss in each domain. The perceptual loss uses a pre-trained VGG-16 network to extract high-level features, which can be regarded as the style of the images.
Therefore, the style loss can control the style of images in both domains and ensure style consistency between synthesized images and real images. The visualization results show that the proposed CS-GAN model can synthesize better VIS images that are detailed, correctly colorized, and have clear edges. More importantly, the experimental results show that the Rank-1 accuracy on the CASIA NIR-VIS 2.0 database reaches 99.60%, which improves on state-of-the-art methods by 0.2%.
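
For readers unfamiliar with the metric, Rank-1 accuracy is the fraction of probe images whose highest-scoring gallery image belongs to the same person. The sketch below illustrates that definition only; it is not the thesis's evaluation code. The function name `rank1_accuracy`, the identity labels, and the similarity scores are made up for illustration, and it assumes a precomputed probe-by-gallery similarity matrix in which higher means more similar.

```python
from typing import Sequence


def rank1_accuracy(
    scores: Sequence[Sequence[float]],
    probe_ids: Sequence[str],
    gallery_ids: Sequence[str],
) -> float:
    """Fraction of probes whose best-scoring gallery image has the same identity.

    scores[i][j] is the similarity between probe i and gallery image j
    (higher means more similar).
    """
    if not probe_ids:
        raise ValueError("at least one probe is required")
    if len(scores) != len(probe_ids):
        raise ValueError("one score row is required per probe")
    hits = 0
    for row, probe_id in zip(scores, probe_ids):
        if len(row) != len(gallery_ids):
            raise ValueError("one score is required per gallery image")
        # Index of the gallery image with the highest similarity to this probe.
        best = max(range(len(row)), key=row.__getitem__)
        hits += gallery_ids[best] == probe_id
    return hits / len(probe_ids)


# Toy data (hypothetical): 3 NIR probes scored against 3 VIS gallery images.
gallery_ids = ["anna", "ben", "cara"]
probe_ids = ["anna", "ben", "cara"]
scores = [
    [0.91, 0.10, 0.22],  # top match is anna: correct
    [0.15, 0.40, 0.78],  # top match is cara, true identity is ben: wrong
    [0.05, 0.30, 0.88],  # top match is cara: correct
]
print(rank1_accuracy(scores, probe_ids, gallery_ids))  # 2 of 3 probes matched
```

In a real NIR-VIS evaluation, the scores would come from comparing features of NIR probe images with features of VIS gallery images under the dataset's own protocol.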
Computational Face Recognition Using Machine Learning Models
Faces are among the most complex stimuli that the human visual system
processes. Growing commercial interest in face recognition is encouraging, but it
also turns out to be a challenging endeavour. These challenges arise in
complex situations where facial appearance varies owing to, for example,
occlusion, low resolution, and ageing. Computer-based face recognition using
partial facial data, and how a computer interprets the various parts of the
face, remains a largely unexplored area of research. Another challenge is age
progression and regression, which is considered to be the most revealing topic
for understanding how the human face changes over a lifetime.
In this research, various computational face recognition models are
investigated to overcome the challenges posed by ageing and occlusions/partial
faces. For partial face-based face recognition, a pre-trained VGGF model is
employed for feature extraction, followed by popular classifiers such as
SVMs and Cosine Similarity (CS) for classification. In this framework, parts of
faces such as the eyes, nose, and forehead are used individually for training
and testing. The results show an improvement in recognition for small parts:
the recognition rate rose from about 0% to nearly 35% for the forehead and
from about 22% to approximately 65% for the eyes. In the second framework, five sub-models
were built based on Convolutional Neural Networks (CNNs) and those models
are named Eyes-CNNs, Nose-CNNs, Mouth-CNNs, Forehead-CNNs, and
combined EyesNose-CNNs. The experimental results show a high recognition
rate for small parts; for example, the eyes reached about 90.83% and the
forehead about 44.5%. Furthermore, the challenge of face
ageing is also approached by proposing an age-template based framework that
generates an age-based face template for enhanced face generation and
recognition. The results show that the newly generated aged faces are more reliable
compared with state-of-the-art
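
The first framework above (features from a pre-trained network, then a Cosine Similarity classifier applied to one face part at a time) can be illustrated with a minimal sketch. This is not the author's implementation: the 4-dimensional vectors stand in for real VGGF features, and the names `cosine_similarity`, `classify_part`, `eyes_gallery`, and `eyes_probe` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def classify_part(probe: Sequence[float], gallery: Dict[str, List[float]]) -> str:
    """Return the gallery identity whose part feature is most similar to the probe."""
    return max(gallery, key=lambda identity: cosine_similarity(probe, gallery[identity]))


# Made-up features for the eye region of two enrolled subjects (assumption:
# in practice these would be extracted by the pre-trained network).
eyes_gallery = {
    "subject_a": [0.9, 0.1, 0.0, 0.2],
    "subject_b": [0.1, 0.8, 0.3, 0.0],
}
eyes_probe = [0.8, 0.2, 0.1, 0.1]

print(classify_part(eyes_probe, eyes_gallery))
```

A separate gallery per part (eyes, nose, forehead) would mirror the description of training and testing each part individually.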