2,028 research outputs found
Improving Face Sketch Recognition via Adversarial Sketch-Photo Transformation
Abstract: Face sketch-photo transformation has broad applications in forensics, law enforcement, and digital entertainment, particularly for face recognition systems that are designed for photo-to-photo matching. While there are a number of methods for face photo-to-sketch transformation, studies on sketch-to-photo transformation remain limited. In this paper, we propose a novel conditional CycleGAN for face sketch-to-photo transformation. Specifically, we leverage the advantages of CycleGAN and conditional GANs and design a feature-level loss to ensure the high quality of the face photos generated from sketches. The generated face photos are used as a replacement for face sketches, particularly for face identification against a gallery set of mugshot photos. Experimental results on the public-domain database CUFSF show that the proposed approach is able to generate realistic photos from sketches, and that the generated photos are instrumental in improving sketch identification accuracy against a large gallery set.
feature learning [7]-[10]. The benefit of the former category lies in converting sketches into the same modality as photos, and hence in the ability to utilize existing photo-based face recognition methods; the applicability of existing photo-based face recognition algorithms can thus be greatly expanded. Current methods for face photo-sketch transformation can be grouped mainly into example-based methods and regression-based methods. Example-based methods assume that the corresponding sketches (or sketch patches) of two similar face photos (or photo patches) are also similar. Such methods rely on face photo-sketch pairs in the training set to synthesize images. To achieve good transformation results, they usually require a large number of photo-sketch pairs, and the computational cost may grow linearly with the size of the training set. Regression-based methods overcome these issues: the most time-consuming part exists only in the training stage, when learning the mapping between face photos and sketches, while the inference/testing stage can be fast. In this paper, we propose a Generative Adversarial Network (GAN) for face sketch-to-photo transformation, leveraging the advantages of CycleGAN [11] and conditional GANs [12]. We design a new feature-level loss, which is used jointly with the traditional image-level adversarial loss to ensure the quality of the synthesized photos. The proposed approach outperforms state-of-the-art approaches for synthesizing photos in terms of the structural similarity index (SSIM). More importantly, the photos synthesized by our approach are found to be more instrumental in improving sketch-to-photo matching accuracy. The rest of this paper is organized as follows: Section II summarizes representative methods for face photo-to-sketch transformation, and GANs. Section III provides details of the proposed method and the designed feature-level loss. Experimental results and analysis are presented in Section IV. Finally, we conclude this work in Section V.
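The combined objective described in this abstract (image-level adversarial loss, CycleGAN-style cycle consistency, and an added feature-level term) can be sketched as follows. This is a minimal illustration rather than the paper's implementation; the weighting factors `lam_cyc` and `lam_feat`, the least-squares adversarial form, and the function name are all assumptions:

```python
import numpy as np

def combined_generator_loss(d_fake, real_photo, recon_photo,
                            feat_real, feat_fake,
                            lam_cyc=10.0, lam_feat=1.0):
    # Adversarial term: the generator wants D(fake) -> 1
    # (least-squares GAN form, used here for illustration).
    adv = np.mean((d_fake - 1.0) ** 2)
    # Image-level cycle-consistency (L1), as in CycleGAN.
    cyc = np.mean(np.abs(real_photo - recon_photo))
    # Feature-level loss: L2 distance between deep features of
    # the real photo and the synthesized photo.
    feat = np.mean((feat_real - feat_fake) ** 2)
    return adv + lam_cyc * cyc + lam_feat * feat
```

In a real training loop, `feat_real` and `feat_fake` would come from a fixed feature extractor (e.g. a pretrained face-recognition network) applied to the real and synthesized photos.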
Adversarial sketch-photo transformation for enhanced face recognition accuracy: a systematic analysis and evaluation
This research provides a strategy for enhancing the precision of face sketch identification through adversarial sketch-photo transformation. The approach uses a generative adversarial network (GAN) to learn to convert sketches into photographs, which may subsequently be used to improve the accuracy of face sketch identification. The proposed method is evaluated against state-of-the-art face sketch recognition and synthesis techniques, such as SketchyGAN, similarity-preserving GAN (SPGAN), and super-resolution GAN (SRGAN). Possible domains of use for the proposed adversarial sketch-photo transformation approach include law enforcement, where reliable face sketch recognition is essential for identifying suspects. The approach can also be generalized to other contexts, such as creating artistic photographs from drawings or converting pictures between modalities. The proposed method outperforms state-of-the-art face sketch recognition and synthesis techniques, confirming the usefulness of adversarial learning in this context. Our method is highly efficient for photo-sketch synthesis, with a structural similarity index (SSIM) of 0.65 on The Chinese University of Hong Kong dataset and 0.70 on the custom-generated dataset.
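SSIM, the metric quoted in these abstracts, compares the luminance, contrast, and structure statistics of two images. A simplified global (single-window) version is sketched below; the standard SSIM implementation instead averages the same statistic over local Gaussian windows, so the values will differ from a library implementation:

```python
import numpy as np

def ssim_global(x, y, C1=(0.01 * 255) ** 2, C2=(0.03 * 255) ** 2):
    # Single-window SSIM over whole images, assuming 8-bit dynamic range
    # (hence the 255 in the stabilizing constants C1 and C2).
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```

By construction, identical images score exactly 1, and the score decreases as means, variances, or cross-covariance diverge.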
Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation
Much progress has been made in image-to-image translation by embracing Generative Adversarial Networks (GANs). However, translation tasks that demand high quality remain very challenging, especially at high resolution and with photorealism. In this paper, we present Discriminative Region Proposal Adversarial Networks (DRPAN) for high-quality image-to-image translation. We decompose the image-to-image translation procedure into three iterated steps: first, generate an image with global structure but some local artifacts (via a GAN); second, use our DRPnet to propose the most fake region of the generated image; and third, perform "image inpainting" on the most fake region through a reviser to obtain a more realistic result, so that the system (DRPAN) can be gradually optimized to synthesize images with more attention on the most artifact-ridden local parts. Experiments on a variety of image-to-image translation tasks and datasets validate that our method outperforms the state of the art in producing high-quality translation results, in terms of both human perceptual studies and automatic quantitative measures.
Comment: ECCV 201
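The second DRPAN step above, proposing the most fake region, can be illustrated with a toy sliding-window search over a discriminator's patch score map. The exhaustive scan, the window size, and the "lower score = more fake" convention are assumptions for illustration, not the paper's DRPnet:

```python
import numpy as np

def propose_most_fake_region(score_map, win=2):
    # Slide a win x win window over the discriminator's per-patch
    # realism scores and return the top-left corner of the
    # lowest-scoring (i.e. most fake) window.
    h, w = score_map.shape
    best, best_pos = np.inf, (0, 0)
    for i in range(h - win + 1):
        for j in range(w - win + 1):
            s = score_map[i:i + win, j:j + win].mean()
            if s < best:
                best, best_pos = s, (i, j)
    return best_pos
```

The proposed region would then be handed to the reviser for inpainting, and the loop repeats on the revised image.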
Deep Learning Architectures for Heterogeneous Face Recognition
Face recognition has been one of the most challenging areas of research in biometrics and computer vision. Many face recognition algorithms are designed to address illumination and pose problems for visible face images. In recent years, there has been a significant amount of research in Heterogeneous Face Recognition (HFR). The large modality gap between faces captured in different spectra, as well as the lack of training data, makes HFR quite a challenging problem. In this work, we present different deep learning frameworks to address the problem of matching non-visible face photos against a gallery of visible faces.
Algorithms for thermal-to-visible face recognition can be categorized as cross-spectrum feature-based methods or cross-spectrum image synthesis methods. In cross-spectrum feature-based face recognition, a thermal probe is matched in a feature subspace against a gallery of visible faces, corresponding to the real-world scenario. The second category synthesizes a visible-like image from a thermal image, which can then be used by any commercial visible-spectrum face recognition system. These methods are also beneficial in the sense that the synthesized visible face image can be directly utilized by existing face recognition systems that operate only on visible face imagery. Therefore, using this approach one can leverage existing commercial-off-the-shelf (COTS) and government-off-the-shelf (GOTS) solutions. In addition, the synthesized images can be used by human examiners for different purposes.
There are some informative traits, such as age, gender, ethnicity, race, and hair color, that are not distinctive enough for recognition on their own but can still act as complementary information to primary traits such as face and fingerprint. These traits, known as soft biometrics, can improve recognition algorithms while being much cheaper and faster to acquire, and they can be used directly in a unimodal system for some applications. Usually, soft biometric traits have been utilized jointly with hard biometrics (face photos) for different tasks, in the sense that they are assumed to be available during both the training and testing phases. In our approaches we look at this problem differently: we consider the case where soft biometric information does not exist during the testing phase, and our method predicts it directly in a multi-tasking paradigm.
There are situations in which training data comes equipped with additional information that can be modeled as an auxiliary view of the data but that, unfortunately, is not available during testing. This is the learning using privileged information (LUPI) scenario. We introduce a novel framework based on deep learning techniques that leverages the auxiliary view to improve the performance of the recognition system. We do so by introducing a formulation that is general, in the sense that it can be used with any visual classifier.
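One common way to realize LUPI-style training, not necessarily the formulation used in this work, is generalized distillation: a teacher trained with the privileged view supplies soft targets, and the student mixes these with the hard labels. A minimal sketch, with a hypothetical mixing weight `lam`:

```python
import numpy as np

def lupi_distillation_loss(p_student, y_onehot, p_teacher,
                           lam=0.5, eps=1e-8):
    # The teacher sees the auxiliary (privileged) view at training time;
    # the student, which must work without it at test time, balances
    # the hard-label cross-entropy with imitation of the teacher's
    # soft predictions.
    hard = -np.sum(y_onehot * np.log(p_student + eps))
    soft = -np.sum(p_teacher * np.log(p_student + eps))
    return (1.0 - lam) * hard + lam * soft
```

At test time only the student is used, so the privileged view is never required outside training.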
Every use of auxiliary information has been validated extensively using publicly available benchmark datasets, and several new state-of-the-art accuracy performance values have been set. Examples of application domains include visual object recognition from RGB images and from depth data, handwritten digit recognition, and gesture recognition from video.
We also design a novel aggregation framework which optimizes landmark locations directly, using only a single image and without requiring any extra prior, which leads to robust alignment under arbitrary face deformations. Three different approaches are employed to generate the manipulated faces, two of which perform the manipulation via adversarial attacks to fool a face recognizer. This step can be decoupled from our framework and potentially used to enhance other landmark detectors. Aggregating the manipulated faces in different branches of the proposed method leads to robust landmark detection.
Finally, we focus on generative adversarial networks, which are a very powerful tool for synthesizing visible-like images from non-visible images. The main goal of a generative model is to approximate the true data distribution, which is not known. In general, the choice of how to model the density function is challenging. Explicit models have the advantage of explicitly calculating probability densities. Two well-known deep generative approaches, the Generative Adversarial Network (GAN) and the Variational AutoEncoder (VAE), avoid working with the exact density: VAEs maximize a lower bound on the data likelihood, while a GAN performs a minimax game between two players during its optimization. GANs overlook explicit data density characteristics, which leads to undesirable quantitative evaluations and to mode collapse, causing the generator to create similar-looking images with poor sample diversity. In the last chapter of the thesis, we focus on addressing this issue within the GAN framework.
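The two-player minimax objective mentioned above can be made concrete with the standard non-saturating GAN losses; this is the generic textbook form, not the thesis's specific formulation:

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    # d_real: discriminator outputs on real samples, in (0, 1)
    # d_fake: discriminator outputs on generated samples, in (0, 1)
    # Discriminator maximizes log D(x) + log(1 - D(G(z))).
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    # Non-saturating generator objective: maximize log D(G(z)).
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

At the equilibrium where the discriminator is maximally confused (it outputs 0.5 everywhere), the discriminator loss is 2 log 2 and the generator loss is log 2.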