6 research outputs found

    Spoofing Faces Using Makeup: An Investigative Study

    Get PDF
    International audienceMakeup can be used to alter the facial appearance of a person. Previous studies have established the potential of using makeup to obfuscate the identity of an individual with respect to an automated face matcher. In this work, we analyze the potential of using makeup for spoofing an identity, where an individual attempts to impersonate another per-son's facial appearance. In this regard, we first assemble a set of face images downloaded from the internet where individuals use facial cosmetics to impersonate celebrities. We next determine the impact of this alteration on two different face matchers. Experiments suggest that automated face matchers are vulnerable to makeup-induced spoofing and that the success of spoofing is impacted by the appearance of the impersonator's face and the target face being spoofed. Further, an identification experiment is conducted to show that the spoofed faces are successfully matched at better ranks after the application of makeup. To the best of our knowledge, this is the first work that systematically studies the impact of makeup-induced face spoofing on automated face recognition

    An ensemble of patch-based subspaces for makeup-robust face recognition

    No full text
    International audienceRecent research has demonstrated the negative impact of makeup on automated face recognition. In this work, we introduce a patch-based ensemble learning method, which uses multiple subspaces generated by sampling patches from before-makeup and after-makeup face images, to address this problem. In the proposed scheme, each face image is tessellated into patches and each patch is represented by a set of feature descriptors, viz., Local Gradient Gabor Pattern (LGGP), Histogram of Gabor Ordinal Ratio Measures (HGORM) and Densely Sampled Local Binary Pattern (DS-LBP). Then, an improved Random Subspace Linear Discriminant Analysis (SRS-LDA) method is used to perform ensemble learning by sampling patches and constructing multiple common subspaces between before-makeup and after-makeup facial images. Finally, Collaborative-based and Sparse-based Representation Classifiers are used to compare feature vectors in this subspace and the resulting scores are combined via the sum-rule. The proposed face matching algorithm is evaluated on the YMU makeup dataset and is shown to achieve very good results. It outperforms other methods designed specifically for the makeup problem

    Face recognition in video surveillance from a single reference sample through domain adaptation

    Get PDF
    Face recognition (FR) has received significant attention during the past decades in many applications, such as law enforcement, forensics, access controls, information security and video surveillance (VS), due to its covert and non-intrusive nature. FR systems specialized for VS seek to accurately detect the presence of target individuals of interest over a distributed network of video cameras under uncontrolled capture conditions. Therefore, recognizing faces of target individuals in such environment is a challenging problem because the appearance of faces varies due to changes in pose, scale, illumination, occlusion, blur, etc. The computational complexity is also an important consideration because of the growing number of cameras, and the processing time of state-of-the-art face detection, tracking and matching algorithms. In this thesis, adaptive systems are proposed for accurate still-to-video FR, where a single (or very few) reference still or a mug-shot is available to design a facial model for the target individual. This is a common situation in real-world watch-list screening applications due to the cost and feasibility of capturing reference stills, and managing facial models over time. The limited number of reference stills can adversely affect the robustness of facial models to intra-class variations, and therefore the performance of still-to-video FR systems. Moreover, a specific challenge in still-to-video FR is the shift between the enrollment domain, where high-quality reference faces are captured under controlled conditions from still cameras, and the operational domain, where faces are captured with video cameras under uncontrolled conditions. To overcome the challenges of such single sample per person (SSPP) problems, 3 new systems are proposed for accurate still-to-video FR that are based on multiple face representations and domain adaptation. In particular, this thesis presents 3 contributions. These contributions are described with more details in the following statements. In Chapter 3, a multi-classifier framework is proposed for robust still-to-video FR based on multiple and diverse face representations of a single reference face still. During enrollment of a target individual, the single reference face still is modeled using an ensemble of SVM classifiers based on different patches and face descriptors. Multiple feature extraction techniques are applied to patches isolated in the reference still to generate a diverse SVM pool that provides robustness to common nuisance factors (e.g., variations in illumination and pose). The estimation of discriminant feature subsets, classifier parameters, decision thresholds, and ensemble fusion functions is achieved using the high-quality reference still and a large number of faces captured in lower quality video of non-target individuals in the scene. During operations, the most competent subset of SVMs are dynamically selected according to capture conditions. Finally, a head-face tracker gradually regroups faces captured from different people appearing in a scene, while each individual-specific ensemble performs face matching. The accumulation of matching scores per face track leads to a robust spatio-temporal FR when accumulated ensemble scores surpass a detection threshold. Experimental results obtained with the Chokepoint and COX-S2V datasets show a significant improvement in performance w.r.t. reference systems, especially when individual-specific ensembles (1) are designed using exemplar-SVMs rather than one-class SVMs, and (2) exploit score-level fusion of local SVMs (trained using features extracted from each patch), rather than using either decision-level or feature-level fusion with a global SVM (trained by concatenating features extracted from patches). In Chapter 4, an efficient multi-classifier system (MCS) is proposed for accurate still-to-video FR based on multiple face representations and domain adaptation (DA). An individual-specific ensemble of exemplar-SVM (e-SVM) classifiers is thereby designed to improve robustness to intra-class variations. During enrollment of a target individual, an ensemble is used to model the single reference still, where multiple face descriptors and random feature subspaces allow to generate a diverse pool of patch-wise classifiers. To adapt these ensembles to the operational domains, e-SVMs are trained using labeled face patches extracted from the reference still versus patches extracted from cohort and other non-target stills mixed with unlabeled patches extracted from the corresponding face trajectories captured with surveillance cameras. During operations, the most competent classifiers per given probe face are dynamically selected and weighted based on the internal criteria determined in the feature space of e-SVMs. This chapter also investigates the impact of using different training schemes for DA, as well as, the validation set of non-target faces extracted from stills and video trajectories of unknown individuals in the operational domain. The results indicate that the proposed system can surpass state-of-the-art accuracy, yet with a significantly lower computational complexity. In Chapter 5, a deep convolutional neural network (CNN) is proposed to cope with the discrepancies between facial regions of interest (ROIs) isolated in still and video faces for robust still-to-video FR. To that end, a face-flow autoencoder CNN called FFA-CNN is trained using both still and video ROIs in a supervised end-to-end multi-task learning. A novel loss function containing a weighted combination of pixel-wise, symmetry-wise and identity preserving losses is introduced to optimize the network parameters. The proposed FFA-CNN incorporates a reconstruction network and a fully-connected classification network, where the former reconstructs a well-illuminated frontal ROI with neutral expression from a pair of low-quality non-frontal video ROIs and the latter is utilized to compare the still and video representations to provide matching scores. Thus, integrating the proposed weighted loss function with a supervised end-to-end training approach leads to generate high-quality frontal faces and learn discriminative face representations similar for the same identities. Simulation results obtained over challenging COX Face DB confirm the effectiveness of the proposed FFA-CNN to achieve convincing performance compared to current state-of-the-art CNN-based FR systems
    corecore