Face Recognition: A Novel Multi-Level Taxonomy based Survey
In a world where security issues have been gaining growing importance, face
recognition systems have attracted increasing attention in multiple application
areas, ranging from forensics and surveillance to commerce and entertainment.
To help understand the landscape and abstraction levels relevant for face
recognition systems, face recognition taxonomies allow a deeper dissection and
comparison of the existing solutions. This paper proposes a new, more
encompassing and richer multi-level face recognition taxonomy, facilitating the
organization and categorization of available and emerging face recognition
solutions; this taxonomy may also guide researchers in the development of more
efficient face recognition solutions. The proposed multi-level taxonomy
considers levels related to the face structure, feature support and feature
extraction approach. Following the proposed taxonomy, a comprehensive survey of
representative face recognition solutions is presented. The paper concludes
with a discussion on current algorithmic and application related challenges
which may define future research directions for face recognition.
Comment: This paper is a preprint of a paper submitted to IET Biometrics. If accepted, the copy of record will be available at the IET Digital Library.
High Fidelity Face Manipulation with Extreme Poses and Expressions
Face manipulation has shown remarkable advances with the flourish of
Generative Adversarial Networks. However, due to the difficulties of
controlling structures and textures, it is challenging to model poses and
expressions simultaneously, especially for the extreme manipulation at
high-resolution. In this paper, we propose a novel framework that simplifies
face manipulation into two correlated stages: a boundary prediction stage and a
disentangled face synthesis stage. The first stage models poses and expressions
jointly via boundary images. Specifically, a conditional encoder-decoder
network is employed to predict the boundary image of the target face in a
semi-supervised way. Pose and expression estimators are introduced to improve
the prediction performance. In the second stage, the predicted boundary image
and the input face image are encoded into the structure and the texture latent
space by two encoder networks, respectively. A proxy network and a feature
threshold loss are further imposed to disentangle the latent space.
Furthermore, due to the lack of high-resolution face manipulation databases to
verify the effectiveness of our method, we collect a new high-quality
Multi-View Face (MVF-HQ) database. It contains 120,283 images at 6000x4000
resolution from 479 identities with diverse poses, expressions, and
illuminations. MVF-HQ is much larger in scale and much higher in resolution
than publicly available high-resolution face manipulation databases. We will
release MVF-HQ soon to advance research on face manipulation.
Qualitative and quantitative experiments on four databases show that our method
dramatically improves the synthesis quality.
Comment: Accepted by IEEE Transactions on Information Forensics and Security (TIFS).
A Fast and Accurate Unconstrained Face Detector
We propose a method to address challenges in unconstrained face detection,
such as arbitrary pose variations and occlusions. First, a new image feature
called Normalized Pixel Difference (NPD) is proposed. The NPD feature is
computed as the difference-to-sum ratio of two pixel values, inspired by the
Weber fraction in experimental psychology. The new feature is scale invariant,
bounded, and is able to reconstruct the original image. Second, we propose a
deep quadratic tree to learn the optimal subset of NPD features and their
combinations, so that complex face manifolds can be partitioned by the learned
rules. This way, only a single soft-cascade classifier is needed to handle
unconstrained face detection. Furthermore, we show that the NPD features can be
efficiently obtained from a look up table, and the detection template can be
easily scaled, making the proposed face detector very fast. Experimental
results on three public face datasets (FDDB, GENKI, and CMU-MIT) show that the
proposed method achieves state-of-the-art performance in detecting
unconstrained faces with arbitrary pose variations and occlusions in cluttered
scenes.
Comment: This paper has been accepted by TPAMI. The source code is available
on the project page
http://www.cbsr.ia.ac.cn/users/scliao/projects/npdface/index.htm
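The difference-to-sum ratio at the heart of NPD is straightforward to reproduce. Below is a minimal sketch following the formula as described in the abstract; the function name and the zero-denominator convention are illustrative assumptions, not the authors' released code:

```python
import numpy as np

def npd(x, y):
    """Normalized Pixel Difference between two pixel-value arrays.

    NPD(x, y) = (x - y) / (x + y), taken to be 0 when both pixels are 0.
    The value is bounded in [-1, 1] and is invariant to a common scaling
    of both pixels, which matches the scale invariance claimed above.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    s = x + y
    out = np.zeros_like(s)
    nonzero = s != 0
    out[nonzero] = (x[nonzero] - y[nonzero]) / s[nonzero]
    return out
```

Because the ratio depends only on the relative difference of the two pixels, multiplying both values by the same factor (e.g. a global illumination change) leaves the feature unchanged.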
Face Identification using Local Ternary Tree Pattern based Spatial Structural Components
This paper reports a face identification system based on a novel local
descriptor called Local Ternary Tree Pattern (LTTP). Extracting a distinctive
local descriptor from a face image plays a crucial role in face identification
across a variety of face images, including constrained, unconstrained and
plastic surgery images. LTTP is used to extract robust spatial features that
describe the various structural components of a face. To extract the features,
a ternary tree is formed for each pixel with its eight neighbors in each block.
The LTTP pattern can be generated in four forms: LTTP Left Depth (LTTP LD),
LTTP Left Breadth (LTTP LB), LTTP Right Depth (LTTP RD) and LTTP Right Breadth
(LTTP RB). The encoding schemes of these patterns are simple and efficient in
terms of both computation and time complexity. The proposed face
identification system is tested on six face databases, namely, the UMIST, the
JAFFE, the extended Yale face B, the Plastic Surgery, the LFW and the UFI. The
experimental evaluation demonstrates the most promising results considering a
variety of faces captured under different environments. The proposed LTTP based
system is also compared with some local descriptors under identical conditions.
Comment: 13 pages, 5 figures, conference paper
Robust Face Recognition with Structural Binary Gradient Patterns
This paper presents a computationally efficient yet powerful binary framework
for robust facial representation based on image gradients, termed
structural binary gradient patterns (SBGP). To discover underlying local
structures in the gradient domain, we compute image gradients from multiple
directions and simplify them into a set of binary strings. The SBGP is derived
from certain types of these binary strings that have meaningful local
structures and are capable of resembling fundamental textural information. They
detect micro orientational edges and possess strong orientation and locality
capabilities, thus enabling great discrimination. The SBGP also benefits from
the advantages of the gradient domain and exhibits profound robustness against
illumination variations. The binary strategy realized by pixel correlations in
a small neighborhood substantially simplifies the computational complexity and
achieves extremely efficient processing with only 0.0032s in Matlab for a
typical face image. Furthermore, the discrimination power of the SBGP can be
enhanced on a set of defined orientational image gradient magnitudes, further
enforcing locality and orientation. Results of extensive experiments on various
benchmark databases illustrate significant improvements of the SBGP based
representations over existing state-of-the-art local descriptors in terms of
discrimination, robustness and complexity. Codes for the SBGP methods
will be available at
http://www.eee.manchester.ac.uk/research/groups/sisp/software/
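As a rough illustration of turning multi-direction gradients into per-pixel binary strings, here is a minimal sketch; the 4-bit code, the choice of directions, and the function name are illustrative assumptions, not the paper's actual SBGP construction:

```python
import numpy as np

def binary_gradient_code(img):
    """Per-pixel binary code from the signs of directional gradients.

    For each of four principal directions, the gradient is the difference
    with the neighbouring pixel; its sign contributes one bit. The resulting
    4-bit code is a crude stand-in for the binary strings SBGP derives from
    multi-direction gradients. Border pixels wrap around (np.roll).
    """
    img = np.asarray(img, dtype=np.float64)
    # Offsets (dy, dx) for right, down, down-right, and down-left neighbours.
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]
    code = np.zeros(img.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neighbour = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
        grad = neighbour - img
        code |= (grad > 0).astype(np.uint8) << bit
    return code
```

Because only the signs of pixel differences are kept, the code is unchanged under any monotonic brightness change, which is the kind of illumination robustness the abstract attributes to gradient-domain binary strings.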
HEp-2 Cell Classification via Fusing Texture and Shape Information
Indirect Immunofluorescence (IIF) HEp-2 cell images provide effective evidence
for the diagnosis of autoimmune diseases. Recently, computer-aided diagnosis of
autoimmune diseases by IIF HEp-2 cell classification has attracted great
attention. However, the HEp-2 cell classification task is quite challenging due
to large intra-class variation and small between-class variation. In this paper
we propose an effective and efficient approach for the automatic classification
of IIF HEp-2 cell image by fusing multi-resolution texture information and
richer shape information. To be specific, we propose to: a) capture the
multi-resolution texture information by a novel Pairwise Rotation Invariant
Co-occurrence of Local Gabor Binary Pattern (PRICoLGBP) descriptor, b) depict
the richer shape information by using an Improved Fisher Vector (IFV) model
with RootSIFT features which are sampled from large image patches in multiple
scales, and c) combine them properly. We evaluate systematically the proposed
approach on the IEEE International Conference on Pattern Recognition (ICPR)
2012, IEEE International Conference on Image Processing (ICIP) 2013 and ICPR
2014 contest data sets. The proposed method significantly outperforms the
winners of the ICPR 2012 and ICIP 2013 contests, and achieves performance
comparable to the winner of the newly released ICPR 2014 contest.
Comment: 11 pages, 7 figures
Interest Point Detection based on Adaptive Ternary Coding
In this paper, an adaptive pixel ternary coding mechanism is proposed and a
contrast invariant and noise resistant interest point detector is developed on
the basis of this mechanism. Every pixel in a local region is adaptively
encoded into one of the three statuses: bright, uncertain and dark. The blob
significance of the local region is measured by the spatial distribution of the
bright and dark pixels. Interest points are extracted from this blob
significance measurement. By labeling pixels as bright, uncertain, or dark,
the proposed detector is more robust to image noise and quantization errors.
Moreover, the adaptive strategy for the ternary coding, which relies on two
thresholds that automatically converge to the median of the local region,
makes the coding insensitive to local image contrast. As a result, the
proposed detector is invariant to illumination changes. State-of-the-art
results are achieved on standard datasets and in a face recognition
application.
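The bright/uncertain/dark encoding can be illustrated with a small sketch. Note the fixed `margin` band around the median below is an assumed simplification of the paper's adaptive two-threshold scheme:

```python
import numpy as np

def ternary_code(patch, margin=0.1):
    """Ternary coding of a local patch (illustrative sketch).

    Pixels clearly above the patch median are 'bright' (+1), pixels clearly
    below are 'dark' (-1), and pixels within +/- margin * (patch range) of
    the median are 'uncertain' (0). Anchoring the band at the median makes
    the labels depend on pixel rank rather than absolute intensity.
    """
    patch = np.asarray(patch, dtype=np.float64)
    med = np.median(patch)
    band = margin * (patch.max() - patch.min())
    code = np.zeros(patch.shape, dtype=np.int8)
    code[patch > med + band] = 1
    code[patch < med - band] = -1
    return code
```

Pixels near the median fall into the 'uncertain' class, so small noise or quantization jitter around the threshold flips a label to 0 rather than between bright and dark, which is the robustness the abstract describes.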
Class Rectification Hard Mining for Imbalanced Deep Learning
Recognising detailed facial or clothing attributes in images of people is a
challenging task for computer vision, especially when the training data are
both in very large scale and extremely imbalanced among different attribute
classes. To address this problem, we formulate a novel scheme for batch
incremental hard sample mining of minority attribute classes from imbalanced
large scale training data. We develop an end-to-end deep learning framework
capable of avoiding the dominant effect of majority classes by discovering
sparsely sampled boundaries of minority classes. This is made possible by
introducing a Class Rectification Loss (CRL) regularising algorithm. We
demonstrate the advantages and scalability of CRL over existing
state-of-the-art attribute recognition and imbalanced data learning models on
two large scale imbalanced benchmark datasets, the CelebA facial attribute
dataset and the X-Domain clothing attribute dataset.
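Batch-wise hard mining of a minority class can be sketched as follows; the function, its interface, and the choice of k are illustrative, not the paper's CRL formulation:

```python
import numpy as np

def minority_hard_samples(scores, labels, minority_class, k=2):
    """Pick hard samples for one minority class within a batch (sketch).

    Given each sample's predicted score for `minority_class`, return the k
    hardest positives (minority samples scored lowest) and the k hardest
    negatives (other samples scored highest). A rectification loss can then
    be built from pairs or triplets of such boundary samples, which is the
    general idea behind mining sparsely sampled minority-class boundaries.
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels)
    pos = np.where(labels == minority_class)[0]
    neg = np.where(labels != minority_class)[0]
    hard_pos = pos[np.argsort(scores[pos])[:k]]        # lowest-scoring positives
    hard_neg = neg[np.argsort(scores[neg])[::-1][:k]]  # highest-scoring negatives
    return hard_pos, hard_neg
```

Restricting the extra loss term to these few boundary samples is what keeps the scheme scalable: the majority classes still dominate the batch, but not the gradient of the rectification term.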
Feature Fusion using Extended Jaccard Graph and Stochastic Gradient Descent for Robot
Robot vision is fundamental to human-robot interaction and complex robot
tasks. In this paper, we use Kinect and propose a feature graph fusion (FGF)
method for robot recognition. Our feature fusion utilizes RGB and depth
information from Kinect to construct fused features. FGF uses multi-Jaccard
similarity to compute a robust graph and a word embedding method to enhance
the recognition results. We also collect a DUT RGB-D face dataset and a
benchmark dataset to evaluate the effectiveness and efficiency of our method.
The experimental results show that FGF is robust and effective on face and
object datasets in robot applications.
Comment: Assembly Automation
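The extended (Tanimoto) Jaccard similarity is the standard generalization of set-based Jaccard to real-valued feature vectors, and is the kind of similarity such a graph could be built on; whether it matches the paper's exact multi-Jaccard construction is an assumption:

```python
import numpy as np

def extended_jaccard(x, y):
    """Extended (Tanimoto) Jaccard similarity for real-valued vectors:

        J(x, y) = <x, y> / (||x||^2 + ||y||^2 - <x, y>)

    For binary indicator vectors this reduces to the usual set Jaccard
    (intersection over union). Returns 1.0 for two zero vectors.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    dot = float(np.dot(x, y))
    denom = float(np.dot(x, x)) + float(np.dot(y, y)) - dot
    return dot / denom if denom else 1.0
```

A similarity graph then follows by evaluating this on every pair of fused RGB-D feature vectors and keeping, say, each node's strongest edges.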
Adversarial Discriminative Heterogeneous Face Recognition
The gap between sensing patterns of different face modalities remains a
challenging problem in heterogeneous face recognition (HFR). This paper
proposes an adversarial discriminative feature learning framework to close the
sensing gap via adversarial learning on both raw-pixel space and compact
feature space. This framework integrates cross-spectral face hallucination and
discriminative feature learning into an end-to-end adversarial network. In the
pixel space, we make use of generative adversarial networks to perform
cross-spectral face hallucination. An elaborate two-path model is introduced to
alleviate the lack of paired images, which gives consideration to both global
structures and local textures. In the feature space, an adversarial loss and a
high-order variance discrepancy loss are employed to measure the global and
local discrepancy between two heterogeneous distributions respectively. These
two losses enhance domain-invariant feature learning and modality-independent
noise removal. Experimental results on three NIR-VIS databases show that our
proposed approach outperforms state-of-the-art HFR methods without requiring a
complex network or a large-scale training dataset.