Face Recognition: A Novel Multi-Level Taxonomy based Survey
In a world where security issues have been gaining growing importance, face
recognition systems have attracted increasing attention in multiple application
areas, ranging from forensics and surveillance to commerce and entertainment.
To help understand the landscape and abstraction levels relevant for face
recognition systems, face recognition taxonomies allow a deeper dissection and
comparison of the existing solutions. This paper proposes a new, more
encompassing and richer multi-level face recognition taxonomy, facilitating the
organization and categorization of available and emerging face recognition
solutions; this taxonomy may also guide researchers in the development of more
efficient face recognition solutions. The proposed multi-level taxonomy
considers levels related to the face structure, feature support and feature
extraction approach. Following the proposed taxonomy, a comprehensive survey of
representative face recognition solutions is presented. The paper concludes
with a discussion on current algorithmic and application related challenges
which may define future research directions for face recognition.
Comment: This paper is a preprint of a paper submitted to IET Biometrics. If accepted, the copy of record will be available at the IET Digital Library.
High Fidelity Face Manipulation with Extreme Poses and Expressions
Face manipulation has shown remarkable advances with the flourish of
Generative Adversarial Networks. However, due to the difficulties of
controlling structures and textures, it is challenging to model poses and
expressions simultaneously, especially for the extreme manipulation at
high-resolution. In this paper, we propose a novel framework that simplifies
face manipulation into two correlated stages: a boundary prediction stage and a
disentangled face synthesis stage. The first stage models poses and expressions
jointly via boundary images. Specifically, a conditional encoder-decoder
network is employed to predict the boundary image of the target face in a
semi-supervised way. Pose and expression estimators are introduced to improve
the prediction performance. In the second stage, the predicted boundary image
and the input face image are encoded into the structure and the texture latent
space by two encoder networks, respectively. A proxy network and a feature
threshold loss are further imposed to disentangle the latent space.
Furthermore, due to the lack of high-resolution face manipulation databases to
verify the effectiveness of our method, we collect a new high-quality
Multi-View Face (MVF-HQ) database. It contains 120,283 images at 6000x4000
resolution from 479 identities with diverse poses, expressions, and
illuminations. MVF-HQ is much larger in scale and much higher in resolution
than publicly available high-resolution face manipulation databases. We will
release MVF-HQ soon to advance research on face manipulation.
Qualitative and quantitative experiments on four databases show that our method
dramatically improves the synthesis quality.
Comment: Accepted by IEEE Transactions on Information Forensics and Security (TIFS).
A Fast and Accurate Unconstrained Face Detector
We propose a method to address challenges in unconstrained face detection,
such as arbitrary pose variations and occlusions. First, a new image feature
called Normalized Pixel Difference (NPD) is proposed. The NPD feature is
computed as the difference-to-sum ratio of two pixel values, inspired by the
Weber fraction in experimental psychology. The new feature is scale invariant,
bounded, and is able to reconstruct the original image. Second, we propose a
deep quadratic tree to learn the optimal subset of NPD features and their
combinations, so that complex face manifolds can be partitioned by the learned
rules. This way, only a single soft-cascade classifier is needed to handle
unconstrained face detection. Furthermore, we show that the NPD features can be
efficiently obtained from a look up table, and the detection template can be
easily scaled, making the proposed face detector very fast. Experimental
results on three public face datasets (FDDB, GENKI, and CMU-MIT) show that the
proposed method achieves state-of-the-art performance in detecting
unconstrained faces with arbitrary pose variations and occlusions in cluttered
scenes.
Comment: This paper has been accepted by TPAMI. The source code is available
on the project page
http://www.cbsr.ia.ac.cn/users/scliao/projects/npdface/index.htm
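The difference-to-sum ratio at the heart of NPD is straightforward to reproduce. Below is a minimal sketch following the formula as described in the abstract; the function name and the zero-denominator convention are illustrative assumptions, not the authors' released code:

```python
import numpy as np

def npd(x, y):
    """Normalized Pixel Difference between two pixel-value arrays.

    NPD(x, y) = (x - y) / (x + y), taken to be 0 when both pixels are 0.
    The value is bounded in [-1, 1] and is invariant to a common scaling
    of both pixels, which matches the scale invariance claimed above.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    s = x + y
    out = np.zeros_like(s)
    nonzero = s != 0
    out[nonzero] = (x[nonzero] - y[nonzero]) / s[nonzero]
    return out
```

Because the ratio depends only on the relative difference of the two pixels, multiplying both values by the same factor (e.g. a global illumination change) leaves the feature unchanged.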
Face Identification using Local Ternary Tree Pattern based Spatial Structural Components
This paper reports a face identification system based on a novel local
descriptor called Local Ternary Tree Pattern (LTTP). Extracting a distinctive
local descriptor from a face image plays a crucial role in face identification
across a variety of face images, including constrained, unconstrained and
plastic surgery images. LTTP is used to extract robust spatial features that
describe the various structural components of a face. To extract the features,
a ternary tree is formed for each pixel with its eight neighbors in each block.
The LTTP pattern can be generated in four forms: LTTP Left Depth (LTTP LD),
LTTP Left Breadth (LTTP LB), LTTP Right Depth (LTTP RD) and LTTP Right Breadth
(LTTP RB). The encoding schemes of these patterns are simple and efficient in
terms of both computation and time complexity. The proposed face
identification system is tested on six face databases, namely, the UMIST, the
JAFFE, the extended Yale face B, the Plastic Surgery, the LFW and the UFI. The
experimental evaluation demonstrates the most promising results considering a
variety of faces captured under different environments. The proposed LTTP based
system is also compared with some local descriptors under identical conditions.
Comment: 13 pages, 5 figures, conference paper
Robust Face Recognition with Structural Binary Gradient Patterns
This paper presents a computationally efficient yet powerful binary framework
for robust facial representation based on image gradients, termed
structural binary gradient patterns (SBGP). To discover underlying local
structures in the gradient domain, we compute image gradients from multiple
directions and simplify them into a set of binary strings. The SBGP is derived
from certain types of these binary strings that have meaningful local
structures and are capable of resembling fundamental textural information. They
detect micro orientational edges and possess strong orientation and locality
capabilities, thus enabling great discrimination. The SBGP also benefits from
the advantages of the gradient domain and exhibits profound robustness against
illumination variations. The binary strategy realized by pixel correlations in
a small neighborhood substantially simplifies the computational complexity and
achieves extremely efficient processing with only 0.0032s in Matlab for a
typical face image. Furthermore, the discrimination power of the SBGP can be
enhanced on a set of defined orientational image gradient magnitudes, further
enforcing locality and orientation. Results of extensive experiments on various
benchmark databases illustrate significant improvements of the SBGP based
representations over existing state-of-the-art local descriptors in terms of
discrimination, robustness and complexity. Codes for the SBGP methods
will be available at
http://www.eee.manchester.ac.uk/research/groups/sisp/software/
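As a rough illustration of turning multi-direction gradients into per-pixel binary strings, here is a minimal sketch; the 4-bit code, the choice of directions, and the function name are illustrative assumptions, not the paper's actual SBGP construction:

```python
import numpy as np

def binary_gradient_code(img):
    """Per-pixel binary code from the signs of directional gradients.

    For each of four principal directions, the gradient is the difference
    with the neighbouring pixel; its sign contributes one bit. The resulting
    4-bit code is a crude stand-in for the binary strings SBGP derives from
    multi-direction gradients. Border pixels wrap around (np.roll).
    """
    img = np.asarray(img, dtype=np.float64)
    # Offsets (dy, dx) for right, down, down-right, and down-left neighbours.
    shifts = [(0, 1), (1, 0), (1, 1), (1, -1)]
    code = np.zeros(img.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neighbour = np.roll(np.roll(img, -dy, axis=0), -dx, axis=1)
        grad = neighbour - img
        code |= (grad > 0).astype(np.uint8) << bit
    return code
```

Because only the signs of pixel differences are kept, the code is unchanged under any monotonic brightness change, which is the kind of illumination robustness the abstract attributes to gradient-domain binary strings.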
HEp-2 Cell Classification via Fusing Texture and Shape Information
Indirect Immunofluorescence (IIF) HEp-2 cell images provide effective evidence
for the diagnosis of autoimmune diseases. Recently, computer-aided diagnosis of
autoimmune diseases by IIF HEp-2 cell classification has attracted great
attention. However, the HEp-2 cell classification task is quite challenging due
to large intra-class variation and small between-class variation. In this paper
we propose an effective and efficient approach for the automatic classification
of IIF HEp-2 cell image by fusing multi-resolution texture information and
richer shape information. To be specific, we propose to: a) capture the
multi-resolution texture information by a novel Pairwise Rotation Invariant
Co-occurrence of Local Gabor Binary Pattern (PRICoLGBP) descriptor, b) depict
the richer shape information by using an Improved Fisher Vector (IFV) model
with RootSIFT features which are sampled from large image patches in multiple
scales, and c) combine them properly. We evaluate systematically the proposed
approach on the IEEE International Conference on Pattern Recognition (ICPR)
2012, IEEE International Conference on Image Processing (ICIP) 2013 and ICPR
2014 contest data sets. The proposed method significantly outperforms the
winners of the ICPR 2012 and ICIP 2013 contests, and achieves performance
comparable to the winner of the newly released ICPR 2014 contest.
Comment: 11 pages, 7 figures
Interest Point Detection based on Adaptive Ternary Coding
In this paper, an adaptive pixel ternary coding mechanism is proposed and a
contrast invariant and noise resistant interest point detector is developed on
the basis of this mechanism. Every pixel in a local region is adaptively
encoded into one of the three statuses: bright, uncertain and dark. The blob
significance of the local region is measured by the spatial distribution of the
bright and dark pixels. Interest points are extracted from this blob
significance measurement. By labeling pixels as bright, uncertain, or dark,
the proposed detector is more robust to image noise and quantization errors.
Moreover, the adaptive strategy for the ternary coding, which relies on two
thresholds that automatically converge to the median of the local region,
makes the coding insensitive to local image contrast. As a result, the
proposed detector is invariant to illumination changes. State-of-the-art
results are achieved on standard datasets and in a face recognition
application.
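The bright/uncertain/dark encoding can be illustrated with a small sketch. Note the fixed `margin` band around the median below is an assumed simplification of the paper's adaptive two-threshold scheme:

```python
import numpy as np

def ternary_code(patch, margin=0.1):
    """Ternary coding of a local patch (illustrative sketch).

    Pixels clearly above the patch median are 'bright' (+1), pixels clearly
    below are 'dark' (-1), and pixels within +/- margin * (patch range) of
    the median are 'uncertain' (0). Anchoring the band at the median makes
    the labels depend on pixel rank rather than absolute intensity.
    """
    patch = np.asarray(patch, dtype=np.float64)
    med = np.median(patch)
    band = margin * (patch.max() - patch.min())
    code = np.zeros(patch.shape, dtype=np.int8)
    code[patch > med + band] = 1
    code[patch < med - band] = -1
    return code
```

Pixels near the median fall into the 'uncertain' class, so small noise or quantization jitter around the threshold flips a label to 0 rather than between bright and dark, which is the robustness the abstract describes.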
Class Rectification Hard Mining for Imbalanced Deep Learning
Recognising detailed facial or clothing attributes in images of people is a
challenging task for computer vision, especially when the training data are
both in very large scale and extremely imbalanced among different attribute
classes. To address this problem, we formulate a novel scheme for batch
incremental hard sample mining of minority attribute classes from imbalanced
large scale training data. We develop an end-to-end deep learning framework
capable of avoiding the dominant effect of majority classes by discovering
sparsely sampled boundaries of minority classes. This is made possible by
introducing a Class Rectification Loss (CRL) regularising algorithm. We
demonstrate the advantages and scalability of CRL over existing
state-of-the-art attribute recognition and imbalanced data learning models on
two large scale imbalanced benchmark datasets, the CelebA facial attribute
dataset and the X-Domain clothing attribute dataset.
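Batch-wise hard mining of a minority class can be sketched as follows; the function, its interface, and the choice of k are illustrative, not the paper's CRL formulation:

```python
import numpy as np

def minority_hard_samples(scores, labels, minority_class, k=2):
    """Pick hard samples for one minority class within a batch (sketch).

    Given each sample's predicted score for `minority_class`, return the k
    hardest positives (minority samples scored lowest) and the k hardest
    negatives (other samples scored highest). A rectification loss can then
    be built from pairs or triplets of such boundary samples, which is the
    general idea behind mining sparsely sampled minority-class boundaries.
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels)
    pos = np.where(labels == minority_class)[0]
    neg = np.where(labels != minority_class)[0]
    hard_pos = pos[np.argsort(scores[pos])[:k]]        # lowest-scoring positives
    hard_neg = neg[np.argsort(scores[neg])[::-1][:k]]  # highest-scoring negatives
    return hard_pos, hard_neg
```

Restricting the extra loss term to these few boundary samples is what keeps the scheme scalable: the majority classes still dominate the batch, but not the gradient of the rectification term.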
Feature Fusion using Extended Jaccard Graph and Stochastic Gradient Descent for Robot
Robot vision is fundamental to human-robot interaction and complex robot
tasks. In this paper, we use Kinect and propose a feature graph fusion (FGF)
method for robot recognition. Our feature fusion utilizes RGB and depth
information from Kinect to construct fused features. FGF uses multi-Jaccard
similarity to compute a robust graph and a word embedding method to enhance
the recognition results. We also collect a DUT RGB-D face dataset and a
benchmark dataset to evaluate the effectiveness and efficiency of our method.
The experimental results show that FGF is robust and effective on face and
object datasets in robot applications.
Comment: Assembly Automation
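The extended (Tanimoto) Jaccard similarity is the standard generalization of set-based Jaccard to real-valued feature vectors, and is the kind of similarity such a graph could be built on; whether it matches the paper's exact multi-Jaccard construction is an assumption:

```python
import numpy as np

def extended_jaccard(x, y):
    """Extended (Tanimoto) Jaccard similarity for real-valued vectors:

        J(x, y) = <x, y> / (||x||^2 + ||y||^2 - <x, y>)

    For binary indicator vectors this reduces to the usual set Jaccard
    (intersection over union). Returns 1.0 for two zero vectors.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    dot = float(np.dot(x, y))
    denom = float(np.dot(x, x)) + float(np.dot(y, y)) - dot
    return dot / denom if denom else 1.0
```

A similarity graph then follows by evaluating this on every pair of fused RGB-D feature vectors and keeping, say, each node's strongest edges.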
Adversarial Discriminative Heterogeneous Face Recognition
The gap between sensing patterns of different face modalities remains a
challenging problem in heterogeneous face recognition (HFR). This paper
proposes an adversarial discriminative feature learning framework to close the
sensing gap via adversarial learning on both raw-pixel space and compact
feature space. This framework integrates cross-spectral face hallucination and
discriminative feature learning into an end-to-end adversarial network. In the
pixel space, we make use of generative adversarial networks to perform
cross-spectral face hallucination. An elaborate two-path model is introduced to
alleviate the lack of paired images, which gives consideration to both global
structures and local textures. In the feature space, an adversarial loss and a
high-order variance discrepancy loss are employed to measure the global and
local discrepancy between two heterogeneous distributions respectively. These
two losses enhance domain-invariant feature learning and modality-independent
noise removal. Experimental results on three NIR-VIS databases show that our
proposed approach outperforms state-of-the-art HFR methods without requiring a
complex network or a large-scale training dataset.