From BoW to CNN: Two Decades of Texture Representation for Texture Classification
Texture is a fundamental characteristic of many types of images, and texture
representation is one of the essential and challenging problems in computer
vision and pattern recognition which has attracted extensive research
attention. Since 2000, texture representations based on Bag of Words (BoW) and
on Convolutional Neural Networks (CNNs) have been extensively studied with
impressive performance. Given this period of remarkable evolution, this paper
aims to present a comprehensive survey of advances in texture representation
over the last two decades. More than 200 major publications are cited in this
survey covering different aspects of the research, which includes (i) problem
description; (ii) recent advances in the broad categories of BoW-based,
CNN-based and attribute-based methods; and (iii) evaluation issues,
specifically benchmark datasets and state of the art results. In retrospect of
what has been achieved so far, the survey discusses open challenges and
directions for future research.
Comment: Accepted by IJC
Face Identification with Second-Order Pooling
Automatic face recognition has seen significant performance improvements from
the development of specialised facial image representations. On the other hand,
generic object recognition has rarely been applied to face recognition.
Spatial pyramid pooling of features encoded by an over-complete dictionary has
been the key component of many state-of-the-art image classification systems.
Inspired by its success, in this work we develop a new face image
representation method based on the second-order pooling of Carreira et al.
[1], which was originally proposed for image segmentation. The proposed method
differs from the previous methods in that, we encode the densely extracted
local patches by a small-size dictionary; and the facial image signatures are
obtained by pooling the second-order statistics of the encoded features. We
show the importance of pooling on encoded features, which is bypassed by the
original second-order pooling method to avoid the high computational cost.
Equipped with a simple linear classifier, the proposed method outperforms the
state-of-the-art face identification performance by large margins. For example,
on the LFW database, the proposed method performs better than the previous
best by around 13% in accuracy.
Comment: 9 page
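The pooled second-order statistics at the core of this representation can be sketched in a few lines of numpy (an illustrative sketch only, not the authors' implementation; the dictionary encoding step is omitted and the dimensions are arbitrary):

```python
import numpy as np

def second_order_pool(features):
    """Average second-order (outer-product) statistics of local features.

    features: (n, d) array of n encoded local descriptors.
    Returns a d*(d+1)/2 vector (upper triangle of the pooled matrix).
    """
    n, d = features.shape
    # Pool the outer products x x^T over all local descriptors.
    pooled = features.T @ features / n          # (d, d), symmetric
    iu = np.triu_indices(d)                     # keep upper triangle only
    return pooled[iu]

# Toy usage: 100 local descriptors of dimension 8.
rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 8))
sig = second_order_pool(feats)
print(sig.shape)  # (36,) = 8*9/2
```

In practice such signatures are computed per spatial region and concatenated, so that pooling happens on encoded features rather than raw descriptors, which is the distinction the abstract emphasizes.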
Face Image Classification by Pooling Raw Features
We propose a very simple, efficient yet surprisingly effective feature
extraction method for face recognition (about 20 lines of Matlab code), which
is mainly inspired by spatial pyramid pooling in generic image classification.
We show that features formed by simply pooling local patches over a multi-level
pyramid, coupled with a linear classifier, can significantly outperform most
recent face recognition methods. The simplicity of our feature extraction
procedure is demonstrated by the fact that no learning is involved (except PCA
whitening). We show that multi-level spatial pooling and dense extraction of
multi-scale patches play critical roles in face image classification. The
extracted facial features can capture strong structural information of
individual faces with no label information being used. We also find that
pre-processing of local image patches, such as contrast normalization, can have
an important impact on the classification accuracy. In particular, on the
challenging face recognition datasets of FERET and LFW-a, our method improves
previous best results by more than 10% and 20%, respectively.
Comment: 12 page
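The multi-level spatial pooling the abstract refers to can be sketched as follows (a minimal numpy sketch under assumed pyramid levels; the patch extraction, PCA whitening, and contrast normalization stages are omitted):

```python
import numpy as np

def spatial_pyramid_pool(resp, levels=(1, 2, 4)):
    """Max-pool a (H, W, d) map of local feature responses over a spatial pyramid.

    For each level L the map is split into an L x L grid; each cell is
    max-pooled and all cell vectors are concatenated.
    """
    H, W, d = resp.shape
    parts = []
    for L in levels:
        ys = np.linspace(0, H, L + 1, dtype=int)
        xs = np.linspace(0, W, L + 1, dtype=int)
        for i in range(L):
            for j in range(L):
                cell = resp[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                parts.append(cell.reshape(-1, d).max(axis=0))
    return np.concatenate(parts)

# Toy usage: a 16x16 grid of 32-dimensional local responses.
rng = np.random.default_rng(0)
resp = rng.standard_normal((16, 16, 32))
vec = spatial_pyramid_pool(resp)
print(vec.shape)  # (672,) = (1 + 4 + 16) cells * 32 dims
```

The coarsest level captures global structure while the finer grids preserve the spatial layout of the face, which is why the abstract credits multi-level pooling with much of the performance.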
Generic Image Classification Approaches Excel on Face Recognition
The main finding of this work is that the standard image classification
pipeline, which consists of dictionary learning, feature encoding, spatial
pyramid pooling and linear classification, outperforms all state-of-the-art
face recognition methods on the tested benchmark datasets (we have tested on
AR, Extended Yale B, the challenging FERET, and LFW-a datasets). This
surprising and prominent result suggests that those advances in generic image
classification can be directly applied to improve face recognition systems. In
other words, face recognition may not need to be viewed as a separate object
classification problem.
While a large body of recent residual-based face recognition methods focuses
on developing complex dictionary learning algorithms, in this work we show that
a dictionary of randomly extracted patches (even from non-face images) can
achieve very promising results using the image classification pipeline. This
means the choice of dictionary learning method may not be important. Instead,
we find that learning multiple dictionaries using different low-level image
features often improves the final classification accuracy. Our proposed face
recognition approach offers the best reported results on the widely-used face
recognition benchmark datasets. In particular, on the challenging FERET and
LFW-a datasets, we improve the best reported accuracies in the literature by
about 20% and 30%, respectively.
Comment: 10 page
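A dictionary of randomly extracted patches, as described above, can be sketched like this (illustrative only; the soft-threshold encoder, atom count, and patch size are assumptions, not the paper's exact pipeline):

```python
import numpy as np

def random_patch_dictionary(images, n_atoms=64, patch=6, seed=0):
    """Build a dictionary from randomly cropped, normalized image patches."""
    rng = np.random.default_rng(seed)
    atoms = []
    for _ in range(n_atoms):
        img = images[rng.integers(len(images))]
        y = rng.integers(img.shape[0] - patch + 1)
        x = rng.integers(img.shape[1] - patch + 1)
        p = img[y:y + patch, x:x + patch].ravel().astype(float)
        p -= p.mean()
        p /= np.linalg.norm(p) + 1e-8            # unit-norm atom
        atoms.append(p)
    return np.stack(atoms)                        # (n_atoms, patch*patch)

def encode(patches, dictionary, alpha=0.25):
    """One common encoder choice: soft thresholding, f = max(0, D x - alpha)."""
    return np.maximum(0.0, patches @ dictionary.T - alpha)

# Toy usage: a dictionary from 5 random "images", then encode 10 patches.
rng = np.random.default_rng(1)
imgs = [rng.random((32, 32)) for _ in range(5)]
D = random_patch_dictionary(imgs)
codes = encode(rng.standard_normal((10, 36)), D)
print(codes.shape)  # (10, 64)
```

The point mirrored here is that the atoms need no learning at all: random crops, even from non-face images, already give the encoder a usable basis.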
HEp-2 Cell Classification via Fusing Texture and Shape Information
Indirect Immunofluorescence (IIF) HEp-2 cell image is an effective evidence
for diagnosis of autoimmune diseases. Recently computer-aided diagnosis of
autoimmune diseases by IIF HEp-2 cell classification has attracted great
attention. However, the HEp-2 cell classification task is quite challenging due
to large intra-class variation and small between-class variation. In this paper
we propose an effective and efficient approach for the automatic classification
of IIF HEp-2 cell image by fusing multi-resolution texture information and
richer shape information. To be specific, we propose to: a) capture the
multi-resolution texture information by a novel Pairwise Rotation Invariant
Co-occurrence of Local Gabor Binary Pattern (PRICoLGBP) descriptor, b) depict
the richer shape information by using an Improved Fisher Vector (IFV) model
with RootSIFT features which are sampled from large image patches in multiple
scales, and c) combine them properly. We systematically evaluate the proposed
approach on the IEEE International Conference on Pattern Recognition (ICPR)
2012, IEEE International Conference on Image Processing (ICIP) 2013 and ICPR
2014 contest data sets. The experimental results show that the proposed method
significantly outperforms the winners of the ICPR 2012 and ICIP 2013 contests,
and achieves performance comparable to the winner of the newly released ICPR
2014 contest.
Comment: 11 pages, 7 figure
Automatic Facial Expression Recognition Using Features of Salient Facial Patches
Extraction of discriminative features from salient facial patches plays a
vital role in effective facial expression recognition. The accurate detection
of facial landmarks improves the localization of the salient patches on face
images. This paper proposes a novel framework for expression recognition by
using appearance features of selected facial patches. A few prominent facial
patches that are active during emotion elicitation are extracted, guided by
the positions of facial landmarks. These active patches are further processed
to obtain the salient patches which contain discriminative features for
classification of each pair of expressions, thereby selecting different facial
patches as salient for different pairs of expression classes. One-against-one
classification method is adopted using these features. In addition, an
automated learning-free facial landmark detection technique has been proposed,
which achieves performance similar to that of other state-of-the-art landmark
detection methods, yet requires significantly less execution time. The proposed
method is found to perform consistently well across different resolutions,
hence providing a solution for expression recognition in low-resolution images.
Experiments on the CK+ and JAFFE facial expression databases show the
effectiveness of the proposed system.
Large-scale Supervised Hierarchical Feature Learning for Face Recognition
This paper proposes a novel face recognition algorithm based on large-scale
supervised hierarchical feature learning. The approach consists of two parts:
hierarchical feature learning and large-scale model learning. The hierarchical
feature learning searches feature in three levels of granularity in a
supervised way. First, face images are modeled by receptive field theory, and
the representation is an image with many channels of Gaussian receptive maps.
We activate a few of the most discriminative channels by supervised learning. Second, the
face image is further represented by patches of picked channels, and we search
from the over-complete patch pool to activate only those most discriminant
patches. Third, the feature descriptor of each patch is further projected to a
lower-dimensional subspace with discriminant subspace analysis.
Learned features of the activated patches are concatenated to form a full face
representation. A linear classifier is learned to separate face pairs of the
same subject from those of different subjects. As the number of face pairs is
extremely large, we introduce ADMM (the alternating direction method of
multipliers) to train the linear classifier on a computing cluster.
Experiments show that more training samples bring notable accuracy improvements.
We conduct experiments on FRGC and LFW. Results show that the proposed
approach outperforms existing algorithms under the same protocol notably.
Besides, the proposed approach is small in memory footprint, and low in
computing cost, which makes it suitable for embedded applications.
Comment: 8 pages; 3 figure
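The distributed training step can be illustrated with a minimal consensus-ADMM solver for a least-squares objective split across data chunks (a generic textbook formulation, not the authors' exact objective or cluster setup):

```python
import numpy as np

def admm_consensus_lsq(A_parts, b_parts, rho=1.0, iters=200):
    """Consensus ADMM for least squares split across data chunks.

    Each "worker" i solves a local subproblem on (A_i, b_i); a global
    averaging step forces all local solutions to agree on one model z.
    """
    d = A_parts[0].shape[1]
    N = len(A_parts)
    xs = [np.zeros(d) for _ in range(N)]
    us = [np.zeros(d) for _ in range(N)]
    z = np.zeros(d)
    # Pre-factor each local normal matrix once; it is reused every iteration.
    facts = [np.linalg.inv(A.T @ A + rho * np.eye(d)) for A in A_parts]
    for _ in range(iters):
        for i, (A, b) in enumerate(zip(A_parts, b_parts)):
            xs[i] = facts[i] @ (A.T @ b + rho * (z - us[i]))   # local solve
        z = np.mean([x + u for x, u in zip(xs, us)], axis=0)   # consensus
        for i in range(N):
            us[i] += xs[i] - z                                  # dual update
    return z

# Toy check: recover a known linear model from two data chunks.
rng = np.random.default_rng(0)
w_true = rng.standard_normal(5)
A = rng.standard_normal((200, 5))
b = A @ w_true
z = admm_consensus_lsq([A[:100], A[100:]], [b[:100], b[100:]])
```

Because each chunk only needs its own data plus the small shared vector z, this style of update is what makes training on billions of face pairs feasible on a cluster.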
Cross-pose Face Recognition by Canonical Correlation Analysis
The pose problem is one of the bottlenecks in automatic face recognition. We
argue that one of the difficulties in this problem is the severe misalignment in
face images or feature vectors with different poses. In this paper, we propose
that this problem can be statistically solved or at least mitigated by
maximizing the intra-subject across-pose correlations via canonical correlation
analysis (CCA). In our method, based on a data set of coupled face images of
the same identities across two different poses, CCA simultaneously learns two
linear transforms, one for each pose. In the transformed
subspace, the intra-subject correlations between the different poses are
maximized, which implies pose-invariance or pose-robustness is achieved. The
experimental results show that our approach considerably improves the
recognition performance. If further enhanced with a holistic+local feature
representation, the performance can be comparable to the state of the art.
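The CCA step itself can be sketched via whitening plus an SVD of the cross-covariance (a minimal numpy sketch; the regularization constant and the toy two-view data are assumptions):

```python
import numpy as np

def cca(X, Y, k, reg=1e-4):
    """Canonical correlation analysis between paired samples X (n, p), Y (n, q).

    Returns projections Wx (p, k), Wy (q, k) maximizing the correlation of
    X @ Wx with Y @ Wy, plus the top-k canonical correlations.
    """
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = len(X)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wxx, Wyy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wxx @ Cxy @ Wyy)
    return Wxx @ U[:, :k], Wyy @ Vt.T[:, :k], s[:k]

# Toy check: two "poses" sharing one common latent signal.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 1))
X = np.hstack([z, rng.standard_normal((500, 3))])
Y = np.hstack([-z, rng.standard_normal((500, 3))])
Wx, Wy, corr = cca(X, Y, k=1)
print(corr[0] > 0.9)  # True: the shared signal is recovered
```

In the cross-pose setting, X and Y would be feature vectors of the same identities under the two poses, and matching is done in the shared projected subspace.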
Shape Primitive Histogram: A Novel Low-Level Face Representation for Face Recognition
We further exploit the representational power of the Haar wavelet and present
a novel low-level face representation named Shape Primitives Histogram (SPH)
for face recognition. Since human faces contain abundant shape features, we address
the face representation issue from the perspective of the shape feature
extraction. In our approach, we divide faces into a number of tiny shape
fragments and reduce these shape fragments to several uniform atomic shape
patterns called Shape Primitives. A convolution with Haar wavelet templates is
applied to each shape fragment to identify the shape primitive to which it
belongs. After that, we compute a histogram of the shape primitives in each
local image patch to incorporate spatial information. Finally, each face is
represented as a feature vector via concatenating all the local histograms of
shape primitives. Four popular face databases, namely ORL, AR, Yale-B and LFW-a
databases, are employed to evaluate SPH and experimentally study the choices of
the parameters. Extensive experimental results demonstrate that the proposed
approach outperforms the state of the art.
Comment: second version, two columns and 11 page
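The shape-primitive labeling and local histogram steps can be sketched as follows (the 2x2 Haar-like template set, fragment size, and block size are illustrative assumptions, not the paper's actual parameters):

```python
import numpy as np

# Four 2x2 Haar-like templates standing in for the atomic shape patterns
# (horizontal edge, vertical edge, diagonal, flat); a hypothetical minimal set.
TEMPLATES = np.array([
    [[ 1,  1], [-1, -1]],
    [[ 1, -1], [ 1, -1]],
    [[ 1, -1], [-1,  1]],
    [[ 1,  1], [ 1,  1]],
], dtype=float)

def shape_primitive_histogram(img, block=8):
    """Assign each non-overlapping 2x2 fragment to its strongest template,
    then histogram the labels inside each block x block region and
    concatenate the normalized local histograms."""
    H, W = img.shape
    hists = []
    for by in range(0, H - block + 1, block):
        for bx in range(0, W - block + 1, block):
            counts = np.zeros(len(TEMPLATES))
            for y in range(by, by + block, 2):
                for x in range(bx, bx + block, 2):
                    frag = img[y:y + 2, x:x + 2]
                    resp = np.abs((TEMPLATES * frag).sum(axis=(1, 2)))
                    counts[resp.argmax()] += 1
            hists.append(counts / counts.sum())
    return np.concatenate(hists)

# Toy usage: a 16x16 "face" gives 4 blocks of 4-bin histograms.
rng = np.random.default_rng(0)
face = rng.random((16, 16))
h = shape_primitive_histogram(face)
print(h.shape)  # (16,)
```

Concatenating the per-block histograms is what injects the spatial information the abstract mentions; without it the representation would be a single global bag of primitives.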
Vision-based Human Gender Recognition: A Survey
Gender is an important demographic attribute of people. This paper provides a
survey of human gender recognition in computer vision. A review of approaches
exploiting information from face and whole body (either from a still image or
gait sequence) is presented. We highlight the challenges faced and survey the
representative methods of these approaches. Based on the results, good
performance has been achieved on datasets captured under controlled
environments, but much work remains to improve the robustness of gender
recognition in real-life environments.
Comment: 30 page