1,046 research outputs found

    From BoW to CNN: Two Decades of Texture Representation for Texture Classification

    Full text link
    Texture is a fundamental characteristic of many types of images, and texture representation is one of the essential and challenging problems in computer vision and pattern recognition which has attracted extensive research attention. Since 2000, texture representations based on Bag of Words (BoW) and on Convolutional Neural Networks (CNNs) have been extensively studied with impressive performance. Given this period of remarkable evolution, this paper aims to present a comprehensive survey of advances in texture representation over the last two decades. More than 200 major publications are cited in this survey covering different aspects of the research, which includes (i) problem description; (ii) recent advances in the broad categories of BoW-based, CNN-based and attribute-based methods; and (iii) evaluation issues, specifically benchmark datasets and state of the art results. In retrospect of what has been achieved so far, the survey discusses open challenges and directions for future research.Comment: Accepted by IJC

    Face Identification with Second-Order Pooling

    Full text link
    Automatic face recognition has received significant performance improvement by developing specialised facial image representations. On the other hand, generic object recognition has rarely been applied to the face recognition. Spatial pyramid pooling of features encoded by an over-complete dictionary has been the key component of many state-of-the-art image classification systems. Inspired by its success, in this work we develop a new face image representation method inspired by the second-order pooling in Carreira et al. [1], which was originally proposed for image segmentation. The proposed method differs from the previous methods in that, we encode the densely extracted local patches by a small-size dictionary; and the facial image signatures are obtained by pooling the second-order statistics of the encoded features. We show the importance of pooling on encoded features, which is bypassed by the original second-order pooling method to avoid the high computational cost. Equipped with a simple linear classifier, the proposed method outperforms the state-of-the-art face identification performance by large margins. For example, on the LFW databases, the proposed method performs better than the previous best by around 13% accuracy.Comment: 9 page

    Face Image Classification by Pooling Raw Features

    Full text link
    We propose a very simple, efficient yet surprisingly effective feature extraction method for face recognition (about 20 lines of Matlab code), which is mainly inspired by spatial pyramid pooling in generic image classification. We show that features formed by simply pooling local patches over a multi-level pyramid, coupled with a linear classifier, can significantly outperform most recent face recognition methods. The simplicity of our feature extraction procedure is demonstrated by the fact that no learning is involved (except PCA whitening). We show that, multi-level spatial pooling and dense extraction of multi-scale patches play critical roles in face image classification. The extracted facial features can capture strong structural information of individual faces with no label information being used. We also find that, pre-processing on local image patches such as contrast normalization can have an important impact on the classification accuracy. In particular, on the challenging face recognition datasets of FERET and LFW-a, our method improves previous best results by more than 10% and 20%, respectively.Comment: 12 page

    Generic Image Classification Approaches Excel on Face Recognition

    Full text link
    The main finding of this work is that the standard image classification pipeline, which consists of dictionary learning, feature encoding, spatial pyramid pooling and linear classification, outperforms all state-of-the-art face recognition methods on the tested benchmark datasets (we have tested on AR, Extended Yale B, the challenging FERET, and LFW-a datasets). This surprising and prominent result suggests that those advances in generic image classification can be directly applied to improve face recognition systems. In other words, face recognition may not need to be viewed as a separate object classification problem. While recently a large body of residual based face recognition methods focus on developing complex dictionary learning algorithms, in this work we show that a dictionary of randomly extracted patches (even from non-face images) can achieve very promising results using the image classification pipeline. That means, the choice of dictionary learning methods may not be important. Instead, we find that learning multiple dictionaries using different low-level image features often improve the final classification accuracy. Our proposed face recognition approach offers the best reported results on the widely-used face recognition benchmark datasets. In particular, on the challenging FERET and LFW-a datasets, we improve the best reported accuracies in the literature by about 20% and 30% respectively.Comment: 10 page

    HEp-2 Cell Classification via Fusing Texture and Shape Information

    Full text link
    Indirect Immunofluorescence (IIF) HEp-2 cell image is an effective evidence for diagnosis of autoimmune diseases. Recently computer-aided diagnosis of autoimmune diseases by IIF HEp-2 cell classification has attracted great attention. However the HEp-2 cell classification task is quite challenging due to large intra-class variation and small between-class variation. In this paper we propose an effective and efficient approach for the automatic classification of IIF HEp-2 cell image by fusing multi-resolution texture information and richer shape information. To be specific, we propose to: a) capture the multi-resolution texture information by a novel Pairwise Rotation Invariant Co-occurrence of Local Gabor Binary Pattern (PRICoLGBP) descriptor, b) depict the richer shape information by using an Improved Fisher Vector (IFV) model with RootSIFT features which are sampled from large image patches in multiple scales, and c) combine them properly. We evaluate systematically the proposed approach on the IEEE International Conference on Pattern Recognition (ICPR) 2012, IEEE International Conference on Image Processing (ICIP) 2013 and ICPR 2014 contest data sets. The experimental results for the proposed methods significantly outperform the winners of ICPR 2012 and ICIP 2013 contest, and achieve comparable performance with the winner of the newly released ICPR 2014 contest.Comment: 11 pages, 7 figure

    Automatic Facial Expression Recognition Using Features of Salient Facial Patches

    Full text link
    Extraction of discriminative features from salient facial patches plays a vital role in effective facial expression recognition. The accurate detection of facial landmarks improves the localization of the salient patches on face images. This paper proposes a novel framework for expression recognition by using appearance features of selected facial patches. A few prominent facial patches, depending on the position of facial landmarks, are extracted which are active during emotion elicitation. These active patches are further processed to obtain the salient patches which contain discriminative features for classification of each pair of expressions, thereby selecting different facial patches as salient for different pair of expression classes. One-against-one classification method is adopted using these features. In addition, an automated learning-free facial landmark detection technique has been proposed, which achieves similar performances as that of other state-of-art landmark detection methods, yet requires significantly less execution time. The proposed method is found to perform well consistently in different resolutions, hence, providing a solution for expression recognition in low resolution images. Experiments on CK+ and JAFFE facial expression databases show the effectiveness of the proposed system

    Large-scale Supervised Hierarchical Feature Learning for Face Recognition

    Full text link
    This paper proposes a novel face recognition algorithm based on large-scale supervised hierarchical feature learning. The approach consists of two parts: hierarchical feature learning and large-scale model learning. The hierarchical feature learning searches feature in three levels of granularity in a supervised way. First, face images are modeled by receptive field theory, and the representation is an image with many channels of Gaussian receptive maps. We activate a few most distinguish channels by supervised learning. Second, the face image is further represented by patches of picked channels, and we search from the over-complete patch pool to activate only those most discriminant patches. Third, the feature descriptor of each patch is further projected to lower dimension subspace with discriminant subspace analysis. Learned feature of activated patches are concatenated to get a full face representation.A linear classifier is learned to separate face pairs from same subjects and different subjects. As the number of face pairs are extremely large, we introduce ADMM (alternative direction method of multipliers) to train the linear classifier on a computing cluster. Experiments show that more training samples will bring notable accuracy improvement. We conduct experiments on FRGC and LFW. Results show that the proposed approach outperforms existing algorithms under the same protocol notably. Besides, the proposed approach is small in memory footprint, and low in computing cost, which makes it suitable for embedded applications.Comment: 8 pages; 3 figure

    Cross-pose Face Recognition by Canonical Correlation Analysis

    Full text link
    The pose problem is one of the bottlenecks in automatic face recognition. We argue that one of the diffculties in this problem is the severe misalignment in face images or feature vectors with different poses. In this paper, we propose that this problem can be statistically solved or at least mitigated by maximizing the intra-subject across-pose correlations via canonical correlation analysis (CCA). In our method, based on the data set with coupled face images of the same identities and across two different poses, CCA learns simultaneously two linear transforms, each for one pose. In the transformed subspace, the intra-subject correlations between the different poses are maximized, which implies pose-invariance or pose-robustness is achieved. The experimental results show that our approach could considerably improve the recognition performance. And if further enhanced with holistic+local feature representation, the performance could be comparable to the state-of-the-art

    Shape Primitive Histogram: A Novel Low-Level Face Representation for Face Recognition

    Full text link
    We further exploit the representational power of Haar wavelet and present a novel low-level face representation named Shape Primitives Histogram (SPH) for face recognition. Since human faces exist abundant shape features, we address the face representation issue from the perspective of the shape feature extraction. In our approach, we divide faces into a number of tiny shape fragments and reduce these shape fragments to several uniform atomic shape patterns called Shape Primitives. A convolution with Haar Wavelet templates is applied to each shape fragment to identify its belonging shape primitive. After that, we do a histogram statistic of shape primitives in each spatial local image patch for incorporating the spatial information. Finally, each face is represented as a feature vector via concatenating all the local histograms of shape primitives. Four popular face databases, namely ORL, AR, Yale-B and LFW-a databases, are employed to evaluate SPH and experimentally study the choices of the parameters. Extensive experimental results demonstrate that the proposed approach outperform the state-of-the-arts.Comment: second version, two columns and 11 page

    Vision-based Human Gender Recognition: A Survey

    Full text link
    Gender is an important demographic attribute of people. This paper provides a survey of human gender recognition in computer vision. A review of approaches exploiting information from face and whole body (either from a still image or gait sequence) is presented. We highlight the challenges faced and survey the representative methods of these approaches. Based on the results, good performance have been achieved for datasets captured under controlled environments, but there is still much work that can be done to improve the robustness of gender recognition under real-life environments.Comment: 30 page
    corecore