9 research outputs found

    Towards Pose-Invariant 2D Face Classification for Surveillance

    A key problem for "face in the crowd" recognition from existing surveillance cameras in public spaces (such as mass transit centres) is the issue of pose mismatches between probe and gallery faces. In addition to accuracy, scalability is also important, necessarily limiting the complexity of face classification algorithms. In this paper we evaluate recent approaches to the recognition of faces at relatively large pose angles from a gallery of frontal images and propose novel adaptations as well as modifications. Specifically, we compare and contrast the accuracy, robustness and speed of an Active Appearance Model (AAM) based method (where realistic frontal faces are synthesized from non-frontal probe faces) against bag-of-features methods (which are local feature approaches based on block Discrete Cosine Transforms and Gaussian Mixture Models). We show a novel approach where the AAM-based technique is sped up by directly obtaining pose-robust features, allowing the omission of the computationally expensive and artefact-producing image synthesis step. Additionally, we adapt a histogram-based bag-of-features technique to face classification and contrast its properties to a previously proposed direct bag-of-features method. We also show that the two bag-of-features approaches can be considerably sped up, without a loss in classification accuracy, via an approximation of the exponential function. Experiments on the FERET and PIE databases suggest that the bag-of-features techniques generally attain better performance, with significantly lower computational loads. The histogram-based bag-of-features technique is capable of achieving an average recognition accuracy of 89% for pose angles of around 25 degrees.
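
    The block-DCT local features mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the block size, the overlap, or which coefficients are retained, so the values used here (8x8 blocks, a step of 4 pixels, the top-left 3x3 coefficients) and the function names `dct2_block` and `block_dct_features` are assumptions for illustration only. The Gaussian Mixture Model stage is omitted.

```python
import math


def dct2_block(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Normalisation factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos(math.pi * (2 * y + 1) * u / (2 * n))
                          * math.cos(math.pi * (2 * x + 1) * v / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def block_dct_features(image, block=8, step=4, keep=3):
    """Slide a block over the image and keep low-frequency DCT coefficients.

    Assumed simplification: the top-left keep x keep coefficients are kept
    for each block, giving one feature vector per block position.
    """
    h, w = len(image), len(image[0])
    feats = []
    for top in range(0, h - block + 1, step):
        for left in range(0, w - block + 1, step):
            patch = [row[left:left + block] for row in image[top:top + block]]
            coeffs = dct2_block(patch)
            feats.append([coeffs[u][v] for u in range(keep) for v in range(keep)])
    return feats
```

    Each face then becomes an unordered set of short feature vectors, one per block, which is what makes the representation a "bag of features": the location of each block is discarded before classification.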



    Face recognition with the RGB-D sensor

    Face recognition in unconstrained environments is still a challenge, because of the many variations of the facial appearance due to changes in head pose, lighting conditions, facial expression, age, etc. This work addresses the problem of face recognition in the presence of 2D facial appearance variations caused by 3D head rotations. It explores the advantages of the recently developed consumer-level RGB-D cameras (e.g. Kinect). These cameras provide color and depth images at the same rate. They are affordable and easy to use, but the depth images are noisy and of low resolution, unlike laser-scanned depth images. The proposed approach to face recognition is able to deal with large head pose variations using RGB-D face images. The method uses the depth information to correct the pose of the face. It does not need to learn a generic face model or make complex 3D-2D registrations. It is simple and fast, yet able to deal with large pose variations and perform pose-invariant face recognition. Experiments on a public database show that the presented approach is effective and efficient under significant pose changes. The idea is also used to develop face recognition software that achieves real-time face recognition in the presence of large yaw rotations using the Kinect sensor. It is shown in real time how this method improves recognition accuracy and confidence level. This study demonstrates that RGB-D sensors are a promising tool that can lead to the development of robust pose-invariant face recognition systems under large pose variations.

    Learning patch dependencies for improved pose mismatched face verification

    10.1109/CVPR.2006.172, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 909-915, PIVR

    Learning Patch Dependencies for Improved Pose Mismatched Face Verification

    Most pose robust face verification algorithms, which employ 2D appearance, rely heavily on statistics gathered from offline databases containing ample facial appearance variation across many views. Due to the high dimensionality of the face images being employed, the validity of the assumptions employed in obtaining these statistics is essential for good performance. In this paper we assess three common approaches in the 2D appearance pose mismatched face recognition literature. In our experiments we demonstrate where these approaches work and fail. As a result of this analysis, we additionally propose a new algorithm that attempts to learn the statistical dependency between gallery patches (i.e. local regions of pixels) and the whole appearance of the probe image. We demonstrate improved performance over a number of leading 2D appearance face recognition algorithms.

    Face recognition in uncontrolled environments

    This thesis concerns face recognition in uncontrolled environments in which the images used for training and test are collected from the real world instead of laboratories. Compared with controlled environments, images from uncontrolled environments contain more variation in pose, lighting, expression, occlusion, background, image quality, scale, and makeup. Therefore, face recognition in uncontrolled environments is much more challenging than in controlled conditions. Moreover, many real world applications require good recognition performance in uncontrolled environments. Example applications include social networking, human-computer interaction and electronic entertainment. Therefore, researchers and companies have shifted their interest from controlled environments to uncontrolled environments over the past seven years. In this thesis, we divide the history of face recognition into four stages and list the main problems and algorithms at each stage. We find that face recognition in unconstrained environments is still an unsolved problem although many face recognition algorithms have been proposed in the last decade. Existing approaches have two major limitations. First, many methods do not perform well when tested in uncontrolled databases even when all the faces are close to frontal. Second, most current algorithms cannot handle large pose variation, which has become a bottleneck for improving performance. In this thesis, we investigate Bayesian models for face recognition. Our contributions extend Probabilistic Linear Discriminant Analysis (PLDA) [Prince and Elder 2007]. In PLDA, images are described as a sum of signal and noise components. Each component is a weighted combination of basis functions. We first investigate the effect of the degree of localization of these basis functions and find better performance is obtained when the signal is treated more locally and the noise more globally. We call this new algorithm multi-scale PLDA and our experiments show it can handle lighting variation better than PLDA but fails for pose variation. We then analyze three existing Bayesian face recognition algorithms and combine the advantages of PLDA and the Joint Bayesian Face algorithm [Chen et al. 2012] to propose Joint PLDA. We find that our new algorithm improves performance compared to existing Bayesian face recognition algorithms. Finally, we propose the Tied Joint Bayesian Face algorithm and Tied Joint PLDA to address large pose variations in the data, which drastically decrease performance in most existing face recognition algorithms. To provide sufficient training images with large pose difference, we introduce a new database called the UCL Multi-pose database. We demonstrate that our Bayesian models improve face recognition performance when the pose of the face images varies.
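
    The "sum of signal and noise components" in the PLDA description above is commonly written as below. The abstract does not define any symbols, so the notation is an assumption that follows the usual presentation of PLDA: x_ij is the j-th image of individual i, mu is the mean face, the columns of F and G are the signal and noise basis functions, and h_i and w_ij are their weights.

```latex
% Standard PLDA generative model (notation assumed, not taken from the abstract)
\[
  x_{ij} \;=\; \underbrace{\mu + F\,h_i}_{\text{signal}}
        \;+\; \underbrace{G\,w_{ij} + \epsilon_{ij}}_{\text{noise}},
  \qquad
  h_i \sim \mathcal{N}(0, I),\quad
  w_{ij} \sim \mathcal{N}(0, I),\quad
  \epsilon_{ij} \sim \mathcal{N}(0, \Sigma)
\]
```

    The identity weight h_i is shared by every image of the same person, while w_ij and the residual change from image to image. Recognition then amounts to asking whether two images are better explained by a common h_i or by separate ones.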

    Patch-based models for visual object classes

    This thesis concerns models for visual object classes that exhibit a reasonable amount of regularity, such as faces, pedestrians, cells and human brains. Such models are useful for making “within-object” inferences such as determining their individual characteristics and establishing their identity. For example, the model could be used to predict the identity of a face, the pose of a pedestrian or the phenotype of a cell, and to segment parts of a human brain. Existing object modelling techniques have several limitations. First, most current methods have targeted the above tasks individually using object specific representations; therefore, they cannot be applied to other problems without major alterations. Second, most methods have been designed to work with small databases which do not contain the variations in pose, illumination, occlusion and background clutter seen in ‘real world’ images. Consequently, many existing algorithms fail when tested on unconstrained databases. Finally, the complexity of the training procedure in these methods makes it impractical to use large datasets. In this thesis, we investigate patch-based models for object classes. Our models are capable of exploiting very large databases of objects captured in uncontrolled environments. We represent the test image with a regular grid of patches from a library of images of the same object. All the domain specific information is held in this library: we use one set of images of the object to help draw inferences about others. In each experimental chapter we investigate a different within-object inference task. In particular we develop models for classification, regression, semantic segmentation and identity recognition. In each task, we achieve results that are comparable to or better than the state of the art. We conclude that patch-based representations can be successfully used for the above tasks and show promise for other applications such as generation and localization.