
    Gradient edge map features for frontal face recognition under extreme illumination changes

    Our aim in this paper is to robustly match frontal faces in the presence of extreme illumination changes, using only a single training image per person and a single probe image. In the illumination conditions we consider, which include those with the dominant light source placed behind and to the side of the user, directly above and pointing downwards, or indeed below and pointing upwards, this is a most challenging problem. The presence of sharp cast shadows, large poorly illuminated regions of the face, quantum and quantization noise, and other nuisance effects makes it difficult to extract a sufficiently discriminative yet robust representation. We introduce a representation based on image gradient directions near robust edges which correspond to characteristic facial features. Robust edges are extracted using a cascade of processing steps, each of which seeks to harness further discriminative information or normalize for a particular source of extra-personal appearance variability. The proposed representation was evaluated on the extremely difficult YaleB data set. Unlike most previous work, we include all available illuminations, perform training using a single image per person, and match against a single probe image. In this challenging evaluation setup, the proposed gradient edge map achieved a 0.8% error rate, demonstrating nearly perfect receiver operating characteristic curve behaviour. This is by far the best performance reported in the literature for this setup, the best previously proposed methods attaining error rates of approximately 6–7%.
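
    A minimal sketch of the core representation, assuming NumPy; the edge-selection rule (a simple gradient-magnitude percentile) and parameter values are illustrative stand-ins for the paper's multi-step cascade.

```python
import numpy as np


def gradient_edge_map(image, edge_percentile=90.0):
    """Gradient directions retained only near strong ('robust') edges."""
    image = image.astype(np.float64)

    # Central-difference image gradients (rows = y, columns = x).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    direction = np.arctan2(gy, gx)  # per-pixel gradient direction, radians

    # Keep directions only where the edge response is strong; elsewhere the
    # direction is dominated by cast shadows and sensor/quantization noise.
    threshold = np.percentile(magnitude, edge_percentile)
    mask = magnitude >= threshold

    return np.where(mask, direction, 0.0), mask
```

    Two faces could then be compared, for instance, by the mean cosine of direction differences over the intersection of their edge masks; gradient directions are largely invariant to monotonic illumination changes, which is the intuition behind the representation's robustness.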

    Learnt quasi-transitive similarity for retrieval from large collections of faces

    We are interested in identity-based retrieval of face sets from large unlabelled collections acquired in uncontrolled environments. Given a baseline algorithm for measuring the similarity of two face sets, the meta-algorithm introduced in this paper seeks to leverage the structure of the data corpus to make the best use of the available baseline. In particular, we show how partial transitivity of inter-personal similarity can be exploited to improve the retrieval of particularly challenging sets which poorly match the query under the baseline measure. We: (i) describe the use of proxy sets as a means of computing the similarity between two sets, (ii) introduce transitivity meta-features based on the similarity of salient modes of appearance variation between sets, (iii) show how quasi-transitivity can be learnt from such features without any labelling or manual intervention, and (iv) demonstrate the effectiveness of the proposed methodology through experiments on the notoriously challenging YouTube database.
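
    The quasi-transitive idea can be sketched as follows, assuming a baseline set-to-set similarity function `baseline_sim`; the max-min combination rule and the mixing weight `alpha` are illustrative assumptions -- the paper instead learns quasi-transitivity from transitivity meta-features without any labelling.

```python
def quasi_transitive_sim(query, target, corpus, baseline_sim, alpha=0.5):
    """Blend direct similarity with the best proxy-mediated similarity."""
    direct = baseline_sim(query, target)

    # A proxy set can only "vouch" for the target as strongly as the weaker
    # of its two links (query-proxy and proxy-target).
    proxy = max(
        (min(baseline_sim(query, p), baseline_sim(p, target))
         for p in corpus if p is not query and p is not target),
        default=0.0,
    )
    return (1.0 - alpha) * direct + alpha * proxy
```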

    Matching objects across the textured-smooth continuum

    The problem of 3D object recognition is of immense practical importance, with the last decade witnessing a number of breakthroughs in the state of the art. Most previous work has focused on the matching of textured objects using local appearance descriptors extracted around salient image points. The recently proposed bag of boundaries method was the first to address directly the problem of matching smooth objects using boundary features. However, no previous work has attempted a holistic treatment of the problem by jointly using textural and shape features, which is what we describe herein. Owing to the complementarity of the two modalities, we fuse the corresponding matching scores and learn their relative weighting in a data-specific manner by optimizing discriminative performance on synthetically distorted data. For the textural description of an object we adopt a representation in the form of a histogram of SIFT-based visual words. Similarly, the apparent shape of an object is represented by a histogram of discretized features capturing local shape. On a large public database of a diverse set of objects, the proposed method is shown to significantly outperform both purely textural and purely shape-based approaches for matching across viewpoint variation.
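
    A minimal sketch of the score-level fusion, assuming precomputed per-pair textural and shape matching scores; the single-weight grid search on (synthetically distorted) validation pairs is a simplified stand-in for the paper's discriminative weight learning.

```python
import numpy as np


def learn_fusion_weight(text_scores, shape_scores, same_object):
    """Grid-search a mixing weight w for fused = w*text + (1-w)*shape."""
    best_w, best_sep = 0.5, -np.inf
    for w in np.linspace(0.0, 1.0, 101):
        fused = w * text_scores + (1.0 - w) * shape_scores
        # Separation of genuine-pair scores from impostor-pair scores.
        sep = fused[same_object].mean() - fused[~same_object].mean()
        if sep > best_sep:
            best_w, best_sep = w, sep
    return best_w
```

    Here `text_scores` and `shape_scores` are arrays of matching scores for the same candidate pairs, and `same_object` is a boolean mask marking genuine matches; any monotone separability criterion could replace the mean-gap used above.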

    Taming Wild Faces: Web-Scale, Open-Universe Face Identification in Still and Video Imagery

    With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos and videos. In this dissertation, we explore unconstrained still-image and video-based face recognition in real-world scenarios, e.g. social photo sharing and movie trailers, where people of interest are recognized and all others are ignored. In such a scenario, we must obtain high precision in recognizing the known identities, while accurately rejecting those of no interest. Recent advancements in face recognition research have seen Sparse Representation-based Classification (SRC) advance to the forefront of competing methods. However, its drawbacks, slow speed and sensitivity to variations in pose, illumination, and occlusion, have hindered its widespread applicability. The contributions of this dissertation are three-fold:
    1. For still-image data, we propose a novel Linearly Approximated Sparse Representation-based Classification (LASRC) algorithm that uses linear regression to perform sample selection for l1-minimization, thus harnessing the speed of least-squares and the robustness of SRC. On our large dataset collected from Facebook, LASRC performs equally to standard SRC with a speedup of 100-250x.
    2. For video, applying the popular l1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive computationally, so we propose a new algorithm, Mean Sequence SRC (MSSRC), that performs video face recognition using a joint optimization leveraging all of the available video data and employing the knowledge that the face track frames belong to the same individual. Employing MSSRC results in a speedup of 5x on average over SRC on a frame-by-frame basis.
    3. Finally, we make the observation that MSSRC sometimes assigns inconsistent identities to the same individual in a scene, which could be corrected based on their visual similarity. Therefore, we construct a probabilistic affinity graph combining appearance and co-occurrence similarities to model the relationship between face tracks in a video. Using this relationship graph, we employ random walk analysis to propagate strong class predictions among similar face tracks, while dampening weak predictions. Our method results in a performance gain of 15.8% in average precision over using MSSRC alone.
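
    A minimal sketch of the MSSRC idea from contribution 2, assuming a dictionary matrix D whose columns are vectorized training faces and scikit-learn's Lasso as the l1 solver; the solver choice, normalisation, and `alpha` value are assumptions, and collapsing the joint track-level problem to coding the mean frame follows the abstract's premise that all frames of a track share one identity.

```python
import numpy as np
from sklearn.linear_model import Lasso


def mssrc_classify(track_frames, D, labels, alpha=0.01):
    """Classify a face track (frames as rows) by sparse-coding its mean frame.

    D has shape (n_pixels, n_train); labels gives the class of each column.
    """
    # All frames share one identity, so code the track's mean frame jointly.
    y = track_frames.mean(axis=0)
    y /= np.linalg.norm(y) + 1e-12

    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, y)
    x = coder.coef_

    # Assign the class whose training atoms best reconstruct the mean frame.
    residuals = {
        c: np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
        for c in np.unique(labels)
    }
    return min(residuals, key=residuals.get)
```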

    Constrained low-rank representation for robust subspace clustering

    Subspace clustering aims to partition data points drawn from a union of subspaces according to their underlying subspaces. For accurate semi-supervised subspace clustering, all data that have a must-link constraint or the same label should be grouped into the same underlying subspace. However, this is not guaranteed in existing approaches. Moreover, these approaches require additional parameters for incorporating supervision information. In this paper, we propose a constrained low-rank representation (CLRR) for robust semi-supervised subspace clustering, based on a novel constraint matrix constructed in this paper. While seeking the low-rank representation of the data, CLRR explicitly incorporates supervision information as hard constraints, enhancing the discriminating power of the optimal representation. This strategy can be further extended to other state-of-the-art methods, such as sparse subspace clustering. We theoretically prove that the optimal representation matrix has both a block-diagonal structure with clean data and a semi-supervised grouping effect with noisy data. We have also developed an efficient optimization algorithm for CLRR based on the alternating direction method of multipliers (ADMM). Our experimental results demonstrate that CLRR outperforms existing methods.
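
    One plausible reading of the CLRR objective, written as classical low-rank representation with the supervision imposed as hard constraints on the representation matrix Z through a constraint matrix Q; the exact form of the constraint set is our assumption, not quoted from the paper.

```latex
% Hedged reconstruction: LRR with nuclear-norm regularisation, an l2,1
% noise term, and hard constraints on Z encoded by an assumed constraint
% matrix Q (e.g. Q_ij = 0 for pairs known to lie in different subspaces).
\begin{aligned}
\min_{Z,\,E}\quad & \|Z\|_{*} + \lambda \|E\|_{2,1} \\
\text{s.t.}\quad  & X = XZ + E, \\
                  & Z_{ij} = 0 \;\; \text{whenever } Q_{ij} = 0 .
\end{aligned}
```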

    Optical flow estimation via steered-L1 norm

    Global variational methods for estimating optical flow are among the best performing methods due to the subpixel accuracy and the ‘fill-in’ effect they provide. The fill-in effect allows optical flow displacements to be estimated even in low-texture and untextured areas of the image; the estimation of such displacements is driven by the smoothness term. The L1 norm provides a robust regularisation term for the optical flow energy function with very good edge-preserving performance. However, this norm suffers from several issues, among them its isotropic nature, which reduces the fill-in effect and eventually the accuracy of estimation in areas near motion boundaries. In this paper we propose an enhancement to the L1 norm that improves the fill-in effect of this smoothness term. To do this, we analyse the structure tensor matrix and use its eigenvectors to steer the smoothness term into components that are ‘orthogonal to’ and ‘aligned with’ image structures. This is done in a primal-dual formulation. Results show a reduced end-point error and improved accuracy compared to the conventional L1 norm.
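
    A minimal sketch of the steering-direction computation, assuming NumPy and SciPy; the Gaussian window size is illustrative. The per-pixel eigenvector pair ('aligned with' / 'orthogonal to' image structure) is what a steered smoothness term would decompose the flow gradient along.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def structure_tensor_directions(image, sigma=1.5):
    """Per-pixel eigenvectors of the structure tensor of a grayscale image."""
    image = image.astype(np.float64)
    gy, gx = np.gradient(image)

    # Structure tensor entries, locally averaged with a Gaussian window.
    Jxx = gaussian_filter(gx * gx, sigma)
    Jxy = gaussian_filter(gx * gy, sigma)
    Jyy = gaussian_filter(gy * gy, sigma)

    # Batched eigen-decomposition of the 2x2 tensor [[Jxx, Jxy], [Jxy, Jyy]].
    tensors = np.stack(
        [np.stack([Jxx, Jxy], axis=-1), np.stack([Jxy, Jyy], axis=-1)],
        axis=-2,
    )
    eigvals, eigvecs = np.linalg.eigh(tensors)

    # eigh sorts eigenvalues ascending: column 0 is the direction aligned
    # with image structure (least change), column 1 is orthogonal to it
    # (across edges).
    aligned = eigvecs[..., 0]
    orthogonal = eigvecs[..., 1]
    return aligned, orthogonal
```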