
    VGGFace2: A dataset for recognising faces across pose and age

    In this paper, we introduce a new large-scale face dataset named VGGFace2. The dataset contains 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject. Images are downloaded from Google Image Search and have large variations in pose, age, illumination, ethnicity and profession (e.g. actors, athletes, politicians). The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimize the label noise. We describe how the dataset was collected, in particular the automated and manual filtering stages to ensure a high accuracy for the images of each identity. To assess face recognition performance using the new dataset, we train ResNet-50 (with and without Squeeze-and-Excitation blocks) Convolutional Neural Networks on VGGFace2, on MS-Celeb-1M, and on their union, and show that training on VGGFace2 leads to improved recognition performance over pose and age. Finally, using the models trained on these datasets, we demonstrate state-of-the-art performance on all the IARPA Janus face recognition benchmarks, e.g. IJB-A, IJB-B and IJB-C, exceeding the previous state-of-the-art by a large margin. Datasets and models are publicly available. Comment: This paper has been accepted by IEEE Conference on Automatic Face and Gesture Recognition (F&G), 2018. (Oral)
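    The Squeeze-and-Excitation blocks mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch of the SE design (squeeze by global average pooling, excitation by a two-layer bottleneck, then channel-wise rescaling), not code from the paper; the random weights stand in for learned fully-connected layers, and the reduction ratio of 16 is the value commonly used in SE-Nets:

    ```python
    import numpy as np

    def se_block(x, reduction=16):
        """Squeeze-and-Excitation over a feature map x of shape (C, H, W)."""
        c = x.shape[0]
        # squeeze: global average pooling over spatial dims -> (C,)
        z = x.mean(axis=(1, 2))
        # excitation: bottleneck FC -> ReLU -> FC -> sigmoid
        # (random weights here stand in for learned parameters)
        w1 = np.random.randn(c // reduction, c) * 0.01
        w2 = np.random.randn(c, c // reduction) * 0.01
        s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))
        # scale: reweight each channel of the input by its excitation score
        return x * s[:, None, None]
    ```

    The block is a cheap add-on to any residual stage: it leaves the spatial layout untouched and only modulates per-channel magnitudes.
    
    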

    Suspect identification based on descriptive facial attributes

    We present a method for using human-describable face attributes to perform face identification in criminal investigations. To enable this approach, a set of 46 facial attributes was carefully defined with the goal of capturing all describable and persistent facial features. Using crowdsourced labor, a large corpus of face images was manually annotated with the proposed attributes. In turn, we train an automated attribute extraction algorithm to encode target repositories with the attribute information. Attribute extraction is performed using localized face components to improve the extraction accuracy. Experiments are conducted to compare the use of attribute feature information, derived from crowd workers, to face sketch information, drawn by expert artists. In addition to removing the dependence on expert artists, the proposed method complements sketch-based face recognition by allowing investigators to immediately search face repositories without the time delay that is incurred due to sketch generation.
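    The retrieval step this abstract describes, matching a witness's attribute description against an attribute-encoded repository, can be sketched as a similarity ranking over attribute vectors. The function names and the cosine-similarity choice below are illustrative assumptions, not the paper's exact formulation:

    ```python
    import math

    def attribute_similarity(query, candidate):
        """Cosine similarity between two facial-attribute vectors."""
        dot = sum(q * c for q, c in zip(query, candidate))
        nq = math.sqrt(sum(q * q for q in query))
        nc = math.sqrt(sum(c * c for c in candidate))
        return dot / (nq * nc) if nq and nc else 0.0

    def rank_gallery(query, gallery):
        """Rank (name, attribute_vector) gallery entries by similarity to the query."""
        scored = [(name, attribute_similarity(query, vec)) for name, vec in gallery]
        return sorted(scored, key=lambda t: t[1], reverse=True)
    ```

    In a real system the query vector would come from the witness's description and the gallery vectors from the automated attribute extractor run over the repository.
    
    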

    The Devil of Face Recognition is in the Noise

    The growing scale of face recognition datasets empowers us to train strong convolutional networks for face recognition. While a variety of architectures and loss functions have been devised, we still have a limited understanding of the source and consequence of label noise inherent in existing datasets. We make the following contributions: 1) We contribute cleaned subsets of popular face databases, i.e., the MegaFace and MS-Celeb-1M datasets, and build a new large-scale noise-controlled IMDb-Face dataset. 2) With the original datasets and cleaned subsets, we profile and analyze the label noise properties of MegaFace and MS-Celeb-1M. We show that a few orders of magnitude more noisy samples are needed to achieve the same accuracy yielded by a clean subset. 3) We study the association between different types of noise, i.e., label flips and outliers, and the accuracy of face recognition models. 4) We investigate ways to improve data cleanliness, including a comprehensive user study on the influence of data labeling strategies on annotation accuracy. The IMDb-Face dataset has been released on https://github.com/fwang91/IMDb-Face. Comment: accepted to ECCV'18
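    Of the two noise types studied here, label flips (a face assigned to the wrong identity within the label set) are the easier to simulate when profiling a model's noise tolerance; outliers are images of persons outside the label set entirely. The helper below, with its name and parameters, is hypothetical and for illustration only:

    ```python
    import random

    def inject_label_flips(labels, num_classes, rate, seed=0):
        """Label-flip noise: replace a `rate` fraction of labels with a
        uniformly chosen *wrong* class from the same label set."""
        rng = random.Random(seed)
        noisy = list(labels)
        for i in range(len(noisy)):
            if rng.random() < rate:
                # draw from num_classes - 1 options, skipping the true label
                wrong = rng.randrange(num_classes - 1)
                noisy[i] = wrong if wrong < noisy[i] else wrong + 1
        return noisy
    ```

    Training on copies of a clean subset corrupted at increasing rates is one simple way to reproduce the kind of noise-vs-accuracy profile the abstract describes.
    
    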

    Drop Loss for Person Attribute Recognition With Imbalanced Noisy-Labeled Samples

    Person attribute recognition (PAR) aims to simultaneously predict multiple attributes of a person. Existing deep learning-based PAR methods have achieved impressive performance. Unfortunately, these methods usually ignore the fact that different attributes have an imbalance in the number of noisy-labeled samples in the PAR training datasets, thus leading to suboptimal performance. To address this problem of imbalanced noisy-labeled samples, we propose a novel and effective loss for PAR, called drop loss. In the drop loss, the attributes are treated differently in an easy-to-hard way. In particular, the noisy-labeled candidates, which are identified according to their gradient norms, are dropped with a higher drop rate for the harder attributes. Such a scheme adaptively alleviates the adverse effect of imbalanced noisy-labeled samples on model learning. To illustrate the effectiveness of the proposed loss, we train a simple ResNet-50 model based on the drop loss and term it DropNet. Experimental results on two representative PAR tasks (facial attribute recognition and pedestrian attribute recognition) demonstrate that the proposed DropNet achieves comparable or better performance, in terms of both balanced accuracy and classification accuracy, than several state-of-the-art PAR methods.
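    The core of the drop loss, discarding the candidates with the largest gradient norms at an attribute-dependent rate, can be sketched as a masking step. The helper below is an illustrative assumption rather than the authors' implementation; during training, harder attributes would be assigned a larger `drop_rate`, and the mask would zero the dropped samples' contribution to that attribute's loss:

    ```python
    import numpy as np

    def drop_loss_mask(grad_norms, drop_rate):
        """Keep-mask over samples for one attribute: drop the `drop_rate`
        fraction with the largest gradient norms (likely noisy labels)."""
        n = len(grad_norms)
        n_drop = int(n * drop_rate)
        if n_drop == 0:
            return np.ones(n, dtype=bool)
        # indices of the n_drop largest gradient norms
        drop_idx = np.argsort(grad_norms)[-n_drop:]
        mask = np.ones(n, dtype=bool)
        mask[drop_idx] = False
        return mask
    ```

    Large-gradient samples are used here as a noise proxy because mislabeled examples tend to produce persistently large per-sample gradients late in training; varying `drop_rate` per attribute is what makes the treatment "easy-to-hard".
    
    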