Unsupervised Domain Adaptation for Face Recognition in Unlabeled Videos
Despite rapid advances in face recognition, there remains a clear gap between
the performance of still image-based face recognition and video-based face
recognition, due to the vast difference in visual quality between the domains
and the difficulty of curating diverse large-scale video datasets. This paper
addresses both challenges through an image-to-video feature-level domain
adaptation approach that learns discriminative video frame representations.
The framework utilizes large-scale unlabeled video data to
reduce the gap between different domains while transferring discriminative
knowledge from large-scale labeled still images. Given a face recognition
network that is pretrained in the image domain, the adaptation is achieved by
(i) distilling knowledge from the network to a video adaptation network through
feature matching, (ii) performing feature restoration through synthetic data
augmentation and (iii) learning a domain-invariant feature through a domain
adversarial discriminator. We further improve performance through a
discriminator-guided feature fusion that boosts high-quality frames while
eliminating those degraded by video domain-specific factors. Experiments on the
YouTube Faces and IJB-A datasets demonstrate that each module contributes to
our feature-level domain adaptation framework and substantially improves video
face recognition performance to achieve state-of-the-art accuracy. We
demonstrate qualitatively that the network learns to suppress diverse artifacts
in videos such as pose, illumination or occlusion without being explicitly
trained for them.

Comment: accepted for publication at International Conference on Computer Vision (ICCV) 201
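The discriminator-guided feature fusion described above can be illustrated with a minimal sketch: per-frame features are combined into a single video descriptor, with each frame weighted by a quality score (here taken as given; in the paper these scores come from the domain adversarial discriminator). The softmax weighting and the example values are assumptions for illustration, not the paper's exact formulation.

```python
import math

def fuse_frame_features(features, quality_scores):
    """Fuse per-frame features into one video descriptor.

    Each frame is weighted by a (hypothetical) discriminator quality
    score, so degraded frames contribute little to the fused feature.
    """
    # Softmax over quality scores -> fusion weights (numerically stable)
    m = max(quality_scores)
    exps = [math.exp(s - m) for s in quality_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of frame features, dimension by dimension
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(dim)]

# Two frames: a high-quality one and a degraded one
frames = [[1.0, 0.0], [0.0, 1.0]]
scores = [4.0, 0.0]
video_feat = fuse_frame_features(frames, scores)
```

With these scores the first frame receives almost all of the fusion weight, so the fused descriptor stays close to the high-quality frame's feature.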
Improving Visual Recognition With Unlabeled Data
The success of deep neural networks has resulted in computer vision systems that obtain high accuracy on a wide variety of tasks such as image classification, object detection, and semantic segmentation. However, most state-of-the-art vision systems depend on large amounts of labeled training data, which is not a scalable solution in the long run. This work focuses on improving existing models for visual object recognition and detection without depending on such large-scale human-annotated data.

We first show how large numbers of hard examples (cases where an existing model makes a mistake) can be obtained automatically from unlabeled video sequences by exploiting temporal consistency cues in the output of a pre-trained object detector. These examples can strongly influence a model's parameters when the network is re-trained to correct them, resulting in improved performance on several object detection tasks.

Further, such hard examples from unlabeled videos can be used to address the problem of unsupervised domain adaptation. We focus on the automatic adaptation of an existing object detector to a new domain with no labeled data, assuming that a large number of unlabeled videos are readily available. Our approach is evaluated on challenging face and pedestrian detection tasks involving large domain shifts, showing improved performance with minimal dependence on hyperparameters.

Finally, we address the problem of face recognition, where deep neural networks trained on massive labeled datasets have achieved high accuracy. Further improvements through supervised learning require significantly larger datasets and hence massive annotation efforts. We improve upon the performance of face recognition models trained on large-scale labeled datasets by using unlabeled faces as additional training data.
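The temporal-consistency idea for mining hard examples can be sketched as follows. This is a deliberately simplified illustration, assuming per-frame boolean detector outputs rather than boxes and tracks: a detection present in both neighboring frames but missing in the middle frame suggests a missed detection (a hard negative for the detector's "background" decision) that can be harvested without labels.

```python
def mine_hard_negatives(detections_per_frame):
    """Flag frames where a pretrained detector's output is temporally
    inconsistent: the object is detected in frames t-1 and t+1 but not
    in frame t, suggesting a likely false negative at frame t.

    `detections_per_frame` is a list of booleans, one per video frame.
    """
    hard = []
    for t in range(1, len(detections_per_frame) - 1):
        prev = detections_per_frame[t - 1]
        cur = detections_per_frame[t]
        nxt = detections_per_frame[t + 1]
        if prev and nxt and not cur:
            hard.append(t)  # candidate hard example at frame t
    return hard

# Detector output over 6 frames; the detection flickers off at frame 2
det = [True, True, False, True, True, False]
print(mine_hard_negatives(det))  # [2]
```

A real system would additionally associate detections across frames (e.g. by box overlap or tracking) before declaring an inconsistency; the boolean simplification here only conveys the core cue.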
We present insights and recipes for training deep face recognition models with labeled and unlabeled data at scale, addressing real-world challenges such as overlapping identities between the labeled and unlabeled datasets, as well as label noise introduced by clustering errors.
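One way the overlapping-identity challenge can be handled is sketched below: an unlabeled face feature is merged into an existing labeled identity only if it lies close enough to that identity's centroid, and is otherwise treated as a new-identity candidate. The helper, the Euclidean distance, and the threshold are all hypothetical illustrations, not the dissertation's actual recipe.

```python
def pseudo_label(unlabeled_feats, labeled_centroids, threshold):
    """Assign each unlabeled face feature to the nearest labeled class
    centroid if it is within `threshold` (an overlapping identity);
    otherwise return None to mark a new-identity candidate.

    `labeled_centroids` maps class name -> centroid feature vector.
    """
    def dist(a, b):
        # Plain Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    labels = []
    for f in unlabeled_feats:
        d, cls = min((dist(f, c), k) for k, c in labeled_centroids.items())
        labels.append(cls if d < threshold else None)
    return labels

centroids = {"A": [0.0, 0.0], "B": [1.0, 1.0]}
labels = pseudo_label([[0.1, 0.0], [2.0, 2.0]], centroids, threshold=0.5)
# labels -> ["A", None]
```

Features assigned None would then be clustered among themselves to mint new identity labels, which is precisely where the clustering errors mentioned in the abstract can introduce label noise.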