Learning Convolutional Neural Network For Face Verification

Author: Rashedi Elaheh
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2018

Convolutional neural networks (ConvNets) have improved the state of the art in many applications. Face recognition, for example, has seen significantly improved performance due to ConvNets. However, less attention has been given to video-based face recognition. Here, we make three contributions along these lines. First, we proposed a ConvNet-based system for long-term face tracking in videos. By taking advantage of deep learning models pre-trained on big data, we developed a novel system for accurate video face tracking in unconstrained environments depicting various people and objects moving in and out of the frame. In the proposed system, we presented a Detection-Verification-Tracking (DVT) method, which accomplishes the long-term face tracking task through the collaboration of face detection, face verification, and (short-term) face tracking. An online trained detector based on cascaded convolutional neural networks localizes all faces appearing in the frames, and an online trained face verifier based on deep convolutional neural networks and similarity metric learning decides whether any face, and if so which one, corresponds to the query person. An online trained tracker follows the face from frame to frame. When validated on a sitcom episode and a TV show, the DVT method outperformed tracking-learning-detection (TLD) and face-TLD in terms of recall and precision. The proposed system was also tested on many other types of videos and showed very promising results. Second, because the availability of a large-scale training dataset has a significant effect on the performance of ConvNet-based recognition methods, we presented an automatic video collection approach for generating a large-scale video training dataset. We designed a procedure for generating a face verification dataset from videos based on the long-term face tracking algorithm, DVT. In this procedure, face streams are collected from videos and labeled automatically, without human annotation. Using this procedure, we assembled a widely scalable dataset, FaceSequence, which includes 1.5M streams capturing ~500K individuals. A key distinction between this dataset and existing video datasets is that FaceSequence is generated from publicly available videos and labeled automatically, and is hence widely scalable at no annotation cost. Lastly, we introduced a stream-based ConvNet architecture for the video face verification task. The proposed network is designed to optimize a differentiable error function, referred to as stream loss, using unlabeled temporal face sequences. Using the unlabeled video dataset, FaceSequence, we trained our network to minimize the stream loss. The network achieves verification accuracy comparable to the state of the art on the LFW and YTF datasets with much smaller model complexity. In comparison to VGG, our method demonstrates a significant improvement in TAR/FAR, considering that the VGG dataset is highly purified and contains little label noise. We also fine-tuned the network using the IJB-A dataset. The validation results show competitive verification accuracy compared with the best previous video face verification results.
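The abstract reports verification quality as TAR/FAR (true accept rate at a given false accept rate) without spelling out how it is computed. The following is a minimal sketch of one common way to compute it from similarity scores, not the author's evaluation code. The function name `tar_at_far`, the convention that higher scores mean "same person", and the strict `>` acceptance rule are all assumptions made for illustration.

```python
import math


def tar_at_far(genuine_scores, impostor_scores, target_far):
    """Return (tar, far, threshold) for a face verifier.

    Assumptions (illustrative, not from the thesis):
      - scores are similarities, so a higher score means "same person";
      - a pair is accepted when its score is strictly above the threshold.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    # Sort impostor scores from highest to lowest.
    impostor_desc = sorted(impostor_scores, reverse=True)

    # Largest number of impostor pairs we may accept without exceeding target_far.
    allowed = math.floor(target_far * len(impostor_desc))

    if allowed >= len(impostor_desc):
        threshold = -math.inf  # every pair may be accepted
    else:
        # Accepting only scores strictly above this value admits at most
        # `allowed` impostor pairs.
        threshold = impostor_desc[allowed]

    tar = sum(s > threshold for s in genuine_scores) / len(genuine_scores)
    far = sum(s > threshold for s in impostor_scores) / len(impostor_scores)
    return tar, far, threshold


# Toy scores, invented purely to exercise the function.
genuine = [0.9, 0.8, 0.55, 0.3]
impostor = [0.6, 0.5, 0.35, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.0]
tar, far, threshold = tar_at_far(genuine, impostor, target_far=0.15)
print(tar, far, threshold)
```

With the toy scores above, only one impostor pair may be accepted at a 15% FAR budget, so the threshold lands on the second-highest impostor score and three of the four genuine pairs are accepted. Benchmarks may define the operating point slightly differently (for example, by interpolating on the ROC curve), so published numbers should be compared only under the same protocol.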

Digital Commons@Wayne State University

CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks

Authors: Elezi Ismail, Leal-Taixé Laura, Maximov Maxim
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 30/11/2020

The unprecedented increase in the usage of computer vision technology in society goes hand in hand with increased concern about data privacy. In many real-world scenarios, such as people tracking or action recognition, it is important to be able to process the data while taking care to protect people's identities. We propose and develop CIAGAN, a model for image and video anonymization based on conditional generative adversarial networks. Our model is able to remove the identifying characteristics of faces and bodies while producing high-quality images and videos that can be used for any computer vision task, such as detection or tracking. Unlike previous methods, we have full control over the de-identification (anonymization) procedure, ensuring both anonymization and diversity. We compare our method to several baselines and achieve state-of-the-art results. Comment: CVPR 202

arXiv.org e-Print Archive

Crossref