
    Deep Learning Face Attributes in the Wild

    Predicting face attributes in the wild is challenging due to complex face variations. We propose a novel deep learning framework for attribute prediction in the wild. It cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags but pre-trained differently. LNet is pre-trained on massive general object categories for face localization, while ANet is pre-trained on massive face identities for attribute prediction. This framework not only outperforms the state of the art by a large margin, but also reveals valuable facts about learning face representations. (1) It shows how the performance of face localization (LNet) and attribute prediction (ANet) can be improved by different pre-training strategies. (2) It reveals that although the filters of LNet are fine-tuned only with image-level attribute tags, their response maps over entire images strongly indicate face locations. This fact enables training LNet for face localization with only image-level annotations, without the face bounding boxes or landmarks that are required by all attribute recognition works. (3) It also demonstrates that the high-level hidden neurons of ANet automatically discover semantic concepts after pre-training with massive face identities, and that such concepts are significantly enriched after fine-tuning with attribute tags. Each attribute can be well explained by a sparse linear combination of these concepts. Comment: To appear in International Conference on Computer Vision (ICCV) 201

    Multi-view Face Detection Using Deep Convolutional Neural Networks

    In this paper we consider the problem of multi-view face detection. While there has been significant research on this problem, current state-of-the-art approaches for this task require annotation of facial landmarks, e.g. TSM [25], or annotation of face poses [28, 22]. They also require training dozens of models to fully capture faces in all orientations, e.g. 22 models in the HeadHunter method [22]. In this paper we propose Deep Dense Face Detector (DDFD), a method that does not require pose/landmark annotation and is able to detect faces in a wide range of orientations using a single model based on deep convolutional neural networks. The proposed method has minimal complexity; unlike other recent deep learning object detection methods [9], it does not require additional components such as segmentation, bounding-box regression, or SVM classifiers. Furthermore, we analyzed the scores of the proposed face detector for faces in different orientations and found that 1) the proposed method is able to detect faces from different angles and can handle occlusion to some extent, and 2) there seems to be a correlation between the distribution of positive examples in the training set and the scores of the proposed face detector. The latter suggests that the proposed method's performance can be further improved by using better sampling strategies and more sophisticated data augmentation techniques. Evaluations on popular face detection benchmark datasets show that our single-model face detector has similar or better performance compared to previous methods, which are more complex and require annotations of either different poses or facial landmarks. Comment: in International Conference on Multimedia Retrieval 2015 (ICMR

    Face Alignment Assisted by Head Pose Estimation

    In this paper we propose a supervised initialization scheme for cascaded face alignment based on explicit head pose estimation. We first investigate the failure cases of most state-of-the-art face alignment approaches and observe that these failures often share one common global property, i.e. the head pose variation is usually large. Inspired by this, we propose a deep convolutional network model for reliable and accurate head pose estimation. Instead of using a mean face shape or randomly selected shapes for cascaded face alignment initialisation, we propose two schemes for generating the initialisation: the first relies on projecting a mean 3D face shape (represented by 3D facial landmarks) onto the 2D image under the estimated head pose; the second searches for nearest-neighbour shapes in the training set according to head pose distance. By doing so, the initialisation gets closer to the actual shape, which increases the likelihood of convergence and in turn improves face alignment performance. We demonstrate the proposed method on the benchmark 300W dataset and show very competitive performance in both head pose estimation and face alignment. Comment: Accepted by BMVC201
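
    The two initialisation schemes in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the Euler-angle order, the weak-perspective (orthographic) projection, the `scale` and `shift` parameters, the Euclidean pose distance, and all function names are assumptions chosen for illustration.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(yaw, pitch, roll):
    """Rotation from Euler angles in radians.

    Assumed convention: R = Rz(roll) * Rx(pitch) * Ry(yaw).
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rx = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    rz = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(rx, ry))


def project_mean_shape(mean_shape_3d, yaw, pitch, roll,
                       scale=1.0, shift=(0.0, 0.0)):
    """Scheme 1: rotate the mean 3D landmarks by the estimated pose,
    then drop depth (orthographic projection, an assumption)."""
    r = rotation_matrix(yaw, pitch, roll)
    points = []
    for x, y, z in mean_shape_3d:
        u = r[0][0] * x + r[0][1] * y + r[0][2] * z
        v = r[1][0] * x + r[1][1] * y + r[1][2] * z
        points.append((scale * u + shift[0], scale * v + shift[1]))
    return points


def nearest_pose_shape(train_poses, train_shapes, pose):
    """Scheme 2: return the training shape whose head pose is closest
    to the estimated pose (Euclidean distance, an assumption)."""
    best = min(range(len(train_poses)),
               key=lambda i: math.dist(train_poses[i], pose))
    return train_shapes[best]
```

    Either function returns a 2D landmark set that could seed a cascaded aligner in place of the plain mean shape; how the result is scaled to the detected face box is left out here.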

    Studies on Imaging System and Machine Learning: 3D Halftoning and Human Facial Landmark Localization

    In this dissertation, studies on digital halftoning and human facial landmark localization will be discussed. 3D printing is becoming increasingly popular around the world today. By utilizing 3D printing technology, customized products can be manufactured much more quickly and efficiently at much lower cost. However, 3D printing still suffers from low-quality surface reproduction compared with 2D printing. One approach to improving it is to develop an advanced halftoning algorithm for 3D printing. In this presentation, we will describe a novel approach to 3D halftoning that can work with 3D printing technology to generate a high-quality surface reproduction. In the second part of this report, a new method named direct element swap is proposed for creating a threshold matrix for halftoning. This method directly swaps the elements in a threshold matrix to find the best element arrangement by minimizing a designated perceived error metric. Experimental results show that the new method yields halftone quality that is competitive with the conventional level-by-level matrix design method. In addition, by using the direct element swap method, a threshold matrix can for the first time be designed by training on real images. In the second part of the dissertation, a novel facial landmark detection system is presented. Facial landmark detection plays a critical role in many face analysis tasks. However, it remains a very challenging problem. The challenges come from the large variations in face appearance caused by different illuminations, different facial expressions, different yaw, pitch and roll angles of heads, and different image qualities. To tackle this problem, a novel coarse-to-fine cascaded convolutional neural network system for robust facial landmark detection of faces in the wild is presented. The experimental results show that our method outperforms other state-of-the-art methods on public test datasets. In addition, a frontal and profile landmark localization system is proposed and designed. By using a frontal/profile face classifier, either the frontal landmark configuration or the profile landmark configuration is employed in the facial landmark prediction, based on the input face yaw angle.
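
    The direct element swap idea from the abstract above (swap two threshold-matrix elements, keep the swap if a perceived error metric drops) can be sketched as follows. This is a toy illustration under stated assumptions, not the dissertation's algorithm: the abstract does not specify the error metric or the search order, so the 3x3 wrap-around box blur standing in for a visual model, the random greedy search, and all names here are hypothetical.

```python
import random


def halftone(matrix, gray):
    """Threshold a constant gray patch (0..1) against the matrix,
    whose entries are assumed to be the integers 0..n*n-1."""
    n = len(matrix)
    levels = n * n
    return [[1.0 if gray * levels > matrix[r][c] else 0.0
             for c in range(n)] for r in range(n)]


def perceived_error(matrix, grays):
    """Stand-in metric (an assumption): squared difference between a
    3x3 wrap-around box blur of the halftone and the target gray."""
    n = len(matrix)
    total = 0.0
    for g in grays:
        dots = halftone(matrix, g)
        for r in range(n):
            for c in range(n):
                blur = sum(dots[(r + dr) % n][(c + dc) % n]
                           for dr in (-1, 0, 1)
                           for dc in (-1, 0, 1)) / 9.0
                total += (blur - g) ** 2
    return total


def direct_element_swap(matrix, grays, iterations=500, seed=0):
    """Greedy random search: swap two elements, keep the swap only
    if the error metric decreases, otherwise undo it."""
    rng = random.Random(seed)
    n = len(matrix)
    best = [row[:] for row in matrix]
    best_err = perceived_error(best, grays)
    cells = [(r, c) for r in range(n) for c in range(n)]
    for _ in range(iterations):
        (r1, c1), (r2, c2) = rng.sample(cells, 2)
        best[r1][c1], best[r2][c2] = best[r2][c2], best[r1][c1]
        err = perceived_error(best, grays)
        if err < best_err:
            best_err = err
        else:
            # Undo the swap because it did not reduce the error.
            best[r1][c1], best[r2][c2] = best[r2][c2], best[r1][c1]
    return best, best_err
```

    Starting from a row-major 4x4 matrix, a call such as `direct_element_swap(start, [0.25, 0.5, 0.75])` returns a rearranged matrix whose error under this toy metric is no higher than the starting one. Training on real images, as the abstract mentions, would replace the constant gray patches with image data.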

    Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition

    Two approaches are proposed for cross-pose face recognition: one is based on the 3D reconstruction of facial components, and the other is based on a deep Convolutional Neural Network (CNN). Unlike most 3D approaches, which consider holistic faces, the first approach considers 3D facial components. It segments a 2D gallery face into components, reconstructs the 3D surface for each component, and recognizes a probe face by component features. The segmentation is based on the landmarks located by a hierarchical algorithm that combines Faster R-CNN for face detection and the Reduced Tree Structured Model for landmark localization. The core part of the CNN-based approach is a revised VGG network. We study its performance with different settings of the training set, including synthesized data from 3D reconstruction, real-life data from an in-the-wild database, and both types of data combined. We investigate the performance of the network when it is employed as a classifier and when it is designed as a feature extractor. The two recognition approaches and the fast landmark localization are evaluated in extensive experiments and compared to state-of-the-art methods to demonstrate their efficacy. Comment: 14 pages, 12 figures, 4 table