3,302 research outputs found

    Face Alignment Robust to Pose, Expressions and Occlusions

    Full text link
    We propose an Ensemble of Robust Constrained Local Models for alignment of faces in the presence of significant occlusions and of any unknown pose and expression. To account for partial occlusions we introduce, Robust Constrained Local Models, that comprises of a deformable shape and local landmark appearance model and reasons over binary occlusion labels. Our occlusion reasoning proceeds by a hypothesize-and-test search over occlusion labels. Hypotheses are generated by Constrained Local Model based shape fitting over randomly sampled subsets of landmark detector responses and are evaluated by the quality of face alignment. To span the entire range of facial pose and expression variations we adopt an ensemble of independent Robust Constrained Local Models to search over a discretized representation of pose and expression. We perform extensive evaluation on a large number of face images, both occluded and unoccluded. We find that our face alignment system trained entirely on facial images captured "in-the-lab" exhibits a high degree of generalization to facial images captured "in-the-wild". Our results are accurate and stable over a wide spectrum of occlusions, pose and expression variations resulting in excellent performance on many real-world face datasets

    Robust Facial Landmark Localization Based on Texture and Pose Correlated Initialization

    Full text link
    Robust facial landmark localization remains a challenging task when faces are partially occluded. Recently, the cascaded pose regression has attracted increasing attentions, due to it's superior performance in facial landmark localization and occlusion detection. However, such an approach is sensitive to initialization, where an improper initialization can severly degrade the performance. In this paper, we propose a Robust Initialization for Cascaded Pose Regression (RICPR) by providing texture and pose correlated initial shapes for the testing face. By examining the correlation of local binary patterns histograms between the testing face and the training faces, the shapes of the training faces that are most correlated with the testing face are selected as the texture correlated initialization. To make the initialization more robust to various poses, we estimate the rough pose of the testing face according to five fiducial landmarks located by multitask cascaded convolutional networks. Then the pose correlated initial shapes are constructed by the mean face's shape and the rough testing face pose. Finally, the texture correlated and the pose correlated initial shapes are joined together as the robust initialization. We evaluate RICPR on the challenging dataset of COFW. The experimental results demonstrate that the proposed scheme achieves better performances than the state-of-the-art methods in facial landmark localization and occlusion detection

    Deep Multi-Center Learning for Face Alignment

    Full text link
    Facial landmarks are highly correlated with each other since a certain landmark can be estimated by its neighboring landmarks. Most of the existing deep learning methods only use one fully-connected layer called shape prediction layer to estimate the locations of facial landmarks. In this paper, we propose a novel deep learning framework named Multi-Center Learning with multiple shape prediction layers for face alignment. In particular, each shape prediction layer emphasizes on the detection of a certain cluster of semantically relevant landmarks respectively. Challenging landmarks are focused firstly, and each cluster of landmarks is further optimized respectively. Moreover, to reduce the model complexity, we propose a model assembling method to integrate multiple shape prediction layers into one shape prediction layer. Extensive experiments demonstrate that our method is effective for handling complex occlusions and appearance variations with real-time performance. The code for our method is available at https://github.com/ZhiwenShao/MCNet-Extension.Comment: This paper has been accepted by Neurocomputin

    A Detailed Look At CNN-based Approaches In Facial Landmark Detection

    Full text link
    Facial landmark detection has been studied over decades. Numerous neural network (NN)-based approaches have been proposed for detecting landmarks, especially the convolutional neural network (CNN)-based approaches. In general, CNN-based approaches can be divided into regression and heatmap approaches. However, no research systematically studies the characteristics of different approaches. In this paper, we investigate both CNN-based approaches, generalize their advantages and disadvantages, and introduce a variation of the heatmap approach, a pixel-wise classification (PWC) model. To the best of our knowledge, using the PWC model to detect facial landmarks have not been comprehensively studied. We further design a hybrid loss function and a discrimination network for strengthening the landmarks' interrelationship implied in the PWC model to improve the detection accuracy without modifying the original model architecture. Six common facial landmark datasets, AFW, Helen, LFPW, 300-W, IBUG, and COFW are adopted to train or evaluate our model. A comprehensive evaluation is conducted and the result shows that the proposed model outperforms other models in all tested datasets

    Deep Regression for Face Alignment

    Full text link
    In this paper, we present a deep regression approach for face alignment. The deep architecture consists of a global layer and multi-stage local layers. We apply the back-propagation algorithm with the dropout strategy to jointly optimize the regression parameters. We show that the resulting deep regressor gradually and evenly approaches the true facial landmarks stage by stage, avoiding the tendency to yield over-strong early stage regressors while over-weak later stage regressors. Experimental results show that our approach achieves the state-of-the-ar

    A Convolution Tree with Deconvolution Branches: Exploiting Geometric Relationships for Single Shot Keypoint Detection

    Full text link
    Recently, Deep Convolution Networks (DCNNs) have been applied to the task of face alignment and have shown potential for learning improved feature representations. Although deeper layers can capture abstract concepts like pose, it is difficult to capture the geometric relationships among the keypoints in DCNNs. In this paper, we propose a novel convolution-deconvolution network for facial keypoint detection. Our model predicts the 2D locations of the keypoints and their individual visibility along with 3D head pose, while exploiting the spatial relationships among different keypoints. Different from existing approaches of modeling these relationships, we propose learnable transform functions which captures the relationships between keypoints at feature level. However, due to extensive variations in pose, not all of these relationships act at once, and hence we propose, a pose-based routing function which implicitly models the active relationships. Both transform functions and the routing function are implemented through convolutions in a multi-task framework. Our approach presents a single-shot keypoint detection method, making it different from many existing cascade regression-based methods. We also show that learning these relationships significantly improve the accuracy of keypoint detections for in-the-wild face images from challenging datasets such as AFW and AFLW

    Automatic Facial Expression Recognition Using Features of Salient Facial Patches

    Full text link
    Extraction of discriminative features from salient facial patches plays a vital role in effective facial expression recognition. The accurate detection of facial landmarks improves the localization of the salient patches on face images. This paper proposes a novel framework for expression recognition by using appearance features of selected facial patches. A few prominent facial patches, depending on the position of facial landmarks, are extracted which are active during emotion elicitation. These active patches are further processed to obtain the salient patches which contain discriminative features for classification of each pair of expressions, thereby selecting different facial patches as salient for different pair of expression classes. One-against-one classification method is adopted using these features. In addition, an automated learning-free facial landmark detection technique has been proposed, which achieves similar performances as that of other state-of-art landmark detection methods, yet requires significantly less execution time. The proposed method is found to perform well consistently in different resolutions, hence, providing a solution for expression recognition in low resolution images. Experiments on CK+ and JAFFE facial expression databases show the effectiveness of the proposed system

    Facial Landmark Detection: a Literature Survey

    Full text link
    The locations of the fiducial facial landmark points around facial components and facial contour capture the rigid and non-rigid facial deformations due to head movements and facial expressions. They are hence important for various facial analysis tasks. Many facial landmark detection algorithms have been developed to automatically detect those key points over the years, and in this paper, we perform an extensive review of them. We classify the facial landmark detection algorithms into three major categories: holistic methods, Constrained Local Model (CLM) methods, and the regression-based methods. They differ in the ways to utilize the facial appearance and shape information. The holistic methods explicitly build models to represent the global facial appearance and shape information. The CLMs explicitly leverage the global shape model but build the local appearance models. The regression-based methods implicitly capture facial shape and appearance information. For algorithms within each category, we discuss their underlying theories as well as their differences. We also compare their performances on both controlled and in the wild benchmark datasets, under varying facial expressions, head poses, and occlusion. Based on the evaluations, we point out their respective strengths and weaknesses. There is also a separate section to review the latest deep learning-based algorithms. The survey also includes a listing of the benchmark databases and existing software. Finally, we identify future research directions, including combining methods in different categories to leverage their respective strengths to solve landmark detection "in-the-wild"

    Learning Deep Representation for Face Alignment with Auxiliary Attributes

    Full text link
    In this study, we show that landmark detection or face alignment task is not a single and independent problem. Instead, its robustness can be greatly improved with auxiliary information. Specifically, we jointly optimize landmark detection together with the recognition of heterogeneous but subtly correlated facial attributes, such as gender, expression, and appearance attributes. This is non-trivial since different attribute inference tasks have different learning difficulties and convergence rates. To address this problem, we formulate a novel tasks-constrained deep model, which not only learns the inter-task correlation but also employs dynamic task coefficients to facilitate the optimization convergence when learning multiple complex tasks. Extensive evaluations show that the proposed task-constrained learning (i) outperforms existing face alignment methods, especially in dealing with faces with severe occlusion and pose variation, and (ii) reduces model complexity drastically compared to the state-of-the-art methods based on cascaded deep model.Comment: to be published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI

    DeCaFA: Deep Convolutional Cascade for Face Alignment In The Wild

    Full text link
    Face Alignment is an active computer vision domain, that consists in localizing a number of facial landmarks that vary across datasets. State-of-the-art face alignment methods either consist in end-to-end regression, or in refining the shape in a cascaded manner, starting from an initial guess. In this paper, we introduce DeCaFA, an end-to-end deep convolutional cascade architecture for face alignment. DeCaFA uses fully-convolutional stages to keep full spatial resolution throughout the cascade. Between each cascade stage, DeCaFA uses multiple chained transfer layers with spatial softmax to produce landmark-wise attention maps for each of several landmark alignment tasks. Weighted intermediate supervision, as well as efficient feature fusion between the stages allow to learn to progressively refine the attention maps in an end-to-end manner. We show experimentally that DeCaFA significantly outperforms existing approaches on 300W, CelebA and WFLW databases. In addition, we show that DeCaFA can learn fine alignment with reasonable accuracy from very few images using coarsely annotated data
    • …
    corecore