
    Face Detection in Complex Natural Scenes

    Face detection is an important preliminary process for all other tasks with faces, such as expression analysis and person identification. It is also known to be rapid and automatic, which indicates that detection might utilise low-level visual information. It has been suggested that this information consists of a ‘skin-coloured, face-shaped template’, while internal facial features, such as the eyes, nose and mouth, might also help to optimise performance. To explore these ideas directly, this thesis first examined how shape and features are integrated into a detection template (Chapter 2). For this purpose, face content was isolated into three ranges of spatial frequency, comprising low (LSF), mid (MSF) and high (HSF) frequencies. Detection performance in these conditions was always compared with an original condition, which displayed unfiltered images in the full range of spatial frequency. Across five behavioural and eye-tracking experiments, detection was best for the original condition, followed by MSF, LSF and HSF faces. LSF faces, which provide only crude visual detail (i.e. gross colour and shape), were detected as quickly as MSF faces but less accurately. In addition, LSF faces showed a clear advantage over HSF faces, which contain fine visual information (i.e. the detailed lines of the eyes, nose and mouth), in terms of detection speed and accuracy. These findings indicate that face detection is driven by simple information, such as the saliency of colour and shape, which supports the notion of a skin-coloured face-shape template. However, the faster and more accurate performance for faces in the full and mid spatial frequencies also indicates that facial features contribute to optimising detection. In Chapter 3, three further eye-tracking experiments are reported, which explore whether the height-to-width ratio of a coloured-shape template might also be important for detection.
Performance was best when faces’ natural height-to-width ratios were preserved, compared to vertically and horizontally stretched faces. This indicates that the natural height-to-width ratio is an important element of the cognitive template for face detection. The results also highlight that face detection differs from face recognition, which tolerates the same type of geometric disruption. Based on the results of Chapters 2 and 3, a model of face detection is proposed in Chapter 4. In this model, colour face-shape and features drive detection in parallel, but not necessarily at equal speed, in a “horse race”. Accordingly, rapid detection is normally driven by salient colour and shape cues that preserve the height-to-width ratio of faces, but finer visual detail from features can facilitate this process when further information is needed.
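
The spatial-frequency manipulation described in this abstract — isolating LSF, MSF and HSF content from an image — can be sketched with an ideal band-pass filter in the Fourier domain. This is a minimal illustrative sketch; the cutoff values shown are assumptions for demonstration, not the thesis's actual filter bands.

```python
import numpy as np

def sf_band(img, lo_cyc, hi_cyc):
    """Keep only spatial frequencies between lo_cyc and hi_cyc
    (in cycles per image) using an ideal band-pass mask in the
    Fourier domain. img is a 2-D greyscale array."""
    h, w = img.shape
    fy = np.fft.fftfreq(h) * h           # vertical frequency, cycles/image
    fx = np.fft.fftfreq(w) * w           # horizontal frequency, cycles/image
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    mask = (radius >= lo_cyc) & (radius <= hi_cyc)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mask))

# Illustrative band cutoffs (hypothetical, not the thesis's values):
# LSF: 0–8 cycles/image, MSF: 8–32, HSF: 32 and above.
```

Because the mask simply partitions the frequency plane, passing the full frequency range returns the original image unchanged, which corresponds to the unfiltered "original" condition in the experiments.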

    The computer nose best


    Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz

    The reconstruction of dense 3D models of face geometry and appearance from a single image is highly challenging and ill-posed. To constrain the problem, many approaches rely on strong priors, such as parametric face models learned from limited 3D scan data. However, prior models restrict generalization of the true diversity in facial geometry, skin reflectance and illumination. To alleviate this problem, we present the first approach that jointly learns 1) a regressor for face shape, expression, reflectance and illumination on the basis of 2) a concurrently learned parametric face model. Our multi-level face model combines the advantage of 3D Morphable Models for regularization with the out-of-space generalization of a learned corrective space. We train end-to-end on in-the-wild images without dense annotations by fusing a convolutional encoder with a differentiable expert-designed renderer and a self-supervised training loss, both defined at multiple detail levels. Our approach compares favorably to the state-of-the-art in terms of reconstruction quality, better generalizes to real world faces, and runs at over 250 Hz.
    Comment: CVPR 2018 (Oral). Project webpage: https://gvv.mpi-inf.mpg.de/projects/FML
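
The multi-level structure the abstract describes — a coarse parametric (3DMM-style) layer refined by a learned corrective space, supervised by a photometric loss against the input image — can be sketched roughly as follows. All names, dimensions and the loss are illustrative assumptions for exposition; the paper's actual model, renderer and training pipeline are far more involved.

```python
import numpy as np

# Hypothetical toy dimensions, not the paper's parameterisation.
N_VERTS = 100        # number of mesh vertices
N_ID, N_EXP = 8, 4   # identity / expression basis sizes

rng = np.random.default_rng(0)
mean_shape = rng.normal(size=(N_VERTS, 3))           # template mesh
basis_id = rng.normal(size=(N_VERTS * 3, N_ID))      # identity basis
basis_exp = rng.normal(size=(N_VERTS * 3, N_EXP))    # expression basis

def reconstruct(alpha, beta, corrective):
    """Coarse 3DMM-style reconstruction (mean + linear bases),
    refined by a per-vertex learned corrective offset."""
    coarse = mean_shape.reshape(-1) + basis_id @ alpha + basis_exp @ beta
    return coarse.reshape(N_VERTS, 3) + corrective   # fine-level correction

def photometric_loss(rendered, observed):
    """Self-supervised signal: compare rendered colours against the
    input image itself, so no dense annotations are required."""
    return float(np.mean((rendered - observed) ** 2))
```

In the paper this loop is driven by a convolutional encoder predicting the coefficients and a differentiable renderer producing `rendered`; here both are omitted so the sketch stays self-contained.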