6 research outputs found

    FacialSCDnet: A deep learning approach for the estimation of subject-to-camera distance in facial photographs

    Facial biometrics play an essential role in the fields of law enforcement and forensic sciences. When comparing facial traits for human identification in photographs or videos, the analysis must account for several factors that impair the application of common identification techniques, such as illumination, pose, or expression. In particular, facial attributes can drastically change depending on the distance between the subject and the camera at the time of the picture. This effect is known as perspective distortion, which can severely affect the outcome of the comparative analysis. Hence, knowing the subject-to-camera distance of the original scene where the photograph was taken can help determine the degree of distortion, improve the accuracy of computer-aided recognition tools, and increase the reliability of human identification and further analyses. In this paper, we propose a deep learning approach to estimate the subject-to-camera distance of facial photographs: FacialSCDnet. Furthermore, we introduce a novel evaluation metric designed to guide the learning process, based on changes in facial distortion at different distances. To validate our proposal, we collected a novel dataset of facial photographs taken at several distances using both synthetic and real data. Our approach is fully automatic and can provide a numerical distance estimation for up to six meters, beyond which changes in facial distortion are not significant. The proposed method achieves an accurate estimation, with an average error below 6 cm of subject-to-camera distance for facial photographs in any frontal or lateral head pose, robust to facial hair, glasses, and partial occlusion.
    Departamento de Ciencias de la Computación y Sistemas Inteligente
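
    The perspective distortion described in the abstract above can be illustrated with an idealized pinhole camera. This is a minimal sketch, not the FacialSCDnet method or its evaluation metric: the function name and the 10 cm depth offset between a near feature (such as the nose tip) and a far feature (such as the ears) are assumptions chosen for illustration.

```python
def relative_magnification(distance_m, depth_offset_m=0.10):
    """Magnification of a near facial feature relative to a farther one.

    Under a pinhole model, image size scales as focal_length / depth, so a
    feature at depth d is magnified by (d + offset) / d relative to a feature
    that lies `depth_offset_m` farther from the camera. The focal length
    cancels out of the ratio.
    depth_offset_m = 0.10 is an assumed, illustrative value.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    if depth_offset_m < 0:
        raise ValueError("depth_offset_m must be non-negative")
    return (distance_m + depth_offset_m) / distance_m


if __name__ == "__main__":
    # The ratio approaches 1 (no distortion) as the camera moves away.
    for d in (0.3, 0.5, 1.0, 3.0, 6.0):
        print(f"{d:4.1f} m -> near/far magnification {relative_magnification(d):.3f}")
```

    Under these assumptions the near feature appears 20% larger than the far one at 0.5 m but less than 2% larger at 6 m, which is consistent with the abstract's remark that distortion changes become insignificant beyond six meters.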


    Real-time face view correction for front-facing cameras

    Face view is particularly important in person-to-person communication. Disparity between the camera location and the face orientation can result in undesirable facial appearances of the participants during video conferencing. This phenomenon becomes particularly notable on devices where the front-facing camera is placed at unconventional locations such as below the display or within the keyboard. In this paper, we present a system that takes the video stream from a single RGB camera as input and generates a video stream that emulates the view from a virtual camera at a designated location. The most challenging issue of this problem is that the corrected view often needs out-of-plane head rotations. To address this challenge, we reconstruct the 3D face shape and re-render it into synthesized frames according to the virtual camera location. To output the corrected video stream with natural appearance in real time, we propose several novel techniques including accurate eyebrow reconstruction, high-quality blending between the corrected face image and the background, and a template-based 3D reconstruction of glasses. Our system works well for different lighting conditions and skin tones, and is able to handle users wearing glasses. Extensive experiments and user studies demonstrate that our proposed method can achieve high-quality results.
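
    The idea of re-rendering a reconstructed 3D face from a virtual camera can be sketched with a plain pinhole projection. This is a simplified illustration, not the paper's pipeline: the `project` helper, the three face points, and both camera positions are hypothetical, and the cameras differ only by translation (no rotation, blending, or texture).

```python
def project(point, camera_pos, focal=1.0):
    """Pinhole projection of a 3D point for a camera at `camera_pos`.

    The camera is assumed to look along +z with no rotation; coordinates
    are in metres, with y pointing up.
    """
    x, y, z = (p - c for p, c in zip(point, camera_pos))
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal * x / z, focal * y / z)


# Hypothetical reconstructed face points: chin, nose tip, forehead.
face = [(0.0, -0.08, 0.50), (0.0, 0.0, 0.45), (0.0, 0.08, 0.50)]

physical_cam = (0.0, -0.15, 0.0)  # assumed: camera 15 cm below eye level
virtual_cam = (0.0, 0.0, 0.0)     # assumed: designated eye-level location

seen = [project(p, physical_cam) for p in face]      # view from below
corrected = [project(p, virtual_cam) for p in face]  # emulated frontal view

if __name__ == "__main__":
    for name, a, b in zip(("chin", "nose", "forehead"), seen, corrected):
        print(f"{name:8s} physical y={a[1]:+.3f}  virtual y={b[1]:+.3f}")
```

    In this toy setup the chin and forehead project symmetrically about the nose only from the virtual camera; from the low physical camera the face appears shifted and vertically skewed, which is the kind of disparity the abstract describes.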

    FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold

    Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images. Embedding real images into the latent space of such models enables high-level image editing. While recent methods provide considerable semantic control over the (re-)generated images, they can only generate a limited set of viewpoints and cannot explicitly control the camera. Such 3D camera control is required for 3D virtual and mixed reality applications. In our solution, we use a few images of a face to perform 3D reconstruction, and we introduce the notion of the GAN camera manifold, the key element allowing us to precisely define the range of images that the GAN can reproduce in a stable manner. We train a small face-specific neural implicit representation network to map a captured face to this manifold and complement it with a warping scheme to obtain free-viewpoint novel-view synthesis. We show how our approach, due to its precise camera control, enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines, allowing e.g., stereo rendering or consistent insertion of faces in synthetic 3D environments. Our solution proposes the first truly free-viewpoint rendering of realistic faces at interactive rates, using only a small number of casual photos as input, while simultaneously allowing semantic editing capabilities, such as facial expression or lighting changes.