
    On Shape-Mediated Enrolment in Ear Biometrics

    Ears are a new biometric with the major advantage that they appear to maintain their shape with increasing age. Any automatic biometric system needs enrolment to extract the target area from the background. In ear biometrics, the inputs are often human head profile images. Furthermore, ear biometrics must contend with partial occlusion, mostly caused by hair and earrings. We propose an ear enrolment algorithm that finds the elliptical shape of the ear using a Hough Transform (HT), which confers tolerance to noise and occlusion. Robustness is improved further by enforcing some prior knowledge. We assess our enrolment on two face profile datasets, as well as with synthetic occlusion.

    End-to-end Lip-reading: A Preliminary Study

    Deep lip-reading combines the domains of computer vision and natural language processing. It uses deep neural networks to extract speech from silent videos. Most works in lip-reading use a multi-stage training approach due to the complex nature of the task. A single-stage, end-to-end, unified training approach, which is an ideal of machine learning, is also the goal in lip-reading. However, pure end-to-end systems have not yet been able to perform as well as non-end-to-end systems. Some exceptions to this are the very recent Temporal Convolutional Network (TCN) based architectures. This work lays out a preliminary study of deep lip-reading, with a special focus on various end-to-end approaches. The research aims to test whether a purely end-to-end approach is justifiable for a task as complex as deep lip-reading. To achieve this, the meaning of pure end-to-end is first defined, and several lip-reading systems that follow the definition are analysed. The system that most closely matches the definition is then adapted for pure end-to-end experiments. Four main contributions have been made: i) an analysis of 9 different end-to-end deep lip-reading systems; ii) creation and public release of a pipeline to adapt the sentence-level Lipreading Sentences in the Wild 3 (LRS3) dataset into a word-level dataset; iii) pure end-to-end training of a TCN-based network and evaluation on the LRS3 word-level dataset as a proof of concept; iv) a public online portal to analyse visemes and experiment with live end-to-end lip-reading inference. The study verifies that pure end-to-end is a sensible approach and an achievable goal for deep machine lip-reading.

    Efficient 3D Face Recognition with Gabor Patched Spectral Regression

    In this paper, we utilize a novel framework for 3D face recognition, called 3D Gabor Patched Spectral Regression (3D GPSR), which can overcome some of the continuing challenges encountered with 2D or 3D facial images. In this active field, obstacles such as expression variations, pose correction and data noise degrade performance significantly. Our proposed system addresses these problems by first extracting the main facial area to remove irrelevant information corresponding to shoulders and necks. Pose correction is used to minimize the influence of large pose variations, after which the normalized depth and gray images can be obtained. Owing to its better time-frequency characteristics and distinctive biological background, the Gabor feature is extracted on depth images, yielding what are known as 3D Gabor faces. Data noise is mainly caused by distorted meshes, varieties of subordinates and misalignment. To solve these problems, we introduce a Patched Spectral Regression strategy, which makes good use of the robustness and efficiency of accurate patched discriminant low-dimension features and minimizes the effect of the noise term. Computational analysis shows that spectral regression is much faster than the traditional approaches. Our experiments are based on the CASIA and FRGC 3D face databases, which contain a large amount of challenging data. Experimental results show that our framework consistently outperforms the other existing methods, with the distinctive characteristics of efficiency, robustness and generality.

    Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

    To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body models (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground-truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de. Comment: To appear in CVPR 201

    Accurate segmentation and registration of skin lesion images to evaluate lesion change

    Skin cancer is a major health problem. There are several techniques to help diagnose skin lesions from a captured image. Computer-aided diagnosis (CAD) systems operate on single images of skin lesions, extracting lesion features to further classify them and help the specialists. Accurate feature extraction, which in turn depends on precise lesion segmentation, is key to the performance of these systems. In this paper, we present a skin lesion segmentation algorithm based on a novel adaptation of superpixel techniques and achieve the best reported results for the ISIC 2017 challenge dataset. Additionally, CAD systems have paid little attention to a critical criterion in skin lesion diagnosis: the lesion's evolution. Assessing it requires operating on two or more images of the same lesion, captured at different times but with a comparable scale, orientation, and point of view; in other words, an image registration process should first be performed. We also propose an image registration approach that outperforms top image registration techniques. Combined with the proposed lesion segmentation algorithm, this allows for the accurate extraction of features to assess the evolution of the lesion. We present a case study with the lesion-size feature, paving the way for the development of automatic systems to easily evaluate skin lesion evolution. This work was supported in part by the Spanish Government (HAVideo, TEC2014-53176-R) and in part by the TEC department (Universidad Autonoma de Madrid