
    Markerless facial motion capture: deep learning approaches on RGBD data

    Facial expressions are a series of fast, complex and interconnected movements that cause an array of skin deformations, such as stretching, compressing and folding. Identifying expressions is a natural process in human vision, but the diversity of faces makes it challenging for computer vision. Research in markerless facial motion capture using a single Red Green Blue (RGB) camera has gained popularity due to the wide availability of such data, for example from mobile phones. The motivation behind this work is that much of the existing research attempts to infer 3-Dimensional (3D) data from 2-Dimensional (2D) images; in motion capture, for example, multiple 2D cameras are calibrated to allow some depth prediction. In contrast, Red Green Blue Depth (RGBD) sensors provide ground-truth depth data, which could give a better understanding of the human face and how expressions are visualised. The aim of this thesis is to investigate and develop novel methods of markerless facial motion capture, with a focus on the inclusion of RGBD data to provide 3D information. The contributions are: a tool to aid in the annotation of 3D facial landmarks; a novel neural network that demonstrates the ability to predict 2D and 3D landmarks by merging RGBD data; a working application that demonstrates a complex deep learning network on portable handheld devices; a review of existing methods of denoising fine detail in depth maps using neural networks; and a network for the complete analysis of facial landmarks and expressions in 3D. The 3D annotator was developed to overcome the difficulty of identifying features in existing 3D modelling software. The technique of predicting 2D and 3D landmarks with auxiliary information allowed high-accuracy 3D landmarking without the need for full model generation, and outperformed other recent landmarking techniques. The networks running on handheld devices serve as a proof of concept that, even without much optimisation, a complex task can be performed in near real-time. Denoising Time of Flight (ToF) depth maps proved far more complex than traditional RGB denoising, and we reviewed and applied an array of techniques to the task. The full facial analysis showed that training neural networks on a wide range of related tasks provides auxiliary information that allows a deeper understanding of the overall task. Research in facial processing is vast, but many new problems and challenges remain. While RGB cameras are widely used, high-accuracy, cost-effective depth-sensing devices are now becoming available. These new devices allow a better understanding of facial features and expressions. By merging RGB and depth data, facial landmarking and expression intensity recognition can be improved.
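
    As a concrete illustration of the RGBD-merging idea described above, the following is a minimal PyTorch sketch of a network with separate RGB and depth encoders whose features are fused to jointly regress 2D and 3D landmarks. This is not the thesis architecture: the layer sizes, the 68-landmark count and the input resolution are all assumptions.

```python
# Minimal sketch: fuse RGB and depth features to predict 2D and 3D landmarks.
import torch
import torch.nn as nn

class RGBDLandmarkNet(nn.Module):
    def __init__(self, n_landmarks=68):
        super().__init__()
        # Separate shallow encoders for the RGB and depth modalities.
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.depth_enc = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Auxiliary 2D head alongside the 3D head, so both tasks share features.
        self.head_2d = nn.Linear(128, n_landmarks * 2)  # (x, y) per landmark
        self.head_3d = nn.Linear(128, n_landmarks * 3)  # (x, y, z) per landmark

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        fused = self.pool(fused).flatten(1)
        return self.head_2d(fused), self.head_3d(fused)

net = RGBDLandmarkNet()
pts2d, pts3d = net(torch.randn(1, 3, 128, 128), torch.randn(1, 1, 128, 128))
print(pts2d.shape, pts3d.shape)  # torch.Size([1, 136]) torch.Size([1, 204])
```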

    Dermoscopic Dark Corner Artifacts Removal: Friend or Foe?

    One of the more significant obstacles in the classification of skin cancer is the presence of artifacts. This paper investigates the effect of dark corner artifacts, which result from the use of dermoscopes, on the performance of a deep learning binary classification task. Previous research attempted to remove and inpaint dark corner artifacts with the intention of creating ideal conditions for models, but has been shown to be inconclusive due to the lack of datasets labelled with dark corner artifacts and the lack of detailed analysis and discussion. To address these issues, we label 10,250 skin lesion images from publicly available datasets and introduce a balanced dataset with an equal number of melanoma and non-melanoma cases. The training set comprises 6126 images without artifacts, and the testing set comprises 4124 images with dark corner artifacts. We conduct three experiments to provide new understanding of the effects of dark corner artifacts, including inpainted and synthetically generated examples, on a deep learning method. Our results suggest that superimposing synthetic dark corner artifacts onto the training set improved model performance, particularly in terms of the true negative rate. This indicates that the network learnt to ignore dark corner artifacts, rather than treating them as melanoma, when they were introduced into the training set. Further, we propose a new approach to quantifying heatmaps indicating network focus, using a root mean square measure of the brightness intensity in different regions of the heatmaps. This paper provides a new guideline for skin lesion analysis with an emphasis on reproducibility.
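
    The proposed heatmap measure lends itself to a short sketch: compute the root mean square of brightness in different regions of a network-focus heatmap (e.g. a Grad-CAM map) and compare them. The corner-versus-centre split and the 25% corner fraction below are illustrative assumptions, not the paper's exact region layout.

```python
# Hedged sketch: RMS brightness of heatmap corners versus the image centre.
import numpy as np

def rms(values):
    """Root mean square of pixel intensities."""
    return float(np.sqrt(np.mean(np.square(values, dtype=np.float64))))

def corner_vs_centre_rms(heatmap, corner_frac=0.25):
    """Compare network focus in the four corners against the central region."""
    h, w = heatmap.shape
    ch, cw = int(h * corner_frac), int(w * corner_frac)
    corners = np.concatenate([
        heatmap[:ch, :cw].ravel(),  heatmap[:ch, -cw:].ravel(),
        heatmap[-ch:, :cw].ravel(), heatmap[-ch:, -cw:].ravel(),
    ])
    centre = heatmap[ch:-ch, cw:-cw].ravel()
    return rms(corners), rms(centre)

heatmap = np.random.rand(224, 224)  # stand-in for a normalised focus heatmap
print(corner_vs_centre_rms(heatmap))
```

    Under this measure, a corner RMS that stays low relative to the centre would suggest the network is ignoring the dark corner region rather than attending to it.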

    MiTiSegmenter: Software for high throughput segmentation and meshing of microCT data in microtiter plate arrays

    Lab-based microCT is a powerful means of visualising the internal structure of physical specimens, deployed across the physical sciences, engineering and the arts. As its popularity has grown, demand for bulk digitisation of multiple samples within a single scan has increased. High throughput workflows can increase sample sizes and reduce scan time, yet downstream segmentation and meshing remain a bottleneck. We present MiTiSegmenter as a new tool for the bulk archiving of valuable zooarchaeological and palaeontological remains. We foresee MiTiSegmenter as particularly useful when incorporated into workflows that ultimately require the destructive testing of specimens, including sampling for ancient DNA and proteomics. The software may also play an important role in national museums' ongoing mass digitisation efforts, facilitating the high-speed archiving of specimen 3D morphology across extensive collections with minimal user intervention or prior training.
    - We present MiTiSegmenter, a software package for semi-automated image processing and segmentation of array-based batch microCT data.
    - Implemented in Python, MiTiSegmenter expedites cropping, meshing and exporting samples within stacked microtiter plates, facilitating the rapid digitisation of hundreds to thousands of samples per scan.
    - We illustrate MiTiSegmenter's capabilities when applied to the bulk archiving of valuable zooarchaeological and palaeontological remains.
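
    The core array-based cropping idea can be sketched in a few lines; note this is not MiTiSegmenter's actual API, and the 8 x 12 well grid, the stand-in volume and the output paths are placeholder assumptions.

```python
# Simplified sketch: split a microCT volume into a regular grid of wells
# and export each sub-volume, mirroring the idea of array-based batch scans.
import numpy as np

def crop_wells(volume, rows=8, cols=12):
    """Split the XY plane of a (z, y, x) volume into rows x cols wells."""
    _, y, x = volume.shape
    wy, wx = y // rows, x // cols
    for r in range(rows):
        for c in range(cols):
            yield (r, c), volume[:, r * wy:(r + 1) * wy, c * wx:(c + 1) * wx]

volume = np.zeros((600, 800, 1200), dtype=np.uint16)  # stand-in CT stack
for (r, c), well in crop_wells(volume):
    np.save(f"well_{r}_{c}.npy", well)  # one cropped sub-volume per well
```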

    Automated Assessment of Facial Wrinkling: a case study on the effect of smoking

    Facial wrinkling is one of the most prominent biological changes accompanying the natural aging process. However, some external factors contribute to premature wrinkle development, such as sun exposure and smoking. Clinical studies have shown that heavy smoking causes premature wrinkle development; however, there is no computerised system that can automatically assess facial wrinkles across the whole face. This study investigates the effect of smoking on facial wrinkling using a social habit face dataset and an automated computer vision algorithm. The wrinkle pattern, represented as intensities in the range 0-255, was first extracted using a modified Hybrid Hessian Filter. The face was divided into ten predefined regions, and the wrinkles in each region were extracted. Statistical analysis was then performed to determine which regions are mainly affected by smoking. The results showed that the density of wrinkles for smokers in two regions around the mouth was significantly higher than for non-smokers, at a p-value of 0.05. Results for the other regions were inconclusive due to the lack of a large-scale dataset. Finally, wrinkles were visually compared between smoker and non-smoker faces by generating a generic 3D face model.
    Comment: 6 pages, 8 figures. Accepted in 2017 IEEE SMC International Conference.
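
    The per-region comparison can be sketched briefly. Assuming a ridge-filter response map in the 0-255 range has already been extracted, one can compute a wrinkle density per region and test smokers against non-smokers; the density definition, the threshold, the rectangular "mouth" mask and the per-subject values below are all illustrative assumptions.

```python
# Hedged sketch: wrinkle density per region and a significance test.
import numpy as np
from scipy import stats

def wrinkle_density(wrinkle_map, region_mask, threshold=128):
    """Fraction of region pixels whose ridge response (0-255) exceeds a threshold."""
    return float(np.mean(wrinkle_map[region_mask] > threshold))

# Stand-in ridge response for one face, with a rectangular "mouth" region mask.
wrinkle_map = np.random.randint(0, 256, (512, 512))
region_mask = np.zeros((512, 512), dtype=bool)
region_mask[300:400, 150:350] = True
print(wrinkle_density(wrinkle_map, region_mask))

# Per-subject densities for the same region (illustrative values only).
smokers = np.array([0.21, 0.18, 0.25, 0.19, 0.23])
non_smokers = np.array([0.12, 0.15, 0.11, 0.14, 0.13])
t, p = stats.ttest_ind(smokers, non_smokers)
print(f"t = {t:.2f}, p = {p:.4f}")  # significant if p < 0.05
```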

    R-MNet: A Perceptual Adversarial Network for Image Inpainting

    Facial image inpainting is a widely studied problem, and in recent years the introduction of Generative Adversarial Networks (GANs) has led to improvements in the field. Unfortunately, some issues persist, in particular when blending the missing pixels with the visible ones. We address the problem by proposing a Wasserstein GAN combined with a new reverse mask operator, namely the Reverse Masking Network (R-MNet), a perceptual adversarial network for image inpainting. The reverse mask operator transfers the reverse masked image to the end of the encoder-decoder network, leaving only valid pixels to be inpainted. Additionally, we propose a new loss function computed in feature space that targets only valid pixels, combined with adversarial training. Together these capture the data distribution and generate images similar to those in the training data, with realistic and coherent outputs. We evaluate our method on publicly available datasets and compare it with state-of-the-art methods. We show that our method can generalize to high-resolution inpainting tasks, and that its outputs are more realistic and plausible to the human visual system when compared with the state-of-the-art methods.
    Comment: 10 pages, 7 figures, 3 tables.
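
    The blending step can be sketched as a simple composition, following the general convention in GAN-based inpainting rather than R-MNet's exact reverse mask operator: the generator's output is kept only inside the hole, while visible pixels are copied unchanged from the input.

```python
# Minimal sketch of mask-based composition for inpainting (PyTorch).
import torch

def compose(generated, image, mask):
    """mask == 1 marks missing pixels; visible pixels come from the input."""
    return mask * generated + (1.0 - mask) * image

image = torch.rand(1, 3, 256, 256)       # ground-truth image
mask = torch.zeros(1, 1, 256, 256)
mask[..., 96:160, 96:160] = 1.0          # square hole to inpaint
generated = torch.rand(1, 3, 256, 256)   # stand-in generator output
output = compose(generated, image, mask)
# Visible pixels are untouched; a feature-space loss can then focus on the hole.
assert torch.allclose(output * (1 - mask), image * (1 - mask))
```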

    3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame

    Facial expression spotting is the preliminary step for micro- and macro-expression analysis, yet reliably spotting such expressions in video sequences remains unsolved. The current best systems depend upon optical flow methods to extract regional motion features before categorising that motion into a specific class of facial movement. Optical flow is susceptible to drift error, which introduces a serious problem for motions with long-term dependencies, such as macro-expressions in high frame-rate video. We propose a purely deep learning solution which, rather than tracking frame-differential motion, uses a convolutional model to compare each frame with two temporally local reference frames. Reference frames are sampled according to calculated micro- and macro-expression durations. We show that our solution achieves state-of-the-art performance (F1-score of 0.126) on a dataset of high frame-rate (200 fps) long video sequences (SAMM-LV) and is competitive on a low frame-rate (30 fps) dataset (CAS(ME)2). In this paper, we document our deep learning model and parameters, including our use of local contrast normalisation, which we show is critical for optimal results. We overcome a limitation of existing methods and advance the state of deep learning in the domain of facial expression spotting.
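
    Since local contrast normalisation is highlighted as critical, a hedged sketch of one common formulation follows: subtract a local mean and divide by a local standard deviation, both estimated with a uniform box window. The window size and epsilon are assumptions, not the paper's parameters.

```python
# Sketch of local contrast normalisation over frames (PyTorch).
import torch
import torch.nn.functional as F

def local_contrast_norm(x, kernel_size=9, eps=1e-5):
    """x: (batch, channels, h, w). Normalise each pixel by local statistics."""
    pad = kernel_size // 2
    c = x.shape[1]
    # Uniform box filter applied per channel via grouped convolution.
    weight = torch.ones(c, 1, kernel_size, kernel_size) / kernel_size ** 2
    mean = F.conv2d(F.pad(x, [pad] * 4, mode="reflect"), weight, groups=c)
    centred = x - mean
    var = F.conv2d(F.pad(centred ** 2, [pad] * 4, mode="reflect"), weight, groups=c)
    return centred / torch.sqrt(var + eps)

frame = torch.rand(1, 1, 224, 224)  # a greyscale video frame
print(local_contrast_norm(frame).shape)  # torch.Size([1, 1, 224, 224])
```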

    Interpretability of a deep learning based approach for the classification of skin lesions into main anatomic body sites

    Over the past few decades, different clinical diagnostic algorithms have been proposed to diagnose malignant melanoma in its early stages. Furthermore, the detection of skin moles driven by current deep learning based approaches yields impressive results in the classification of malignant melanoma. However, these approaches do not take into account the origin of the skin lesion, and it has been observed that the specific criteria for in situ and early invasive melanoma depend highly on the anatomic site of the body. To address this problem, we propose a deep learning based framework to classify skin lesions into the three most important anatomic sites: the face, the trunk and extremities, and acral lesions. In this study, we take advantage of pretrained networks, including VGG19, ResNet50, Xception, DenseNet121, and EfficientNetB0, to compute features that feed an adjusted, densely connected classifier. Furthermore, we perform an in-depth analysis of the database, architecture, and results regarding the effectiveness of the proposed framework. Experiments confirm the ability of the developed algorithms to classify skin lesions into the most important anatomic sites with 91.45% overall accuracy for the EfficientNetB0 architecture, a state-of-the-art result in this domain.
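
    The transfer-learning setup described above can be sketched with a frozen pretrained backbone and a densely connected head for the three anatomic-site classes. The head size, dropout rate, input resolution and training configuration are assumptions, not the paper's exact settings.

```python
# Sketch: frozen EfficientNetB0 feature extractor plus a dense classifier head.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0

backbone = EfficientNetB0(include_top=False, weights="imagenet",
                          input_shape=(224, 224, 3), pooling="avg")
backbone.trainable = False  # use pretrained ImageNet features as-is

model = models.Sequential([
    backbone,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(3, activation="softmax"),  # face / trunk+extremities / acral
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```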