Automatic landmark annotation and dense correspondence registration for 3D human facial images
Dense surface registration of three-dimensional (3D) human facial images
holds great potential for studies of human trait diversity, disease genetics,
and forensics. Non-rigid registration is particularly useful for establishing
dense anatomical correspondences between faces. Here we describe a novel
non-rigid registration method for fully automatic 3D facial image mapping. This
method comprises two steps: first, seventeen facial landmarks are automatically
annotated, mainly via PCA-based feature recognition following 3D-to-2D data
transformation. Second, an efficient thin-plate spline (TPS) protocol is used
to establish the dense anatomical correspondence between facial images, under
the guidance of the predefined landmarks. We demonstrate that this method is
robust and highly accurate, even for different ethnicities. The average face is
calculated for individuals of Han Chinese and Uyghur origins. Because it is fully
automatic and computationally efficient, this method enables high-throughput
analysis of human facial feature variation. Comment: 33 pages, 6 figures, 1 table
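The landmark-guided TPS step described in this abstract can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes NumPy, uses the 3D biharmonic kernel U(r) = r, and the names fit_tps_3d and apply_tps_3d are hypothetical. The warp carries the source landmarks exactly onto the target landmarks and can then be evaluated at every vertex of a face mesh.

```python
import numpy as np

def fit_tps_3d(src, dst):
    """Fit a 3D thin-plate spline mapping src landmarks (n, 3) onto dst (n, 3).

    Assumption: kernel U(r) = r, the biharmonic kernel in three dimensions.
    Requires at least 4 landmarks that are not all coplanar.
    """
    n = src.shape[0]
    # Pairwise distances between landmarks (kernel matrix).
    K = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    # Affine part: [1, x, y, z] for each landmark.
    P = np.hstack([np.ones((n, 1)), src])
    A = np.zeros((n + 4, n + 4))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 4, 3))
    b[:n] = dst
    params = np.linalg.solve(A, b)
    return params[:n], params[n:]  # non-affine weights, affine coefficients

def apply_tps_3d(points, src, w, a):
    """Warp arbitrary points (m, 3) with a spline fitted on src landmarks."""
    U = np.linalg.norm(points[:, None, :] - src[None, :, :], axis=-1)
    P = np.hstack([np.ones((points.shape[0], 1)), points])
    return U @ w + P @ a
```

In a dense-registration setting, points would be the full vertex set of one face and src/dst the seventeen annotated landmarks of the two faces; how the warped vertices are then matched to the target surface is outside this sketch.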
A quantitative assessment of 3D facial key point localization fitting 2D shape models to curvature information
This work addresses the localization of 11 prominent facial landmarks in 3D by fitting state-of-the-art shape models to 2D data. Quantitative results are provided for 34 scans at high resolution (texture maps of 10 megapixels) in terms of accuracy (with respect to manual measurements) and precision (repeatability on different images from the same individual). We obtain an average accuracy of approximately 3 mm, and median repeatability of inter-landmark distances typically below 2 mm, values that are comparable to those of current algorithms for automatic localization of facial landmarks. We also show that, in our experiments, replacing texture information with curvature features produced little change in performance, which is an important finding as it suggests the applicability of the method to any type of 3D data.
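The two evaluation quantities named in this abstract can be sketched as follows. This is an illustration under assumptions, not the paper's exact protocol: accuracy is taken as the Euclidean distance between automatic and manual landmark positions, and repeatability as the median absolute deviation of inter-landmark distances across repeated scans of one individual. All function names are hypothetical.

```python
import numpy as np

def localization_error(pred, manual):
    """Per-landmark Euclidean error between predicted and manual positions (n, 3)."""
    return np.linalg.norm(pred - manual, axis=-1)

def inter_landmark_distances(landmarks):
    """All pairwise distances between the landmarks of one scan (n, 3)."""
    i, j = np.triu_indices(landmarks.shape[0], k=1)
    return np.linalg.norm(landmarks[i] - landmarks[j], axis=-1)

def repeatability(scans):
    """Median absolute deviation of inter-landmark distances.

    scans: array (n_scans, n_landmarks, 3) of the same individual.
    Assumption: deviations are measured from the per-pair mean distance.
    """
    d = np.stack([inter_landmark_distances(s) for s in scans])
    return float(np.median(np.abs(d - d.mean(axis=0))))
```

Inter-landmark distances are unchanged by rigid motion, so this notion of repeatability does not require the repeated scans to be aligned first.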
Compensating inaccurate annotations to train 3D facial landmark localisation models
In this paper we investigate the impact of inconsistency in manual annotations when they are used to train automatic models for 3D facial landmark localization. We start by showing that it is possible to objectively measure the consistency of annotations in a database, provided that it contains replicates (i.e. repeated scans from the same person). Applying such a measure to the widely used FRGC database, we find that the manual annotations currently available are suboptimal and can strongly impair the accuracy of automatic models learnt from them. To address this issue, we present a simple algorithm to automatically correct a set of annotations and show that it can help to significantly improve the accuracy of the models in terms of landmark localization errors. This improvement is observed even when errors are measured with respect to the original (not corrected) annotations. However, we also show that if errors are computed against an alternative set of manual annotations with higher consistency, the accuracy of the models constructed using the corrections from the presented algorithm tends to converge to the one achieved by building the models on the alternative, more consistent set.
3D Shape Descriptor-Based Facial Landmark Detection: A Machine Learning Approach
Facial landmark detection on 3D human faces has had numerous applications in the literature
such as establishing point-to-point correspondence between 3D face models which is itself a
key step for a wide range of applications like 3D face detection and authentication, matching,
reconstruction, and retrieval, to name a few.
Two groups of approaches, namely knowledge-driven and data-driven approaches, have been
employed for facial landmarking in the literature. Knowledge-driven techniques are the
traditional approaches that have been widely used to locate landmarks on human faces. In
these approaches, a user with sufficient knowledge and experience usually defines features to
be extracted as the landmarks. Data-driven techniques, on the other hand, take advantage
of machine learning algorithms to detect prominent features on 3D face models. Besides
the key advantages, each category of these techniques has limitations that prevent it from
generating the most reliable results.
In this work we propose to combine the strengths of the two approaches to detect facial
landmarks in a more efficient and precise way. The suggested approach consists of two phases.
First, some salient features of the faces are extracted using expert systems. Afterwards,
these points are used as the initial control points in the well-known Thin Plate Spline (TPS)
technique to deform the input face towards a reference face model. Second, by exploring and
utilizing multiple machine learning algorithms another group of landmarks is extracted.
The data-driven landmark detection step is performed in a supervised manner providing an
information-rich set of training data in which a set of local descriptors are computed and used
to train the algorithm. We then use the detected landmarks for establishing point-to-point
correspondence between the 3D human faces mainly using an improved version of Iterative
Closest Point (ICP) algorithms. Furthermore, we propose to use the detected landmarks for
3D face matching applications.
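The correspondence step mentioned in this abstract relies on an improved ICP variant that the abstract does not specify. As a point of reference only, the sketch below shows basic point-to-point ICP, assuming NumPy, brute-force nearest-neighbour search, and a rigid update computed with the SVD-based Kabsch solution. All function names are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def nearest_neighbors(src, dst):
    """Index and distance of the closest dst point for each src point."""
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
    idx = d.argmin(axis=1)
    return idx, d[np.arange(src.shape[0]), idx]

def icp(src, dst, n_iter=20):
    """Basic point-to-point ICP; returns the aligned copy of src."""
    current = src.copy()
    for _ in range(n_iter):
        idx, _ = nearest_neighbors(current, dst)
        R, t = best_rigid_transform(current, dst[idx])
        current = current @ R.T + t
    return current
```

The brute-force search costs O(nm) memory and time per iteration; for dense face meshes a spatial index such as a k-d tree would normally replace it.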
3D facial landmark localization using combinatorial search and shape regression
This paper presents a method for the automatic detection of facial landmarks. The algorithm receives a set of 3D candidate points for each landmark (e.g. from a feature detector) and performs combinatorial search constrained by a deformable shape model. A key assumption of our approach is that for some landmarks there might not be an accurate candidate in the input set. This is tackled by detecting partial subsets of landmarks and inferring those that are missing so that the probability of the deformable model is maximized. The ability of the model to work with incomplete information makes it possible to limit the number of candidates that need to be retained, substantially reducing the number of possible combinations to be tested with respect to the alternative of trying to always detect the complete set of landmarks. We demonstrate the accuracy of the proposed method on a set of 144 facial scans acquired by means of a hand-held laser scanner in the context of clinical craniofacial dysmorphology research. Using spin images to describe the geometry and targeting 11 facial landmarks, we obtain an average error below 3 mm, which compares favorably with other state-of-the-art approaches based on geometric descriptors.
Persistent homology to analyse 3D faces and assess body weight gain
In this paper, we analyse patterns in face shape variation due to weight gain. We propose the use of persistent homology descriptors to get geometric and topological information about the configuration of anthropometric 3D face landmarks. In this way, evaluating face changes boils down to comparing the descriptors computed on 3D face scans taken at different times. By applying dimensionality reduction techniques to the dissimilarity matrix of descriptors, we get a space in which each face is a point and face shape variations are encoded as trajectories in that space. Our results show that persistent homology is able to identify features that are closely related to being overweight and may help assess individual weight trends. The research was carried out in the context of the European project SEMEOTICONS, which developed a multisensory platform that detects and monitors facial signs of cardio-metabolic risk over time.
Facial Texture Super-Resolution by Fitting 3D Face Models
This book proposes to solve the low-resolution (LR) facial analysis problem with 3D face super-resolution (FSR). A complete processing chain is presented towards effective 3D FSR in the real world. To deal with the extreme challenges of incorporating 3D modeling under the ill-posed LR condition, a novel workflow coupling automatic localization of 2D facial feature points and 3D shape reconstruction is developed, leading to a robust pipeline for pose-invariant hallucination of the 3D facial texture.
Geometric Expression Invariant 3D Face Recognition using Statistical Discriminant Models
Currently there is no complete face recognition system that is invariant to all facial expressions.
Although humans find it easy to identify and recognise faces regardless of changes in illumination,
pose and expression, producing a computer system with a similar capability has proved to
be particularly difficult. Three-dimensional face models are geometric in nature and therefore
have the advantage of being invariant to head pose and lighting. However they are still susceptible
to facial expressions. This can be seen in the decrease in the recognition results using
principal component analysis when expressions are added to a data set.
In order to achieve expression-invariant face recognition systems, we have employed a tensor
algebra framework to represent 3D face data with facial expressions in a parsimonious
space. Face variation factors are organised in particular subject and facial expression modes.
We manipulate this using singular value decomposition on sub-tensors representing one variation
mode. This framework possesses the ability to deal with the shortcomings of PCA in less constrained
environments and still preserves the integrity of the 3D data. The results show improved
recognition rates for faces and facial expressions, even recognising high intensity expressions
that are not in the training datasets.
We have determined, experimentally, a set of anatomical landmarks that describe facial
expression most effectively. We found that the best placement of landmarks to distinguish different
facial expressions is in areas around the prominent features, such as the cheeks and eyebrows.
Recognition results using landmark-based face recognition could be improved with better placement.
We looked into the possibility of achieving expression-invariant face recognition by reconstructing
and manipulating realistic facial expressions. We proposed a tensor-based statistical
discriminant analysis method to reconstruct facial expressions and in particular to neutralise
facial expressions. The results of the synthesised facial expressions are visually more realistic
than facial expressions generated using conventional active shape modelling (ASM). We
then used reconstructed neutral faces in the sub-tensor framework for recognition purposes.
The recognition results showed slight improvement. Besides biometric recognition, this novel
tensor-based synthesis approach could be used in computer games and real-time animation
applications.
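The tensor framework in this abstract, with separate subject and expression modes, can be illustrated by a higher-order SVD sketch. The abstract does not give the exact decomposition, so the layout below is an assumption: a data tensor of shape (subjects, expressions, features), decomposed with NumPy by taking an SVD of each mode unfolding. All function names are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor so that the chosen mode indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_basis(tensor, mode):
    """Orthonormal basis and singular values for one variation mode."""
    U, s, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return U, s

def mode_product(tensor, matrix, mode):
    """Multiply a tensor by a matrix along the given mode."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def decompose(tensor):
    """Return the core tensor plus subject and expression bases."""
    U_subj, _ = mode_basis(tensor, 0)
    U_expr, _ = mode_basis(tensor, 1)
    core = mode_product(mode_product(tensor, U_subj.T, 0), U_expr.T, 1)
    return core, U_subj, U_expr
```

Each row of U_subj is then a subject coefficient vector that does not depend on expression, which is the property an expression-invariant recogniser would compare; truncating the bases gives the parsimonious space the abstract refers to.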
Facial Expression Analysis under Partial Occlusion: A Survey
Automatic machine-based Facial Expression Analysis (FEA) has made substantial
progress in the past few decades driven by its importance for applications in
psychology, security, health, entertainment and human computer interaction. The
vast majority of completed FEA studies are based on non-occluded faces
collected in a controlled laboratory environment. Automatic expression
recognition tolerant to partial occlusion remains less understood, particularly
in real-world scenarios. In recent years, efforts investigating techniques to
handle partial occlusion for FEA have seen an increase. The context is right
for a comprehensive perspective of these developments and the state of the art
from this perspective. This survey provides such a comprehensive review of
recent advances in dataset creation, algorithm development, and investigations
of the effects of occlusion critical for robust performance in FEA systems. It
outlines existing challenges in overcoming partial occlusion and discusses
possible opportunities in advancing the technology. To the best of our
knowledge, it is the first FEA survey dedicated to occlusion and aimed at
promoting better informed and benchmarked future work. Comment: Authors' pre-print of the article accepted for publication in ACM
Computing Surveys (accepted on 02-Nov-2017)
3D hand pose estimation using convolutional neural networks
3D hand pose estimation plays a fundamental role in natural human computer interactions. The problem is challenging due to complicated variations caused by complex articulations, multiple viewpoints, self-similar parts, severe self-occlusions, different shapes and sizes.
To handle these challenges, the thesis makes the following contributions. First, the problem of multiple viewpoints and complex articulations in hand pose estimation is tackled by decomposing and transforming the input and output spaces by spatial transformations that follow the hand structure. These transformations reduce the variation of both the input and output spaces, which makes learning easier.
The second contribution is a probabilistic framework integrating all the hierarchical regressions. Variants with and without sampling, using different regressors and optimization methods, are constructed and compared to provide insight into the components of this framework.
The third contribution is based on the observation that for images with occlusions, there exist multiple plausible configurations for the occluded parts.
A hierarchical mixture density network is proposed to handle the multi-modality of the locations of occluded hand joints. It leverages state-of-the-art hand pose estimators based on Convolutional Neural Networks to facilitate feature learning, while modeling the multiple modes in a two-level hierarchy to reconcile single-valued (for visible joints) and multi-valued (for occluded joints) mappings in its output.
In addition, a complete labeled real hand dataset is collected by a tracking system with six 6D magnetic sensors and inverse kinematics to automatically obtain 21-joint hand pose annotations of depth maps.
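The idea behind a mixture density output for occluded joints can be illustrated with the loss such a network typically minimizes. The sketch below is not the thesis's hierarchical model: it assumes NumPy, a single 3D joint, and isotropic Gaussian components, and the function name mixture_nll is hypothetical. In a real network, the weights, means and sigmas would be predicted by the final layer.

```python
import numpy as np

def mixture_nll(target, weights, means, sigmas):
    """Negative log-likelihood of a joint location under a Gaussian mixture.

    target:  (d,) ground-truth joint position
    weights: (k,) mixing coefficients, positive and summing to 1
    means:   (k, d) component centres
    sigmas:  (k,) isotropic standard deviations (assumption: one per component)
    """
    target = np.asarray(target, dtype=float)
    d = target.shape[-1]
    sq_dist = np.sum((means - target) ** 2, axis=-1)
    log_comp = (np.log(weights)
                - 0.5 * d * np.log(2.0 * np.pi)
                - d * np.log(sigmas)
                - sq_dist / (2.0 * sigmas ** 2))
    # Log-sum-exp for numerical stability.
    m = log_comp.max()
    return float(-(m + np.log(np.sum(np.exp(log_comp - m)))))
```

With one component this reduces to an ordinary Gaussian regression loss, which matches the single-valued case for visible joints; with several components, each plausible position of an occluded joint can be given its own mode rather than being averaged into one implausible estimate.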