Face Alignment Using Active Shape Model And Support Vector Machine
The Active Shape Model (ASM) is one of the most popular local texture models for face alignment. It is applied in many fields, such as locating facial features in images and face synthesis. However, experimental results show that the accuracy of the classical ASM is not high enough for some applications. This paper proposes several improvements to the classical ASM that increase its performance on face alignment. Our four major improvements are: i) building a model that combines the Sobel filter with the 2-D profile when searching for the face in an image; ii) applying the Canny algorithm for edge enhancement; iii) using a Support Vector Machine (SVM) to classify facial landmarks, so that their locations can be determined exactly to support the ASM; and iv) automatically adjusting the 2-D profile in the multi-level model based on the size of the input image. Experimental results on the Caltech face database and the Technical University of Denmark database (imm_face) show that our proposed improvements lead to far better performance.
Comment: 11 pages and 11 figures
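As an illustration of improvements i)-iii), the following minimal sketch combines a Sobel gradient magnitude with a Canny edge mask and trains an SVM on 2-D profile patches around candidate landmarks. The helper names, OpenCV/scikit-learn calls, thresholds and patch size are illustrative assumptions, not the paper's implementation.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def enhanced_edge_map(gray):
    """Combine Sobel gradients with a Canny edge map (sketch of improvements i and ii)."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    sobel_mag = cv2.magnitude(gx, gy)
    sobel_mag = cv2.normalize(sobel_mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    canny = cv2.Canny(gray, 50, 150)
    # Keep Sobel magnitude only where Canny confirms an edge.
    return cv2.bitwise_and(sobel_mag, sobel_mag, mask=canny)

def profile_patch(edge_map, x, y, half=7):
    """Extract a 2-D profile (square patch) around a candidate landmark.
    Assumes the point is not too close to the image border."""
    patch = edge_map[y - half:y + half + 1, x - half:x + half + 1]
    return patch.astype(np.float32).ravel() / 255.0

def train_landmark_svm(X_train, y_train):
    """Improvement iii (sketch): an SVM that accepts/rejects candidate landmark positions.
    X_train / y_train would come from patches sampled at true and perturbed landmarks."""
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(X_train, y_train)
    return clf
```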
Deep Face Feature for Face Alignment
In this paper, we present a deep learning based image feature extraction
method designed specifically for face images. To train the feature extraction
model, we construct a large scale photo-realistic face image dataset with
ground-truth correspondence between multi-view face images, which are
synthesized from real photographs via an inverse rendering procedure. The deep
face feature (DFF) is trained using correspondence between face images rendered
from different views. Using the trained DFF model, we can extract a feature
vector for each pixel of a face image, which distinguishes different facial
regions and is shown to be more effective than general-purpose feature
descriptors for face-related tasks such as matching and alignment. Based on the
DFF, we develop a robust face alignment method, which iteratively updates
landmarks, pose and 3D shape. Extensive experiments demonstrate that our method achieves state-of-the-art results for face alignment on highly unconstrained face images.
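To make the idea of a per-pixel descriptor concrete, here is a minimal sketch of nearest-neighbour matching between two dense feature maps, assuming a DFF-style network has already produced (H, W, C) feature arrays; the function name and the cosine-similarity choice are assumptions for illustration only.

```python
import numpy as np

def match_pixels(feat_a, feat_b, queries):
    """Nearest-neighbour correspondence between per-pixel descriptors.

    feat_a, feat_b: (H, W, C) dense feature maps of two face images.
    queries: list of (y, x) pixel coordinates in image A to match into image B.
    """
    H, W, C = feat_b.shape
    flat_b = feat_b.reshape(-1, C)
    flat_b = flat_b / (np.linalg.norm(flat_b, axis=1, keepdims=True) + 1e-8)
    matches = []
    for y, x in queries:
        q = feat_a[y, x]
        q = q / (np.linalg.norm(q) + 1e-8)
        sim = flat_b @ q                       # cosine similarity to every pixel in B
        best = int(np.argmax(sim))
        matches.append((best // W, best % W))  # back to (row, col)
    return matches
```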
Facial Landmark Detection: a Literature Survey
The locations of the fiducial facial landmark points around facial components
and facial contour capture the rigid and non-rigid facial deformations due to
head movements and facial expressions. They are hence important for various
facial analysis tasks. Many facial landmark detection algorithms have been
developed to automatically detect those key points over the years, and in this
paper, we perform an extensive review of them. We classify the facial landmark
detection algorithms into three major categories: holistic methods, Constrained
Local Model (CLM) methods, and the regression-based methods. They differ in the
ways to utilize the facial appearance and shape information. The holistic
methods explicitly build models to represent the global facial appearance and
shape information. CLM methods explicitly leverage the global shape model but build local appearance models. The regression-based methods implicitly
capture facial shape and appearance information. For algorithms within each
category, we discuss their underlying theories as well as their differences. We
also compare their performance on both controlled and in-the-wild benchmark datasets, under varying facial expressions, head poses, and occlusion. Based on
the evaluations, we point out their respective strengths and weaknesses. There
is also a separate section to review the latest deep learning-based algorithms.
The survey also includes a listing of the benchmark databases and existing
software. Finally, we identify future research directions, including combining
methods in different categories to leverage their respective strengths to solve
landmark detection "in-the-wild".
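For the regression-based category, the core loop can be illustrated with a small sketch of generic cascaded regression, in which a sequence of learned regressors incrementally refines a shape estimate from shape-indexed features; the names and the linear form of each stage are illustrative assumptions rather than any specific surveyed method.

```python
import numpy as np

def cascaded_regression(image_features, shape0, regressors):
    """Generic cascaded-regression alignment loop (regression-based category).

    shape0: (2N,) initial landmark coordinates.
    regressors: list of (W, b) pairs, one per cascade stage, learned offline.
    image_features: callable mapping a shape estimate to a feature vector,
        e.g. local descriptors sampled around the current landmarks.
    """
    shape = shape0.copy()
    for W, b in regressors:
        phi = image_features(shape)   # shape-indexed features
        shape = shape + W @ phi + b   # additive update toward the true shape
    return shape
```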
Faster Than Real-time Facial Alignment: A 3D Spatial Transformer Network Approach in Unconstrained Poses
Facial alignment involves finding a set of landmark points on an image with a
known semantic meaning. However, this semantic meaning of landmark points is
often lost in 2D approaches where landmarks are either moved to visible
boundaries or ignored as the pose of the face changes. In order to extract
consistent alignment points across large poses, the 3D structure of the face
must be considered in the alignment step. However, extracting a 3D structure
from a single 2D image usually requires alignment in the first place. We
present our novel approach to simultaneously extract the 3D shape of the face
and the semantically consistent 2D alignment through a 3D Spatial Transformer
Network (3DSTN) to model both the camera projection matrix and the warping
parameters of a 3D model. By utilizing a generic 3D model and a Thin Plate
Spline (TPS) warping function, we are able to generate subject specific 3D
shapes without the need for a large 3D shape basis. In addition, our proposed
network can be trained in an end-to-end framework on entirely synthetic data
from the 300W-LP dataset. Unlike other 3D methods, our approach only requires
one pass through the network, resulting in faster-than-real-time alignment.
Evaluations of our model on the Annotated Facial Landmarks in the Wild (AFLW)
and AFLW2000-3D datasets show our method achieves state-of-the-art performance
over other 3D approaches to alignment.
Comment: International Conference on Computer Vision (ICCV) 201
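A minimal sketch of the final projection step, assuming the network has already produced a 3x4 camera projection matrix and the (possibly TPS-warped) 3D model landmarks; the helper name and NumPy formulation are assumptions for illustration.

```python
import numpy as np

def project_landmarks(points_3d, camera_matrix):
    """Project 3D model landmarks to 2D with a 3x4 camera projection matrix.

    points_3d: (N, 3) landmark positions of the subject-specific 3D model.
    camera_matrix: (3, 4) projection matrix, e.g. as estimated by the network.
    """
    homog = np.hstack([points_3d, np.ones((points_3d.shape[0], 1))])  # (N, 4)
    proj = homog @ camera_matrix.T                                    # (N, 3)
    return proj[:, :2] / proj[:, 2:3]                                 # perspective divide
```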
L2GSCI: Local to Global Seam Cutting and Integrating for Accurate Face Contour Extraction
Current face alignment algorithms can robustly find a set of landmarks along the face contour. However, the landmarks are sparse and lack curve detail, especially in the chin and cheek areas, where much concave-convex bending
information exists. In this paper, we propose a local to global seam cutting
and integrating algorithm (L2GSCI) to extract a continuous and accurate face contour. Our method works in three steps with the help of a rough initial
curve. First, we sample small and overlapped squares along the initial curve.
Second, the seam cutting part of L2GSCI extracts a local seam in each square
region. Finally, the seam integrating part of L2GSCI connects all the redundant
seams together to form a continuous and complete face curve. Overall, the
proposed method is much more straightforward than existing face alignment
algorithms, but can achieve pixel-level continuous face curves rather than
discrete and sparse landmarks. Moreover, experiments on two face benchmark
datasets (i.e., LFPW and HELEN) show that our method can precisely reveal the concave-convex bending details of face contours, significantly improving performance compared with state-of-the-art face alignment approaches.
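The local seam-cutting step can be illustrated with a generic dynamic-programming seam, similar to seam carving, run inside one sampled square of an edge-cost map; the cost definition and helper name are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def local_seam(cost):
    """Minimum-cost vertical seam through a square cost patch (dynamic programming).

    cost: (H, W) array, e.g. negative edge strength inside one sampled square,
    so the returned seam follows the strongest local contour.
    """
    H, W = cost.shape
    acc = cost.astype(float)
    for r in range(1, H):
        left = np.roll(acc[r - 1], 1);   left[0] = np.inf
        right = np.roll(acc[r - 1], -1); right[-1] = np.inf
        acc[r] += np.minimum(np.minimum(left, acc[r - 1]), right)
    # Backtrack from the cheapest cell in the last row.
    seam = [int(np.argmin(acc[-1]))]
    for r in range(H - 2, -1, -1):
        c = seam[-1]
        lo, hi = max(c - 1, 0), min(c + 2, W)
        seam.append(lo + int(np.argmin(acc[r, lo:hi])))
    return seam[::-1]  # seam column index in each row, top to bottom
```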
Face Alignment Robust to Pose, Expressions and Occlusions
We propose an Ensemble of Robust Constrained Local Models for alignment of
faces in the presence of significant occlusions and of any unknown pose and
expression. To account for partial occlusions we introduce Robust Constrained Local Models, which comprise a deformable shape model and local landmark appearance models, and which reason over binary occlusion labels. Our occlusion
reasoning proceeds by a hypothesize-and-test search over occlusion labels.
Hypotheses are generated by Constrained Local Model based shape fitting over
randomly sampled subsets of landmark detector responses and are evaluated by
the quality of face alignment. To span the entire range of facial pose and
expression variations we adopt an ensemble of independent Robust Constrained
Local Models to search over a discretized representation of pose and
expression. We perform extensive evaluation on a large number of face images,
both occluded and unoccluded. We find that our face alignment system trained
entirely on facial images captured "in-the-lab" exhibits a high degree of
generalization to facial images captured "in-the-wild". Our results are
accurate and stable over a wide spectrum of occlusion, pose and expression variations, resulting in excellent performance on many real-world face datasets.
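The hypothesize-and-test search can be sketched as a RANSAC-style loop over random subsets of landmark detector responses, assuming externally supplied shape-fitting and alignment-scoring routines; all names and the hypothesis budget are illustrative.

```python
import numpy as np

def hypothesize_and_test(responses, fit_shape, score_alignment,
                         n_hypotheses=200, subset_size=10, rng=None):
    """RANSAC-style search over occlusion hypotheses (sketch).

    responses: list of N per-landmark detector responses (peak positions).
    fit_shape: callable fitting the shape model to a subset of (index, position) pairs.
    score_alignment: callable returning the alignment quality of a fitted shape.
    Returns the best shape found and the subset that produced it.
    """
    rng = rng or np.random.default_rng(0)
    n = len(responses)
    best_shape, best_subset, best_score = None, None, -np.inf
    for _ in range(n_hypotheses):
        subset = rng.choice(n, size=subset_size, replace=False)
        pairs = [(i, responses[i]) for i in subset]  # landmarks hypothesized unoccluded
        shape = fit_shape(pairs)                     # constrained shape fit
        score = score_alignment(shape)
        if score > best_score:
            best_shape, best_subset, best_score = shape, subset, score
    return best_shape, best_subset
```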
Face Alignment by Local Deep Descriptor Regression
We present an algorithm for extracting key-point descriptors using deep
convolutional neural networks (CNN). Unlike many existing deep CNNs, our model
computes local features around a given point in an image. We also present a
face alignment algorithm based on regression using these local descriptors. The
proposed method called Local Deep Descriptor Regression (LDDR) is able to
localize face landmarks of varying sizes, poses and occlusions with high
accuracy. Deep Descriptors presented in this paper are able to uniquely and
efficiently describe every pixel in the image and therefore can potentially
replace traditional descriptors such as SIFT and HOG. Extensive evaluations on
five publicly available unconstrained face alignment datasets show that our
deep descriptor network is able to capture strong local features around a given
landmark and performs significantly better than many competitive and
state-of-the-art face alignment algorithms.
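A minimal sketch of the patch-extraction step that would feed such a descriptor network: crop a fixed-size local patch around each landmark candidate (the CNN itself is not shown); the patch size and helper name are assumptions for illustration.

```python
import numpy as np

def extract_patches(image, landmarks, size=32):
    """Crop a local patch around each landmark candidate; each patch would be fed
    to a descriptor CNN (not shown) to obtain a local deep descriptor.
    Border patches are zero-padded at the bottom/right for simplicity."""
    half = size // 2
    H, W = image.shape[:2]
    patches = []
    for x, y in landmarks:
        x, y = int(round(x)), int(round(y))
        x0, y0 = max(x - half, 0), max(y - half, 0)
        x1, y1 = min(x + half, W), min(y + half, H)
        patch = np.zeros((size, size) + image.shape[2:], dtype=image.dtype)
        patch[:y1 - y0, :x1 - x0] = image[y0:y1, x0:x1]
        patches.append(patch)
    return np.stack(patches)
```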
Hybrid eye center localization using cascaded regression and hand-crafted model fitting
We propose a new cascaded regressor for eye center detection. Previous
methods start from a face or an eye detector and use either advanced features
or powerful regressors for eye center localization, but not both. Instead, we
detect the eyes more accurately using an existing facial feature alignment
method. We improve the robustness of localization by using both advanced
features and powerful regression machinery. Unlike most other methods that do
not refine the regression results, we make the localization more accurate by
adding a robust circle fitting post-processing step. Finally, using a simple
hand-crafted method for eye center localization, we show how to train the
cascaded regressor without the need for manually annotated training data. We
evaluate our new approach and show that it achieves state-of-the-art
performance on the BioID, GI4E, and the TalkingFace datasets. At an average
normalized error of e < 0.05, the regressor trained on manually annotated data
yields an accuracy of 95.07% (BioID), 99.27% (GI4E), and 95.68% (TalkingFace).
The automatically trained regressor is nearly as good, yielding an accuracy of
93.9% (BioID), 99.27% (GI4E), and 95.46% (TalkingFace).
Comment: 12 pages, 5 figures, submitted to Journal of Image and Vision Computing
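The circle-fitting post-processing step can be illustrated with a plain least-squares (Kasa) circle fit to candidate boundary points; a robust variant would add outlier down-weighting. The function name and the specific algebraic fit are assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np

def fit_circle(points):
    """Least-squares (Kasa) circle fit to 2D points, e.g. sampled iris-boundary
    edge points. Solves x^2 + y^2 = 2*a*x + 2*b*y + c for centre (a, b)."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(pts))])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    radius = np.sqrt(c + cx ** 2 + cy ** 2)
    return (cx, cy), radius
```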
A fast online cascaded regression algorithm for face alignment
Traditional machine-learning-based face alignment usually tracks the locations of facial landmarks with a static model trained offline, where all of the training data is available in advance. When new training samples arrive, the static model must be retrained from scratch, which is excessively time- and memory-consuming. In many real-time applications, the training data arrives one sample or one batch at a time, so a static model limits performance on sequential images with extensive variations. Therefore, the most critical and challenging aspect in this field is dynamically updating the tracker's models to continuously enhance their predictive and generalization capabilities. In order to address
this question, we develop a fast and accurate online learning algorithm for
face alignment. Particularly, we incorporate on-line sequential extreme
learning machine into a parallel cascaded regression framework, coined
incremental cascade regression (ICR). To the best of our knowledge, this is the first incremental cascaded framework with a non-linear regressor. One main advantage of ICR is that the tracker model can be quickly updated in an incremental way, without full retraining, when a new input arrives. Experimental results demonstrate that the proposed ICR is more accurate and efficient on still and sequential images compared with recent state-of-the-art cascade approaches. Furthermore, the incremental learning proposed in this paper can update the trained model in real time.
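The incremental update at the heart of an OS-ELM-style regressor can be sketched as a recursive least-squares update of the output weights, which is what allows a cascade stage to absorb new samples without full retraining; the class name, regularisation and variable shapes are illustrative assumptions.

```python
import numpy as np

class OSELMOutputLayer:
    """Incremental (recursive least-squares) update of ELM output weights,
    the kind of online update a cascade stage could use instead of retraining.

    H: hidden-layer activations for a batch, shape (n, L).
    T: regression targets (e.g. landmark-shape increments), shape (n, d).
    """

    def __init__(self, H0, T0, reg=1e-3):
        L = H0.shape[1]
        self.P = np.linalg.inv(H0.T @ H0 + reg * np.eye(L))
        self.beta = self.P @ H0.T @ T0

    def update(self, H, T):
        # Woodbury-style update: no access to previously seen data is needed.
        K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)
        return self.beta
```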
Joint Voxel and Coordinate Regression for Accurate 3D Facial Landmark Localization
3D face shape is more expressive and viewpoint-consistent than its 2D
counterpart. However, 3D facial landmark localization in a single image is
challenging due to the ambiguous nature of landmarks under 3D perspective.
Existing approaches typically adopt a suboptimal two-step strategy, performing
2D landmark localization followed by depth estimation. In this paper, we
propose the Joint Voxel and Coordinate Regression (JVCR) method for 3D facial
landmark localization, addressing it more effectively in an end-to-end fashion.
First, a compact volumetric representation is proposed to encode the per-voxel
likelihood of positions being the 3D landmarks. The dimensionality of such a
representation is fixed regardless of the number of target landmarks, so that
the curse of dimensionality could be avoided. Then, a stacked hourglass network
is adopted to estimate the volumetric representation from coarse to fine,
followed by a 3D convolution network that takes the estimated volume as input
and regresses 3D coordinates of the face shape. In this way, the 3D structural
constraints between landmarks could be learned by the neural network in a more
efficient manner. Moreover, the proposed pipeline enables end-to-end training
and improves the robustness and accuracy of 3D facial landmark localization.
The effectiveness of our approach is validated on the 3DFAW and AFLW2000-3D
datasets. Experimental results show that the proposed method achieves
state-of-the-art performance in comparison with existing methods.
Comment: Code available at https://github.com/HongwenZhang/JVCR-3Dlandmar
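To illustrate the volumetric representation, here is a minimal sketch that encodes one 3D landmark as a Gaussian likelihood blob in a fixed-size voxel grid and decodes a coordinate from such a volume by its expectation (soft-argmax). The paper instead regresses coordinates with a 3D convolutional network, so this decoding is only an illustrative stand-in; all names and the sigma value are assumptions.

```python
import numpy as np

def encode_landmark(volume_shape, center, sigma=2.0):
    """Per-voxel likelihood for one 3D landmark: a Gaussian blob around `center`
    in a fixed-size (D, H, W) volume, so the representation's size does not
    depend on the number of landmarks when all blobs are accumulated."""
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in volume_shape], indexing="ij")
    d2 = (xx - center[0]) ** 2 + (yy - center[1]) ** 2 + (zz - center[2]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def decode_landmark(volume):
    """Expected coordinate (soft-argmax) of a likelihood volume."""
    zz, yy, xx = np.meshgrid(*[np.arange(s) for s in volume.shape], indexing="ij")
    w = volume / volume.sum()
    return np.array([(xx * w).sum(), (yy * w).sum(), (zz * w).sum()])  # (x, y, z)
```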