Modeling of Facial Aging and Kinship: A Survey
Computational facial models that capture properties of facial cues related to
aging and kinship increasingly attract the attention of the research community,
enabling the development of reliable methods for age progression, age
estimation, age-invariant facial characterization, and kinship verification
from visual data. In this paper, we review recent advances in modeling of
facial aging and kinship. In particular, we provide an up-to-date, complete
list of available annotated datasets and an in-depth analysis of geometric,
hand-crafted, and learned facial representations that are used for facial aging
and kinship characterization. Moreover, evaluation protocols and metrics are
reviewed and notable experimental results for each surveyed task are analyzed.
This survey allows us to identify challenges and discuss future research
directions for the development of robust facial models in real-world
conditions.
Deep generative-contrastive networks for facial expression recognition
As the expressive depth of an emotional face varies across individuals and expressions, recognizing an expression from a single facial image captured at a single moment is difficult. A relative expression of a query face compared to a reference face might alleviate this difficulty. In this paper, we propose to utilize
contrastive representation that embeds a distinctive expressive factor for a
discriminative purpose. The contrastive representation is calculated at the
embedding layer of deep networks by comparing a given (query) image with the
reference image. We attempt to utilize a generative reference image that is
estimated based on the given image. Consequently, we deploy deep neural networks that combine a generative model, a contrastive model, and a discriminative model, trained in an end-to-end manner. In the proposed networks, we attempt to disentangle a facial expressive factor in two steps: learning a generator network and learning a contrastive encoder network. We conducted extensive experiments on publicly available facial expression databases (CK+, MMI, Oulu-CASIA, and in-the-wild databases) that have been widely adopted in the recent literature. The proposed method outperforms known state-of-the-art methods in terms of recognition accuracy.
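As a rough illustration of the contrastive idea above, the following minimal sketch (PyTorch) embeds a query face and its generated reference with a shared encoder and classifies their difference; the `Generator`/`Encoder` modules, dimensions, and class names are assumed placeholders, not the authors' code.

```python
# Minimal sketch of the contrastive-representation idea: a generator estimates
# a reference face from the query, both are embedded by a shared encoder, and
# their difference serves as the expressive factor fed to a classifier.
# Module names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ContrastiveExpressionNet(nn.Module):
    def __init__(self, generator: nn.Module, encoder: nn.Module,
                 embed_dim: int, num_classes: int):
        super().__init__()
        self.generator = generator          # query image -> estimated reference image
        self.encoder = encoder              # image -> embedding of size embed_dim
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        reference = self.generator(query)                           # generative reference
        contrast = self.encoder(query) - self.encoder(reference)    # expressive factor
        return self.classifier(contrast)                            # expression logits
```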
Face Recognition: From Traditional to Deep Learning Methods
Starting in the seventies, face recognition has become one of the most
researched topics in computer vision and biometrics. Traditional methods based
on hand-crafted features and traditional machine learning techniques have
recently been superseded by deep neural networks trained with very large
datasets. In this paper, we provide a comprehensive and up-to-date literature review of popular face recognition methods, including both traditional (geometry-based, holistic, feature-based, and hybrid) methods and deep learning methods.
Graphical Representation for Heterogeneous Face Recognition
Heterogeneous face recognition (HFR) refers to matching face images acquired
from different sources (i.e., different sensors or different wavelengths) for
identification. HFR plays an important role in both biometrics research and
industry. In spite of the promising progress achieved in recent years, HFR remains a challenging problem due to the difficulty of representing two heterogeneous images in a homogeneous manner. Existing HFR methods either
represent an image ignoring the spatial information, or rely on a
transformation procedure which complicates the recognition task. Considering
these problems, we propose a novel graphical representation based HFR method
(G-HFR) in this paper. Markov networks are employed to represent heterogeneous
image patches separately, taking the spatial compatibility between
neighboring image patches into consideration. A coupled representation
similarity metric (CRSM) is designed to measure the similarity between obtained
graphical representations. Extensive experiments conducted on multiple HFR
scenarios (viewed sketch, forensic sketch, near infrared image, and thermal
infrared image) show that the proposed method outperforms state-of-the-art
methods. Comment: 13 pages, 10 figures, accepted to TPAMI 2016.
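The abstract does not spell out the CRSM, so the sketch below shows one plausible reading: each face is a stack of per-patch representation weight vectors, and similarity is the mean cosine similarity of corresponding patch representations. The shapes and the averaging rule are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a coupled representation similarity metric (CRSM) between
# two patch-wise graphical representations: average the cosine similarity of
# corresponding per-patch weight vectors. Illustrative only.
import numpy as np

def crsm(rep_a: np.ndarray, rep_b: np.ndarray, eps: float = 1e-12) -> float:
    """rep_a, rep_b: (num_patches, weight_dim) graphical representations."""
    num = np.sum(rep_a * rep_b, axis=1)
    den = np.linalg.norm(rep_a, axis=1) * np.linalg.norm(rep_b, axis=1) + eps
    return float(np.mean(num / den))   # higher = more similar faces
```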
What comprises a good talking-head video generation?: A Survey and Benchmark
Over the years, performance evaluation has become essential in computer
vision, enabling tangible progress in many sub-fields. While talking-head video
generation has become an emerging research topic, existing evaluations on this
topic present many limitations. For example, most approaches use human subjects
(e.g., via Amazon MTurk) to evaluate their research claims directly. This
subjective evaluation is cumbersome, unreproducible, and may impede the evolution of new research. In this work, we present a carefully designed
benchmark for evaluating talking-head video generation with standardized
dataset pre-processing strategies. As for evaluation, we either propose new
metrics or select the most appropriate ones to evaluate results on what we consider the desired properties of a good talking-head video, namely identity preservation, lip synchronization, high video quality, and natural, spontaneous
motion. By conducting a thoughtful analysis across several state-of-the-art
talking-head generation approaches, we aim to uncover the merits and drawbacks
of current methods and point out promising directions for future work. All the
evaluation code is available at:
https://github.com/lelechen63/talking-head-generation-survey
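To make one of the desired properties concrete, here is a minimal sketch of an identity-preservation score computed as the mean cosine similarity between a face-recognition embedding of the reference identity and embeddings of the generated frames. The embedding model is an assumed external dependency, and this is not necessarily the exact metric used in the benchmark.

```python
# Hedged sketch: identity preservation as mean cosine similarity between a
# reference identity embedding and per-frame embeddings of the generated video.
# Embeddings come from any pretrained face recognizer (assumed dependency).
import numpy as np

def identity_preservation(ref_embedding: np.ndarray,
                          frame_embeddings: np.ndarray) -> float:
    """ref_embedding: (d,); frame_embeddings: (num_frames, d)."""
    ref = ref_embedding / np.linalg.norm(ref_embedding)
    frames = frame_embeddings / np.linalg.norm(frame_embeddings, axis=1,
                                               keepdims=True)
    return float(np.mean(frames @ ref))   # in [-1, 1]; higher = better preserved
```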
Can Synthetic Faces Undo the Damage of Dataset Bias to Face Recognition and Facial Landmark Detection?
It is well known that deep learning approaches to face recognition and facial
landmark detection suffer from biases in modern training datasets. In this
work, we propose to use synthetic face images to reduce the negative effects of
dataset biases on these tasks. Using a 3D morphable face model, we generate
large amounts of synthetic face images with full control over facial shape and
color, pose, illumination, and background. With a series of experiments, we
extensively test the effects of priming deep nets by pre-training them with
synthetic faces. We observe the following positive effects for face recognition
and facial landmark detection tasks: 1) Priming with synthetic face images
improves the performance consistently across all benchmarks because it reduces
the negative effects of biases in the training data. 2) Traditional approaches
for reducing the damage of dataset bias, such as data augmentation and transfer
learning, are less effective than training with synthetic faces. 3) Using
synthetic data, we can reduce the size of real-world datasets by 75% for face
recognition and by 50% for facial landmark detection while maintaining
performance, thus offering a means to focus the data collection process on less but higher-quality data. Comment: Technical report.
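A minimal sketch of the priming protocol described above: pre-train on synthetic faces, then fine-tune on the smaller real dataset. The loaders, model, loss, and hyperparameters below are placeholders, not the authors' setup.

```python
# Hedged sketch of priming: train first on synthetic faces rendered from a
# 3D morphable model, then fine-tune the same network on real data.
import torch

def prime_and_finetune(model, synthetic_loader, real_loader, loss_fn,
                       epochs=(10, 5)):
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)   # assumed settings
    for loader, n_epochs in zip((synthetic_loader, real_loader), epochs):
        for _ in range(n_epochs):
            for images, targets in loader:
                opt.zero_grad()
                loss = loss_fn(model(images), targets)
                loss.backward()
                opt.step()
    return model
```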
A Survey of the Trends in Facial and Expression Recognition Databases and Methods
Automated facial identification and facial expression recognition have been
topics of active research over the past few decades. Facial and expression
recognition find applications in human-computer interfaces, subject tracking,
real-time security surveillance systems and social networking. Several holistic
and geometric methods have been developed to identify faces and expressions
using public and local facial image databases. In this work we present the
evolution in facial image data sets and the methodologies for facial
identification and recognition of expressions such as anger, sadness,
happiness, disgust, fear and surprise. We observe that most of the earlier
methods for facial and expression recognition aimed at improving the
recognition rates for facial feature-based methods using static images.
However, the recent methodologies have shifted focus towards robust
implementation of facial/expression recognition from large image databases that
vary with space (gathered from the internet) and time (video recordings). The
evolution trends in databases and methodologies for facial and expression
recognition can be useful for assessing the next-generation topics that may
have applications in security systems or personal identification systems that
involve "Quantitative face" assessments.Comment: 16 pages, 4 figures, 3 tables, International Journal of Computer
Science and Engineering Survey, October, 201
A Novel Space-Time Representation on the Positive Semidefinite Cone for Facial Expression Recognition
In this paper, we study the problem of facial expression recognition using a
novel space-time geometric representation. We describe the temporal evolution
of facial landmarks as parametrized trajectories on the Riemannian manifold of
positive semidefinite matrices of fixed-rank. Our representation has the
advantage to bring naturally a second desirable quantity when comparing shapes
-- the spatial covariance -- in addition to the conventional affine-shape
representation. We then derive geometric and computational tools for rate-invariant analysis and adaptive re-sampling of trajectories, grounded in the Riemannian geometry of the manifold. Specifically, our approach involves
three steps: 1) facial landmarks are first mapped into the Riemannian manifold
of positive semidefinite matrices of rank 2, to build time-parameterized
trajectories; 2) a temporal alignment is performed on the trajectories,
providing a geometry-aware (dis-)similarity measure between them; 3) finally,
pairwise proximity function SVM (ppfSVM) is used to classify them,
incorporating the latter (dis-)similarity measure into the kernel function. We
show the effectiveness of the proposed approach on four publicly available
benchmarks (CK+, MMI, Oulu-CASIA, and AFEW). The results of the proposed approach are comparable to or better than those of state-of-the-art methods when only facial landmarks are involved. Comment: To appear at ICCV 201
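Step 1 of the pipeline admits a compact illustration: the Gram matrix of a centered configuration of n 2D landmarks is a positive semidefinite matrix of rank at most 2, and a landmark video becomes a trajectory of such matrices. The trace normalization below is an assumed choice, not taken from the paper.

```python
# Hedged sketch of step 1: map n 2D facial landmarks to a rank-2 PSD matrix
# via the Gram matrix of the centered configuration, so a landmark video
# becomes a time-parameterized trajectory on the PSD cone.
import numpy as np

def landmarks_to_psd(landmarks: np.ndarray) -> np.ndarray:
    """landmarks: (n, 2) (x, y) positions -> (n, n) PSD matrix of rank <= 2."""
    z = landmarks - landmarks.mean(axis=0, keepdims=True)   # remove translation
    gram = z @ z.T                                          # PSD, rank <= 2
    return gram / np.trace(gram)                            # assumed scale normalization

def trajectory(landmark_sequence: np.ndarray) -> np.ndarray:
    """landmark_sequence: (T, n, 2) -> (T, n, n) trajectory of PSD matrices."""
    return np.stack([landmarks_to_psd(frame) for frame in landmark_sequence])
```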
A Supervised Learning Methodology for Real-Time Disguised Face Recognition in the Wild
Facial recognition has always been a challenging task for computer vision
scientists and experts. Despite complexities arising due to variations in
camera parameters, illumination and face orientations, significant progress has
been made in the field with deep learning algorithms now competing with
human-level accuracy. But in contrast to the recent advances in face
recognition techniques, Disguised Facial Identification continues to be a
tougher challenge in the field of computer vision. In the modern-day scenario, where security is of prime concern, regular face identification techniques do not perform as required when faces are disguised, which calls for a different approach to handle situations where intruders have their faces masked. Along these lines, we propose a deep learning architecture for
disguised facial recognition (DFR). The algorithm put forward in this paper
detects 20 facial key-points in the first stage, using a 14-layered
convolutional neural network (CNN). These facial key-points are later utilized
by a support vector machine (SVM) for classifying the disguised faces based on
the Euclidean distance ratios and angles between different facial key-points.
This overall architecture imparts a basic intelligence to our system. Our
key-point feature prediction accuracy is 65% while the classification rate is
72.4%. Moreover, the architecture works at 19 FPS, thereby performing in almost
real-time. The efficiency of our approach is also compared with the
state-of-the-art Disguised Facial Identification methods. Comment: Accepted at the 2018 International Conference on Robotics and Computer Vision.
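The second stage lends itself to a short sketch: derive Euclidean distance ratios and angles from the 20 detected key-points and feed them to an SVM. The specific point pairs, the centroid-based angles, and the SVM kernel below are illustrative assumptions; the abstract does not specify the exact feature set.

```python
# Hedged sketch of stage 2: scale-invariant distance ratios and angles from
# the 20 key-points, classified with an SVM. Feature choices are assumptions.
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

def keypoint_features(points: np.ndarray) -> np.ndarray:
    """points: (20, 2) key-point coordinates -> 1D feature vector."""
    pairs = list(combinations(range(len(points)), 2))
    dists = np.array([np.linalg.norm(points[i] - points[j]) for i, j in pairs])
    ratios = dists / (dists.max() + 1e-12)     # scale-invariant distance ratios
    vecs = points - points.mean(axis=0)        # angles measured about the centroid
    angles = np.arctan2(vecs[:, 1], vecs[:, 0])
    return np.concatenate([ratios, angles])

# Usage sketch: clf = SVC(kernel="rbf").fit(train_features, train_labels)
```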
3D Face Modeling From Diverse Raw Scan Data
Traditional 3D face models learn a latent representation of faces using
linear subspaces from limited scans of a single database. The main roadblock of
building a large-scale face model from diverse 3D databases lies in the lack of
dense correspondence among raw scans. To address these problems, this paper
proposes an innovative framework to jointly learn a nonlinear face model from a
diverse set of raw 3D scan databases and establish dense point-to-point
correspondence among their scans. Specifically, by treating input scans as
unorganized point clouds, we explore the use of PointNet architectures for
converting point clouds to identity and expression feature representations,
from which the decoder networks recover their 3D face shapes. Further, we
propose a weakly supervised learning approach that does not require correspondence labels for the scans. We demonstrate the superior dense
correspondence and representation power of our proposed method, and its
contribution to single-image 3D face reconstruction. Comment: To appear in ICCV 201
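As a rough sketch of the described layout, a PointNet-style encoder can pool per-point features into identity and expression codes that decoders map back to a 3D face shape. All layer sizes and the two-head split below are assumptions for illustration, not the paper's architecture.

```python
# Hedged sketch: PointNet-style encoder over an unorganized point cloud with
# order-invariant max pooling, identity/expression heads, and a shape decoder.
import torch
import torch.nn as nn

class PointNetFaceModel(nn.Module):
    def __init__(self, num_points: int, id_dim: int = 128, exp_dim: int = 64):
        super().__init__()
        self.point_mlp = nn.Sequential(            # shared per-point MLP
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 256, 1), nn.ReLU())
        self.id_head = nn.Linear(256, id_dim)      # identity code
        self.exp_head = nn.Linear(256, exp_dim)    # expression code
        self.decoder = nn.Sequential(              # codes -> 3D face shape
            nn.Linear(id_dim + exp_dim, 512), nn.ReLU(),
            nn.Linear(512, num_points * 3))

    def forward(self, cloud: torch.Tensor) -> torch.Tensor:
        """cloud: (B, 3, N) unorganized points -> (B, num_points, 3) shape."""
        feat = self.point_mlp(cloud).max(dim=2).values   # order-invariant pooling
        code = torch.cat([self.id_head(feat), self.exp_head(feat)], dim=1)
        return self.decoder(code).view(cloud.size(0), -1, 3)
```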