MOON: A Mixed Objective Optimization Network for the Recognition of Facial Attributes
Attribute recognition, particularly of facial attributes, extracts many labels for each
image. While some multi-task vision problems can be decomposed into separate
tasks and stages, e.g., training independent models for each task, for a
growing set of problems joint optimization across all tasks has been shown to
improve performance. We show that for deep convolutional neural network (DCNN)
facial attribute extraction, joint multi-task optimization likewise outperforms separately trained models. Unfortunately,
it can be difficult to apply joint optimization to DCNNs when training data is
imbalanced, and re-balancing multi-label data directly is structurally
infeasible, since adding/removing data to balance one label will change the
sampling of the other labels. This paper addresses the multi-label imbalance
problem by introducing a novel mixed objective optimization network (MOON) with
a loss function that mixes multiple task objectives with domain adaptive
re-weighting of propagated loss. Experiments demonstrate that not only does
MOON advance the state of the art in facial attribute recognition, but it also
outperforms independently trained DCNNs using the same data. When using facial
attributes for the LFW face recognition task, we show that our balanced (domain
adapted) network outperforms the network trained on unbalanced data.
Comment: Post-print of manuscript accepted to the European Conference on
Computer Vision (ECCV) 2016
http://link.springer.com/chapter/10.1007%2F978-3-319-46454-1_
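The core idea of the MOON loss, a binary cross-entropy whose positive and negative terms are re-weighted per attribute to counter label imbalance, can be sketched roughly as follows. This is a minimal illustration under assumed conventions (sigmoid outputs, per-attribute positive rates as the balancing signal), not the authors' exact formulation:

```python
import numpy as np

def balanced_multilabel_bce(logits, targets, pos_frac):
    """Multi-label BCE with per-attribute re-weighting (illustrative sketch).

    Each attribute's positive/negative loss terms are scaled inversely to
    their frequency in the source data, so that in expectation both classes
    contribute equally to the propagated loss, without resampling the data.
    """
    p = 1.0 / (1.0 + np.exp(-logits))          # sigmoid per attribute
    w_pos = 0.5 / pos_frac                     # up-weight rare positives
    w_neg = 0.5 / (1.0 - pos_frac)             # up-weight rare negatives
    loss = -(w_pos * targets * np.log(p + 1e-12)
             + w_neg * (1 - targets) * np.log(1 - p + 1e-12))
    return loss.mean()
```

Here `logits` has shape (batch, n_attributes), `targets` is 0/1, and `pos_frac` holds each attribute's positive rate in the training set; a rare positive label (small `pos_frac`) contributes a larger per-example loss than a balanced one.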
Fast Face Detector Training Using Tailored Views
Face detection is an important task in computer vision and often serves as the first step for a variety of applications. State-of-the-art approaches use efficient learning algorithms and train on large amounts of manually labeled imagery. Acquiring appropriate training images, however, is very time-consuming and does not guarantee that the collected training data is representative in terms of data variability. Moreover, available data sets are often acquired under controlled settings, restricting, for example, scene illumination or 3D head pose to a narrow range. This paper examines the automated generation of adaptive training samples from a 3D morphable face model. Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. We use the tailored imagery to train a new many-core implementation of Viola and Jones' AdaBoost object detection framework. The new implementation is not only faster but also enables the use of multiple feature channels, such as color features, at training time. In our experiments we trained seven view-dependent face detectors and evaluated them on the Face Detection Data Set and Benchmark (FDDB). Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset.
Reflectance from images: a model-based approach for human faces
In this paper, we present an image-based framework that acquires the reflectance properties of a human face. A range scan of the face is not required. Based on a morphable face model, the system estimates the 3D shape and establishes point-to-point correspondence across images taken from different viewpoints and across different individuals' faces. This provides a common parameterization of all reconstructed surfaces that can be used to compare and transfer BRDF data between different faces. Shape estimation from images compensates for deformations of the face during the measurement process, such as facial expressions. In the common parameterization, regions of homogeneous materials on the face surface can be defined a priori. We apply analytical BRDF models to express the reflectance properties of each region, and we estimate their parameters in a least-squares fit from the image data. For each of the surface points, the diffuse component of the BRDF is locally refined, which provides high detail. We present results for multiple analytical BRDF models, rendered at novel orientations and lighting conditions.
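The per-region least-squares fit described above becomes a linear problem once the BRDF model is linear in its coefficients. As a rough sketch, assuming a simple Lambertian-plus-Phong model with a fixed shininess exponent (the paper fits more general analytical BRDF models):

```python
import numpy as np

def fit_region_brdf(intensities, n_dot_l, r_dot_v, shininess=20.0):
    """Least-squares fit of diffuse/specular coefficients for one region.

    With a fixed shininess exponent, the model
        I = kd * max(n.l, 0) + ks * max(r.v, 0)**shininess
    is linear in the two unknowns (kd, ks), so observed pixel intensities
    from multiple viewpoints/lights yield an overdetermined linear system.
    """
    A = np.column_stack([np.maximum(n_dot_l, 0.0),
                         np.maximum(r_dot_v, 0.0) ** shininess])
    (kd, ks), *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return kd, ks
```

Each sample pairs a measured intensity with the geometry terms (normal-to-light and reflection-to-view dot products) known from the fitted morphable model.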
Removing pose from face images
This paper proposes a novel approach to pose removal from face images based on the inherent symmetry that is present in faces. In order for face recognition systems and expression classification systems to operate optimally, subjects must look directly into the camera. The removal of pose from face images after their capture removes this restriction. To obtain a pose-removed face image, the frequency components at each position of the face image, obtained through a wavelet transformation, are examined. A cost function based on the symmetry of this wavelet-transformed face image is minimized to achieve pose removal. Experimental results are presented that demonstrate that the proposed algorithm improves upon existing techniques in the literature.
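The symmetry cost at the heart of this approach can be illustrated in simplified form. This sketch substitutes a 2D FFT for the paper's wavelet transform, which is an assumption made only to keep the example short:

```python
import numpy as np

def symmetry_cost(image):
    """Frequency-domain left-right symmetry cost of a face image (sketch).

    A perfectly frontal (left-right symmetric) face gives cost 0;
    pose-induced asymmetry raises the cost, so pose removal can be framed
    as minimising this value over a warp of the input image.
    """
    spectrum = np.abs(np.fft.fft2(image))
    flipped = np.abs(np.fft.fft2(image[:, ::-1]))   # mirror about vertical axis
    return float(np.mean((spectrum - flipped) ** 2))
```

Comparing transform magnitudes of the image and its horizontal mirror makes the cost insensitive to small translations while still penalising asymmetry.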
Fitting a 3D Morphable Model to Edges: A Comparison Between Hard and Soft Correspondences
We propose a fully automatic method for fitting a 3D morphable model to
single face images in arbitrary pose and lighting. Our approach relies on
geometric features (edges and landmarks) and, inspired by the iterated closest
point algorithm, is based on computing hard correspondences between model
vertices and edge pixels. We demonstrate that this is superior to previous work
that uses soft correspondences to form an edge-derived cost surface that is
minimised by nonlinear optimisation.
Comment: To appear in ACCV 2016 Workshop on Facial Informatic
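The hard-correspondence step, assigning each projected model vertex to its single nearest edge pixel in the spirit of iterated closest points, can be sketched as follows (an illustrative brute-force version; the paper's actual matching and subsequent pose/shape solve are more involved):

```python
import numpy as np

def hard_correspondences(model_points, edge_pixels):
    """Hard nearest-neighbour assignment of model points to edge pixels.

    Each projected model vertex is matched to exactly one image edge pixel
    (a hard assignment), rather than blending distances to all edges into
    a soft cost surface. The matched pairs then feed a least-squares
    pose/shape update, after which the assignment is recomputed.
    """
    # Pairwise squared distances, shape (n_model, n_edge).
    d2 = ((model_points[:, None, :] - edge_pixels[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)
    return edge_pixels[nearest], np.sqrt(d2.min(axis=1))
```

Alternating this assignment with a fitting step is exactly the ICP pattern the abstract alludes to.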
Automatic 3D facial model and texture reconstruction from range scans
This paper presents a fully automatic approach to fitting a generic facial model to detailed range scans of human faces to reconstruct 3D facial models and textures with no manual intervention (such as specifying landmarks). A Scaling Iterative Closest Points (SICP) algorithm is introduced to compute the optimal rigid registrations between the generic model and range scans of different sizes. A new template-fitting method, formulated in an optimization framework that minimizes a physically based elastic energy derived from thin shells, then faithfully reconstructs the surfaces and textures from the range scans and yields dense point correspondences across the reconstructed facial models. Finally, we demonstrate a facial expression transfer method that clones facial expressions from the generic model onto the reconstructed facial models using the deformation transfer technique.
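The registration step of a scaling ICP variant needs a closed-form alignment that recovers scale as well as rotation and translation from matched point sets. A standard choice for this sub-problem is Umeyama's method, sketched here (the paper's full SICP alternates this solve with re-matching closest points):

```python
import numpy as np

def scaled_rigid_align(src, dst):
    """Closed-form similarity alignment of matched point sets (Umeyama).

    Finds scale s, rotation R and translation t minimising
    ||s * R @ src_i + t - dst_i||^2 over all matched pairs.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    X, Y = src - mu_s, dst - mu_d                 # centered point sets
    U, S, Vt = np.linalg.svd(Y.T @ X)             # SVD of cross-covariance
    D = np.eye(src.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))    # guard against reflection
    R = U @ D @ Vt
    s = (S * np.diag(D)).sum() / (X ** 2).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t
```

The explicit scale factor is what lets the generic template be registered against scans of different physical sizes before the nonrigid template fitting.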
3D Morphable Face Models -- Past, Present and Future
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications
FSNet: An Identity-Aware Generative Model for Image-based Face Swapping
This paper presents FSNet, a deep generative model for image-based face
swapping. Traditionally, face-swapping methods are based on three-dimensional
morphable models (3DMMs), and facial textures are replaced between the
estimated three-dimensional (3D) geometries in two images of different
individuals. However, the estimation of 3D geometries along with different
lighting conditions using 3DMMs is still a difficult task. We herein represent
the face region with a latent variable that is assigned with the proposed deep
neural network (DNN) instead of facial textures. The proposed DNN synthesizes a
face-swapped image using the latent variable of the face region and another
image of the non-face region. The proposed method requires no 3DMM fitting;
it performs face swapping simply by feeding two face images to the proposed
network. Consequently, our DNN-based face swapping performs
better than previous approaches for challenging inputs with different face
orientations and lighting conditions. Through several experiments, we
demonstrated that the proposed method performs face swapping in a more stable
manner than the state-of-the-art method, and that its results are comparable
to those of that method.
Comment: 20 pages, Asian Conference on Computer Vision 201
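The data flow described above, separate latent codes for the face region and the non-face region, recombined by a single decoder, can be sketched schematically. The linear maps below are placeholders for the deep encoders/decoder of the actual model; shapes and names are assumptions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in linear "networks"; in the real model these are deep networks.
enc_face = rng.normal(size=(64, 4096))   # face region    -> latent z_face
enc_ctx = rng.normal(size=(64, 4096))    # non-face region -> latent z_ctx
dec = rng.normal(size=(4096, 128))       # (z_face, z_ctx) -> output image

def swap_faces(face_a, nonface_b):
    """Compose the face identity of image A with the context of image B."""
    z_face = enc_face @ face_a           # identity latent from image A
    z_ctx = enc_ctx @ nonface_b          # hair/background latent from image B
    return dec @ np.concatenate([z_face, z_ctx])
```

The point of the structure is that swapping reduces to pairing latents from two different inputs, with no per-image 3D fitting anywhere in the pipeline.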
Towards Pose-Invariant 2D Face Classification for Surveillance
A key problem for "face in the crowd" recognition from existing surveillance cameras in public spaces (such as mass transit centres) is the issue of pose mismatches between probe and gallery faces. In addition to accuracy, scalability is also important, necessarily limiting the complexity of face classification algorithms. In this paper we evaluate recent approaches to the recognition of faces at relatively large pose angles from a gallery of frontal images and propose novel adaptations as well as modifications. Specifically, we compare and contrast the accuracy, robustness and speed of an Active Appearance Model (AAM) based method (where realistic frontal faces are synthesized from non-frontal probe faces) against bag-of-features methods (which are local feature approaches based on block Discrete Cosine Transforms and Gaussian Mixture Models). We show a novel approach where the AAM based technique is sped up by directly obtaining pose-robust features, allowing the omission of the computationally expensive and artefact-producing image synthesis step. Additionally, we adapt a histogram-based bag-of-features technique to face classification and contrast its properties to a previously proposed direct bag-of-features method. We also show that the two bag-of-features approaches can be considerably sped up, without a loss in classification accuracy, via an approximation of the exponential function. Experiments on the FERET and PIE databases suggest that the bag-of-features techniques generally attain better performance, with significantly lower computational loads. The histogram-based bag-of-features technique is capable of achieving an average recognition accuracy of 89% for pose angles of around 25 degrees.
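One common way to approximate the exponential cheaply, useful inside Gaussian likelihood evaluations of the kind the bag-of-features methods rely on, is the limit form exp(x) ≈ (1 + x/2^n)^(2^n), computed by repeated squaring. This is an illustrative choice; the paper's exact approximation scheme may differ:

```python
import numpy as np

def fast_exp(x, n=8):
    """Approximate exp(x) via (1 + x/2^n)**(2^n) with repeated squaring.

    Needs only one division and n multiplications instead of a full
    transcendental evaluation; relative error is roughly x**2 / 2**(n+1),
    so it is accurate for the moderate exponents seen in Gaussian terms.
    (Assumes x > -2**n so the base stays positive.)
    """
    y = 1.0 + x / (1 << n)
    for _ in range(n):
        y = y * y                  # square n times: y -> y**(2**n)
    return y
```

Because the body is pure arithmetic, the same code vectorises over numpy arrays of exponents, which is where the speedup matters when scoring many Gaussian mixture components per feature.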