Face recognition with the RGB-D sensor
Face recognition in unconstrained environments is still a challenge because of the many variations of facial appearance due to changes in head pose, lighting conditions, facial expression, age, etc. This work addresses the problem of face recognition in the presence of 2D facial appearance variations caused by 3D head rotations. It explores the advantages of recently developed consumer-level RGB-D cameras (e.g. Kinect). These cameras provide color and depth images at the same rate. They are affordable and easy to use, but their depth images are noisy and of low resolution, unlike laser-scanned depth images. The proposed approach to face recognition is able to deal with large head pose variations using RGB-D face images. The method uses the depth information to correct the pose of the face. It does not need to learn a generic face model or perform complex 3D-2D registrations. It is simple and fast, yet able to deal with large pose variations and perform pose-invariant face recognition. Experiments on a public database show that the presented approach is effective and efficient under significant pose changes. The idea is also used to develop face recognition software that achieves real-time face recognition in the presence of large yaw rotations using the Kinect sensor, demonstrating in real time how this method improves recognition accuracy and confidence. This study demonstrates that RGB-D sensors are a promising tool that can lead to the development of robust pose-invariant face recognition systems under large pose variations.
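The core of the depth-based pose correction can be sketched in a few lines: back-project the depth image into a 3D point cloud using the camera intrinsics, then rotate the cloud back to a frontal view before matching in 2D. This is a minimal illustration under assumed inputs (intrinsics fx, fy, cx, cy and a yaw estimate), not the paper's implementation:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth image into a 3D point cloud (camera coordinates)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(float)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def correct_pose(points, yaw):
    """Rotate the face point cloud by -yaw about the vertical (y) axis,
    bringing a yawed face back toward a frontal view."""
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])
    return points @ R.T
```

The frontalized cloud can then be re-projected to a 2D image and fed to any standard 2D face matcher, which is what makes the scheme fast: no generic face model is fitted.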
Facial emotion recognition using min-max similarity classifier
Recognition of human emotions from imaging templates is useful in a wide
variety of human-computer interaction and intelligent systems applications.
However, automatic recognition of facial expressions using image template
matching techniques suffers from natural variability in facial features and
recording conditions. Despite the progress achieved in facial emotion
recognition in recent years, an effective and computationally simple feature
selection and classification technique for emotion recognition remains an open
problem. In this paper, we propose an efficient and straightforward facial
emotion recognition algorithm to reduce the problem of inter-class pixel
mismatch during classification. The proposed method includes the application of
pixel normalization to remove intensity offsets, followed by a Min-Max
metric in a nearest neighbor classifier capable of suppressing feature
outliers. The results indicate an improvement of recognition performance from
92.85% to 98.57% for the proposed Min-Max classification method when tested on
the JAFFE database. The proposed emotion recognition technique outperforms
existing template matching methods.
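A plausible reading of the pipeline above, pixel normalization followed by a Min-Max metric inside a nearest neighbor classifier, can be sketched as below. The exact normalization used in the paper may differ, so treat this as an assumption-laden illustration rather than the authors' code:

```python
import numpy as np

def normalize(img):
    """Pixel normalization: remove the intensity offset and scale to [0, 1]."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-12)

def min_max_similarity(a, b):
    """Min-Max metric: ratio of summed elementwise minima to summed maxima.
    Equals ~1 for identical vectors and shrinks as they diverge; an outlier
    pixel inflates the denominator only linearly, which damps its influence
    compared with a squared (Euclidean) distance."""
    return np.minimum(a, b).sum() / (np.maximum(a, b).sum() + 1e-12)

def classify(probe, templates, labels):
    """Nearest neighbor classification under the Min-Max similarity."""
    p = normalize(probe).ravel()
    sims = [min_max_similarity(p, normalize(t).ravel()) for t in templates]
    return labels[int(np.argmax(sims))]
```

A usage example: `classify(test_face, training_faces, training_labels)` returns the label of the most Min-Max-similar training template.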
Fast and Accurate 3D Face Recognition Using Registration to an Intrinsic Coordinate System and Fusion of Multiple Region Classifiers
In this paper we present a new robust approach for 3D face registration to an intrinsic coordinate system of the face. The intrinsic coordinate system is defined by the vertical symmetry plane through the nose, the tip of the nose and the slope of the bridge of the nose. In addition, we propose a 3D face classifier based on the fusion of many dependent region classifiers for overlapping face regions. The region classifiers use PCA-LDA for feature extraction and the likelihood ratio as a matching score. Fusion is realised using straightforward majority voting for the identification scenario. For verification, a voting approach is used as well, and the decision is made by comparing the number of votes to a threshold. Using the proposed registration method combined with a classifier consisting of 60 fused region classifiers, we obtain a 99.0% identification rate on the all vs first identification test of the FRGC v2 data. A verification rate of 94.6% at FAR=0.1% was obtained for the all vs all verification test on the FRGC v2 data using fusion of 120 region classifiers. The first is the highest reported performance and the second is in the top 5 of best-performing systems on these tests. In addition, our approach is much faster than other methods, taking only 2.5 seconds per image for registration and less than 0.1 ms per comparison. Because we apply feature extraction using PCA and LDA, the resulting template size is also very small: 6 kB for 60 region classifiers.
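The fusion rules described above, majority voting over region classifiers for identification and vote counting against a threshold for verification, are simple enough to sketch directly. The region classifiers themselves (PCA-LDA features, likelihood-ratio scores) are abstracted away here; each is assumed to emit an identity decision or a same/different vote:

```python
from collections import Counter

def fuse_identification(region_votes):
    """Identification: each region classifier votes for an identity;
    the fused decision is the identity with the most votes."""
    return Counter(region_votes).most_common(1)[0][0]

def fuse_verification(same_votes, threshold):
    """Verification: each region classifier votes 1 (same person) or 0;
    accept the claimed identity when the vote count reaches the threshold."""
    return sum(same_votes) >= threshold
```

The threshold in the verification rule is what trades off the verification rate against the false accept rate (e.g. the FAR=0.1% operating point quoted above).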
Generative Cooperative Net for Image Generation and Data Augmentation
How to build a good model for image generation given an abstract concept is a
fundamental problem in computer vision. In this paper, we explore a generative
model for the task of generating unseen images with desired features. We
propose the Generative Cooperative Net (GCN) for image generation. The idea is
similar to generative adversarial networks, except that the generator and the
classifier are trained to cooperate rather than compete. Our experiments on hand-written
digit generation and facial expression generation show that GCN's two
cooperative counterparts (the generator and the classifier) can work together
nicely and achieve promising results. We also discovered a use of such a
generative model as a data-augmentation tool. Our experiment applying this
method to a recognition task shows that it is very effective compared to other
existing methods. It is easy to set up and can help generate a very large
synthesized dataset.
Comment: 12 pages, 8 figures
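The cooperative training idea can be illustrated with a deliberately tiny toy: a linear "generator" and a linear "classifier" descend the same joint loss (generation error plus classification error), so their gradients reinforce each other instead of opposing as in a GAN. Everything below (dimensions, learning rate, linear models) is an invented toy, not the GCN architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: the "generator" maps a one-hot class label to an image vector,
# the "classifier" maps images back to class scores. Both minimise the SAME
# joint loss, so training is cooperative rather than adversarial.
n_classes, img_dim = 3, 8
G = rng.normal(scale=0.1, size=(img_dim, n_classes))   # generator weights
C = rng.normal(scale=0.1, size=(n_classes, img_dim))   # classifier weights
targets = rng.normal(size=(img_dim, n_classes))        # desired image per class

def loss_and_grads(G, C):
    X = G                                  # generated image for each one-hot label
    recon = X - targets                    # generation error
    cls = C @ X - np.eye(n_classes)        # classification error vs true labels
    L = (recon ** 2).mean() + (cls ** 2).mean()
    dG = 2 * recon / recon.size + 2 * (C.T @ cls) / cls.size
    dC = 2 * (cls @ X.T) / cls.size
    return L, dG, dC

losses = []
for _ in range(500):
    L, dG, dC = loss_and_grads(G, C)
    G -= 0.1 * dG                          # both players descend the same loss
    C -= 0.1 * dC
    losses.append(L)
```

Because both sets of weights descend one shared objective, the loss decreases monotonically; there is no minimax oscillation to manage, which is the practical appeal of the cooperative formulation.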
Generating 3D faces using Convolutional Mesh Autoencoders
Learned 3D representations of human faces are useful for computer vision
problems such as 3D face tracking and reconstruction from images, as well as
graphics applications such as character generation and animation. Traditional
models learn a latent representation of a face using linear subspaces or
higher-order tensor generalizations. Due to this linearity, they cannot
capture extreme deformations and non-linear expressions. To address this, we
introduce a versatile model that learns a non-linear representation of a face
using spectral convolutions on a mesh surface. We introduce mesh sampling
operations that enable a hierarchical mesh representation that captures
non-linear variations in shape and expression at multiple scales within the
model. In a variational setting, our model samples diverse realistic 3D faces
from a multivariate Gaussian distribution. Our training data consists of 20,466
meshes of extreme expressions captured over 12 different subjects. Despite
limited training data, our trained model outperforms state-of-the-art face
models with 50% lower reconstruction error, while using 75% fewer parameters.
We also show that replacing the expression space of an existing
state-of-the-art face model with our autoencoder achieves a lower
reconstruction error. Our data, model and code are available at
http://github.com/anuragranj/com
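A spectral convolution on a mesh can be sketched, in the spirit of the model above, as a Chebyshev polynomial of the normalised graph Laplacian applied to per-vertex features. The truncation order, feature dimensions and eigenvalue rescaling below are illustrative assumptions, not the paper's exact operator:

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalised graph Laplacian L = I - D^{-1/2} A D^{-1/2}
    of the mesh's vertex adjacency matrix."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return np.eye(len(adj)) - adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def chebyshev_conv(x, L, weights):
    """Spectral convolution y = sum_k T_k(L~) x W_k, where T_k are Chebyshev
    polynomials and L~ = L - I rescales the spectrum to roughly [-1, 1]
    (assuming eigenvalues of L lie in [0, 2]). x is (vertices, features),
    weights is a list of K (in_features, out_features) matrices."""
    L_hat = L - np.eye(len(L))
    t_prev, t_cur = x, L_hat @ x
    out = t_prev @ weights[0] + t_cur @ weights[1]
    for k in range(2, len(weights)):
        t_prev, t_cur = t_cur, 2 * L_hat @ t_cur - t_prev  # Chebyshev recurrence
        out = out + t_cur @ weights[k]
    return out
```

Because T_k(L~) only mixes vertices up to k edges apart, the operation is localised on the mesh surface, which is what lets stacked layers capture shape and expression variation at multiple scales.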
Fast Face-swap Using Convolutional Neural Networks
We consider the problem of face swapping in images, where an input identity
is transformed into a target identity while preserving pose, facial expression,
and lighting. To perform this mapping, we use convolutional neural networks
trained to capture the appearance of the target identity from an unstructured
collection of his/her photographs. This approach is enabled by framing the face
swapping problem in terms of style transfer, where the goal is to render an
image in the style of another one. Building on recent advances in this area, we
devise a new loss function that enables the network to produce highly
photorealistic results. By combining neural networks with simple pre- and
post-processing steps, we aim to make face swapping work in real time with no
input from the user.
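The style-transfer framing suggests a loss with a content term (preserve the pose and expression of the input) and a style term (match appearance statistics of the target identity), commonly expressed via Gram matrices of feature maps. The sketch below is a generic neural style-transfer loss under that assumption, not the paper's actual loss function:

```python
import numpy as np

def gram(features):
    """Gram matrix of a (channels, positions) feature map: captures style
    (appearance statistics) independently of spatial layout."""
    f = features.reshape(features.shape[0], -1)
    return f @ f.T / f.shape[1]

def swap_loss(gen_feats, content_feats, style_feats, style_weight=1.0):
    """Content term keeps pose/expression of the input face; style term
    pulls appearance toward the target identity's photographs."""
    content = ((gen_feats - content_feats) ** 2).mean()
    style = ((gram(gen_feats) - gram(style_feats)) ** 2).mean()
    return content + style_weight * style
```

In classic style transfer this loss is minimised per image by optimisation; training a feed-forward network against it instead is what makes real-time inference plausible.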