Synthesizing Normalized Faces from Facial Identity Features
We present a method for synthesizing a frontal, neutral-expression image of a
person's face given an input face photograph. This is achieved by learning to
generate facial landmarks and textures from features extracted from a
facial-recognition network. Unlike previous approaches, our encoding feature
vector is largely invariant to lighting, pose, and facial expression.
Exploiting this invariance, we train our decoder network using only frontal,
neutral-expression photographs. Since these photographs are well aligned, we
can decompose them into a sparse set of landmark points and aligned texture
maps. The decoder then predicts landmarks and textures independently and
combines them using a differentiable image warping operation. The resulting
images can be used for a number of applications, such as analyzing facial
attributes, exposure and white balance adjustment, or creating a 3-D avatar.
Elimination of Glass Artifacts and Object Segmentation
Many images are now captured through glass and may contain stains or other
artifacts caused by the glass, so they must be processed to distinguish the
glass from the objects behind it. This paper proposes an algorithm that removes
the damaged or corrupted part of an image, makes it consistent with the rest of
the image, and segments the objects behind the glass. The damaged part is
removed using the total variation inpainting method, and segmentation is done
using k-means clustering, anisotropic diffusion, and watershed transformation.
The final output is obtained by interpolation. The algorithm can be useful in
applications where part of an image is corrupted during data transmission or
where objects must be segmented from an image for further processing.
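The k-means stage of such a segmentation pipeline might look roughly like the sketch below. The abstract does not state which pixel features, number of clusters, or initialization are used, so the choices here (scalar gray levels, k = 2, evenly spaced starting centers) and the names `kmeans_1d` and `segment` are assumptions made for illustration. The inpainting, diffusion, and watershed stages are not shown.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar intensities into k groups with Lloyd's algorithm."""
    lo, hi = min(values), max(values)
    # Assumed deterministic start: spread the centers evenly over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return labels, centers


def segment(image, k=2):
    """Label every pixel of a 2-D list of gray levels with a cluster index."""
    width = len(image[0])
    flat = [v for row in image for v in row]
    labels, centers = kmeans_1d(flat, k)
    rows = [labels[i:i + width] for i in range(0, len(flat), width)]
    return rows, centers


# Toy 3x3 image: dark background with a bright diagonal region.
toy = [[10, 12, 200], [11, 198, 205], [9, 13, 202]]
label_map, centers = segment(toy, k=2)
```

On the toy image the dark and bright pixels end up in separate clusters; a real image would typically need color or texture features and more clusters.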
Sparse optical flow regularisation for real-time visual tracking
Optical flow can greatly improve the robustness of visual tracking algorithms. While dense optical flow algorithms have various applications, they cannot be used for real-time solutions without resorting to GPU calculations. Furthermore, most optical flow algorithms fail in challenging lighting environments because the brightness constraint is violated. We propose a simple but effective iterative regularisation scheme for real-time, sparse optical flow algorithms that is shown to be robust to sudden illumination changes and can handle large displacements. The algorithm outperforms well-known techniques on real-life video sequences while being much faster to calculate. Our solution increases the robustness of a real-time, particle-filter-based tracking application while consuming only a fraction of the available CPU power. Furthermore, a new and realistic optical flow dataset with annotated ground truth is created and made freely available for research purposes.
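For context, the brightness constraint mentioned above is what a standard sparse method such as Lucas-Kanade solves in a small window around each tracked point. The sketch below is that textbook baseline for a single window, not the regularisation scheme proposed in the paper, whose details the abstract does not give. The function name `lucas_kanade`, the window size, and the synthetic test frames are assumptions.

```python
def lucas_kanade(prev, curr, cx, cy, half=5):
    """Estimate the flow (u, v) of the window centered at (cx, cy).

    prev and curr are 2-D lists of gray levels indexed as image[y][x].
    Returns None when the window has too little texture to solve for both
    components (the aperture problem).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # Central differences in space, frame difference in time.
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    # Least-squares solution of ix*u + iy*v = -it over the window.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Synthetic frames: the pattern f(x, y) = x*y moves by (0.2, -0.1) pixels.
size = 21
prev = [[float(x * y) for x in range(size)] for y in range(size)]
curr = [[(x - 0.2) * (y + 0.1) for x in range(size)] for y in range(size)]
flow = lucas_kanade(prev, curr, cx=10, cy=10)
```

Because the constraint assumes that a pixel keeps its gray level between frames, a sudden illumination change adds a spurious term to `it` and biases the estimate, which is the failure mode the abstract refers to.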
Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.
Comment: Accepted at ICCV 201
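One way to picture a feature transform layer acting on a hidden representation is sketched below. The abstract does not describe the layer, so this example assumes that the hidden vector is split into consecutive pairs and that each pair is rotated by the known transformation angle. The name `feature_transform` and the pairing scheme are hypothetical.

```python
import math


def feature_transform(features, theta):
    """Rotate consecutive (a, b) pairs of a feature vector by theta radians."""
    if len(features) % 2:
        raise ValueError("feature vector must have an even length")
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for i in range(0, len(features), 2):
        a, b = features[i], features[i + 1]
        # Standard 2-D rotation applied to one pair of hidden units.
        out.extend([c * a - s * b, s * a + c * b])
    return out


code = [1.0, 0.0, 0.0, 1.0]          # stand-in for an encoder output
quarter = feature_transform(code, math.pi / 2)
back = feature_transform(quarter, -math.pi / 2)
two_steps = feature_transform(feature_transform(code, 0.3), 0.4)
one_step = feature_transform(code, 0.7)
```

Under this assumption, transformations compose in feature space the same way they do in image space: applying angles 0.3 and 0.4 in sequence gives the same code as applying 0.7 once, and the inverse angle restores the original code.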
Nonlinear Supervised Dimensionality Reduction via Smooth Regular Embeddings
The recovery of the intrinsic geometric structures of data collections is an
important problem in data analysis. Supervised extensions of several manifold
learning approaches have been proposed in recent years. However, existing
methods primarily focus on the embedding of the training data, and the
generalization of the embedding to initially unseen test data is largely
ignored. In this work, we build on recent theoretical results on the
generalization performance of supervised manifold learning algorithms.
Motivated by these performance bounds, we propose a supervised manifold
learning method that computes a nonlinear embedding while constructing a smooth
and regular interpolation function that extends the embedding to the whole data
space in order to achieve satisfactory generalization. The embedding and the
interpolator are jointly learnt such that the Lipschitz regularity of the
interpolator is imposed while ensuring the separation between different
classes. Experimental results on several image data sets show that the proposed
method outperforms traditional classifiers and the compared supervised
dimensionality reduction algorithms in classification accuracy in most
settings.
Polar Fusion Technique Analysis for Evaluating the Performances of Image Fusion of Thermal and Visual Images for Human Face Recognition
This paper presents a comparative study of two methods based on fusion and
polar transformation of visual and thermal images. The study addresses the
challenges of face recognition, which include pose variations, changes in
facial expression, partial occlusions, variations in illumination, rotation
through different angles, and changes in scale. To overcome these obstacles, we
implemented two fusion techniques and examined them through rigorous
experimentation. In the first method, log-polar transformation is applied to
the fused images obtained by fusing the visual and thermal images, whereas in
the second method, fusion is applied to the individually log-polar-transformed
visual and thermal images. In either case, Principal Component Analysis (PCA)
is then applied to reduce the dimension of the fused images. Log-polar
transformed images can handle the complications introduced by scaling and
rotation. The main objective of fusion is to produce a fused image that
provides more detailed and reliable information and can overcome the drawbacks
of the individual visual and thermal face images. Finally, the reduced fused
images are classified using a multilayer perceptron neural network. The
experiments use the benchmark thermal and visual face images of the Object
Tracking and Classification Beyond Visible Spectrum (OTCBVS) database. The
second method performed better, with a maximum correct recognition rate of
95.71% and an average of 93.81%.
Comment: Proceedings of IEEE Workshop on Computational Intelligence in
Biometrics and Identity Management (IEEE CIBIM 2011), Paris, France, April 11
- 15, 201
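The reason log-polar images tolerate scaling and rotation can be seen on single coordinates: in log-polar space both operations become plain shifts. The sketch below illustrates only that property on one point. It is not the paper's image pipeline, and the helper names `to_log_polar` and `scale_and_rotate` are assumptions.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a point to (log radius, angle) about the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        raise ValueError("the center has no log-polar coordinates")
    return math.log(r), math.atan2(dy, dx)


def scale_and_rotate(x, y, scale, angle):
    """Scale a point about the origin and rotate it by angle radians."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


rho0, phi0 = to_log_polar(3.0, 4.0)
x1, y1 = scale_and_rotate(3.0, 4.0, scale=2.0, angle=0.5)
rho1, phi1 = to_log_polar(x1, y1)

# Scaling by 2 shifts the first coordinate by log(2);
# rotating by 0.5 rad shifts the second coordinate by 0.5.
rho_shift = rho1 - rho0
phi_shift = phi1 - phi0
```

A classifier fed log-polar images therefore sees a scaled or rotated face as a translated pattern, which is generally easier to handle than the original distortion.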