FML: Face Model Learning from Videos
Monocular image-based 3D reconstruction of faces is a long-standing problem
in computer vision. Since image data is a 2D projection of a 3D face, the
resulting depth ambiguity makes the problem ill-posed. Most existing methods
rely on data-driven priors that are built from limited 3D face scans. In
contrast, we propose multi-frame video-based self-supervised training of a deep
network that (i) learns a face identity model both in shape and appearance
while (ii) jointly learning to reconstruct 3D faces. Our face model is learned
using only corpora of in-the-wild video clips collected from the Internet. This
virtually endless source of training data enables learning of a highly general
3D face model. In order to achieve this, we propose a novel multi-frame
consistency loss that ensures consistent shape and appearance across multiple
frames of a subject's face, thus minimizing depth ambiguity. At test time we
can use an arbitrary number of frames, so that we can perform both monocular as
well as multi-frame reconstruction.
Comment: CVPR 2019 (Oral). Video: https://www.youtube.com/watch?v=SG2BwxCw0lQ,
Project Page: https://gvv.mpi-inf.mpg.de/projects/FML19
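The multi-frame consistency idea can be sketched in a few lines: if a network predicts a per-frame identity code for each frame of the same subject, the codes should agree, so each frame's deviation from the shared mean can be penalized. This is a minimal illustrative sketch; the function names and the squared-deviation loss form are assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a multi-frame identity-consistency loss.
# Per-frame identity codes for the same subject should agree, so we
# penalize each frame's deviation from the across-frame mean code.
# (Names and the exact loss form are illustrative, not the paper's.)

def mean_code(codes):
    """Element-wise mean of a list of equal-length identity codes."""
    n = len(codes)
    return [sum(c[i] for c in codes) / n for i in range(len(codes[0]))]

def multi_frame_consistency_loss(codes):
    """Mean squared deviation of each frame's code from the shared mean."""
    mu = mean_code(codes)
    total = 0.0
    for c in codes:
        total += sum((ci - mi) ** 2 for ci, mi in zip(c, mu))
    return total / (len(codes) * len(mu))

# Identical codes across frames give zero loss; disagreement is penalized.
frames_consistent = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
frames_noisy = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
assert multi_frame_consistency_loss(frames_consistent) == 0.0
assert multi_frame_consistency_loss(frames_noisy) > 0.0
```

Because the loss is zero only when every frame agrees, minimizing it pushes the network toward a single identity estimate per subject, which is what resolves the monocular depth ambiguity the abstract describes.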
k-Same-Siamese-GAN: k-Same Algorithm with Generative Adversarial Network for Facial Image De-identification with Hyperparameter Tuning and Mixed Precision Training
A data holder, such as a hospital or a government entity, may hold a private
collection of personal data in which revealing and/or processing personally
identifiable information is restricted or prohibited by law. The question
"how can the data holder conceal the identity of each individual in the
imagery of personal data while still preserving certain useful aspects of the
data after de-identification?" then becomes a challenging issue.
In this work, we propose an approach towards high-resolution facial image
de-identification, called k-Same-Siamese-GAN, which leverages the
k-Same-Anonymity mechanism, the Generative Adversarial Network, and the
hyperparameter tuning methods. Moreover, mixed precision training is applied
to speed up model training and reduce memory consumption, so that kSS-GAN both
provides privacy-protection guarantees on close-form identities and trains much
more efficiently. Finally, to validate its applicability, the proposed approach
is evaluated on two real datasets, RafD and CelebA. Besides protecting the
privacy of high-resolution facial images, the proposed system also automates
parameter tuning and overcomes the limitation on the number of adjustable
parameters.
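The k-Same-Anonymity mechanism the abstract builds on can be sketched independently of the GAN: each group of k face representations is replaced by the group average, so any de-identified output maps back to at least k possible originals. This is a simplified illustration under stated assumptions (face images reduced to flat feature vectors, dataset size divisible by k); the function name is hypothetical.

```python
def k_same(faces, k):
    """k-Same anonymity sketch: replace each group of k face vectors
    with the group average, so any de-identified face corresponds to
    at least k originals. Assumes len(faces) is divisible by k."""
    out = []
    for start in range(0, len(faces), k):
        group = faces[start:start + k]
        avg = [sum(f[i] for f in group) / k for i in range(len(group[0]))]
        out.extend([avg[:] for _ in group])  # every member gets the average
    return out

# Two groups of two faces each collapse onto their group means.
faces = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
assert k_same(faces, 2) == [[1.0, 1.0], [1.0, 1.0], [5.0, 5.0], [5.0, 5.0]]
```

In kSS-GAN the naive averaging step is replaced by GAN-based synthesis, which is what recovers high-resolution, natural-looking outputs while keeping the k-anonymity guarantee.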
Signature Verification Using Siamese Convolutional Neural Networks
This research describes the process of building a Siamese Neural Network for signature verification. The network, which uses two similar base neural networks as its underlying architecture, was built, trained, and evaluated in this project. The base networks were two similar convolutional neural networks sharing the same weights during training. This architecture, commonly known as a Siamese network, reduced the amount of training data needed for its implementation and thereby increased the model's efficiency by 13%. Each convolutional branch was made up of three convolutional layers, three pooling layers, and one fully connected layer, whose final outputs were passed to a contrastive loss function for comparison. A threshold function then determined whether a signature was forged. An initial accuracy of 78% led to tweaking and improvement of the model, which achieved a better prediction accuracy of 93%.
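The contrastive loss and threshold decision described above can be sketched as follows. This is a generic textbook formulation operating on the embeddings the twin branches would produce; the margin and threshold values are illustrative assumptions, not the values used in this work.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb1, emb2, same, margin=1.0):
    """Pull genuine pairs (same=True) together; push forged pairs
    (same=False) apart until they are at least `margin` away."""
    d = euclidean(emb1, emb2)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def is_forged(emb1, emb2, threshold=0.5):
    """Threshold on embedding distance decides genuine vs forged."""
    return euclidean(emb1, emb2) > threshold

# Identical embeddings: zero loss for a genuine pair, judged genuine.
assert contrastive_loss([0.0, 0.0], [0.0, 0.0], same=True) == 0.0
assert not is_forged([0.0, 0.0], [0.1, 0.0])
# Distant embeddings beyond the margin: zero loss for a forged pair.
assert contrastive_loss([0.0, 0.0], [2.0, 0.0], same=False) == 0.0
assert is_forged([0.0, 0.0], [1.0, 0.0])
```

Because both branches share weights, a single embedding function is learned, and only pairwise distances (not class labels) are needed, which is why the Siamese setup reduces the training data requirement.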
A Taxonomy of Deep Convolutional Neural Nets for Computer Vision
Traditional architectures for solving computer vision problems and the degree
of success they enjoyed have been heavily reliant on hand-crafted features.
However, of late, deep learning techniques have offered a compelling
alternative -- that of automatically learning problem-specific features. With
this new paradigm, every problem in computer vision is now being re-examined
from a deep learning perspective. Therefore, it has become important to
understand what kind of deep networks are suitable for a given problem.
Although general surveys of this fast-moving paradigm (i.e., deep networks)
exist, a survey specific to computer vision is missing. We specifically
consider one form of deep network widely used in computer vision:
convolutional neural networks (CNNs). We start with "AlexNet" as our base CNN
and then examine the broad variations proposed over time to suit different
applications. We hope that our recipe-style survey will serve as a guide,
particularly for novice practitioners intending to use deep-learning techniques
for computer vision.
Comment: Published in Frontiers in Robotics and AI (http://goo.gl/6691Bm)
LEED: Label-Free Expression Editing via Disentanglement
Recent studies on facial expression editing have made very promising
progress. However, existing methods face the constraint of requiring
a large amount of expression labels which are often expensive and
time-consuming to collect. This paper presents an innovative label-free
expression editing via disentanglement (LEED) framework that is capable of
editing the expression of both frontal and profile facial images without
requiring any expression label. The idea is to disentangle the identity and
expression of a facial image in the expression manifold, where the neutral face
captures the identity attribute and the displacement between the neutral image
and the expressive image captures the expression attribute. Two novel losses
are designed for optimal expression disentanglement and consistent synthesis,
including a mutual expression information loss that aims to extract pure
expression-related features and a siamese loss that aims to enhance the
expression similarity between the synthesized image and the reference image.
Extensive experiments over two public facial expression datasets show that LEED
achieves superior facial expression editing both qualitatively and quantitatively.
Comment: Accepted to ECCV 2020
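The disentanglement idea in the abstract, where the expression attribute is the displacement between the neutral and expressive embeddings, can be sketched as below, together with a siamese-style loss comparing the synthesized image's expression to the reference's. The function names and the squared-distance form are illustrative assumptions, not LEED's actual losses.

```python
def expression_vector(expressive, neutral):
    """Expression attribute modeled as the displacement from the
    neutral-face embedding to the expressive-face embedding."""
    return [e - n for e, n in zip(expressive, neutral)]

def siamese_expression_loss(synth, synth_neutral, ref, ref_neutral):
    """Squared distance between the expression displacements of the
    synthesized image and the reference image; zero iff they match."""
    v_s = expression_vector(synth, synth_neutral)
    v_r = expression_vector(ref, ref_neutral)
    return sum((a - b) ** 2 for a, b in zip(v_s, v_r))

# Same displacement (despite different identities) -> zero loss.
assert siamese_expression_loss([2.0, 2.0], [1.0, 1.0],
                               [3.0, 3.0], [2.0, 2.0]) == 0.0
# Different displacements -> positive loss.
assert siamese_expression_loss([2.0, 0.0], [0.0, 0.0],
                               [0.0, 2.0], [0.0, 0.0]) > 0.0
```

Because the neutral embedding carries the identity, subtracting it leaves a representation that is (ideally) identity-free, which is what lets the method enhance expression similarity without any expression labels.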