Multiple Face Analyses through Adversarial Learning
The inherent relations among multiple face analysis tasks, such as landmark
detection, head pose estimation, gender recognition and face attribute
estimation are crucial to boost the performance of each task, but have not been
thoroughly explored since typically these multiple face analysis tasks are
handled as separate tasks. In this paper, we propose a novel deep multi-task
adversarial learning method to localize facial landmarks, estimate head pose and
recognize gender jointly or estimate multiple face attributes simultaneously
through exploring their dependencies from both image representation-level and
label-level. Specifically, the proposed method consists of a deep recognition
network R and a discriminator D. The deep recognition network is used to learn
the shared middle-level image representation and conducts multiple face
analysis tasks simultaneously. Through a multi-task learning mechanism, the
recognition network explores the dependencies among multiple face analysis
tasks, such as facial landmark localization, head pose estimation, gender
recognition and face attribute estimation from image representation-level. The
discriminator is introduced to enforce the distribution of the multiple face
analysis tasks to converge to that inherent in the ground-truth labels. During
training, the recognizer tries to confuse the discriminator, while the
discriminator competes with the recognizer through distinguishing the predicted
label combination from the ground-truth one. Through adversarial learning, we
explore the dependencies among multiple face analysis tasks from label-level.
Experimental results on four benchmark databases, i.e., the AFLW database, the
Multi-PIE database, the CelebA database and the LFWA database, demonstrate the
effectiveness of the proposed method for multiple face analyses.
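The recognizer/discriminator game described in this abstract can be illustrated with a minimal sketch. It assumes the standard GAN binary cross-entropy objective, which the abstract does not specify; the function names and the `adv_weight` parameter are hypothetical and chosen only for illustration.

```python
import math

def bce(p, target):
    """Binary cross-entropy for a single probability p, clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """D is rewarded for scoring ground-truth label combinations near 1
    and predicted label combinations near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def recognizer_loss(supervised_loss, d_fake, adv_weight=0.1):
    """R minimizes its multi-task supervised loss plus an adversarial term
    that is small when D scores R's predictions as real.
    adv_weight is an assumed trade-off hyperparameter."""
    return supervised_loss + adv_weight * bce(d_fake, 1.0)
```

Here `d_real` and `d_fake` stand for the discriminator's output probabilities on a ground-truth and a predicted label combination. In training, the two losses would be minimized in alternation.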
A Survey of Deep Facial Attribute Analysis
Facial attribute analysis has received considerable attention when deep
learning techniques made remarkable breakthroughs in this field over the past
few years. Deep learning based facial attribute analysis consists of two basic
sub-issues: facial attribute estimation (FAE), which recognizes whether facial
attributes are present in given images, and facial attribute manipulation
(FAM), which synthesizes or removes desired facial attributes. In this paper,
we provide a comprehensive survey of deep facial attribute analysis from the
perspectives of both estimation and manipulation. First, we summarize a general
pipeline that deep facial attribute analysis follows, which comprises two
stages: data preprocessing and model construction. Additionally, we introduce
the underlying theories of this two-stage pipeline for both FAE and FAM.
Second, the datasets and performance metrics commonly used in facial attribute
analysis are presented. Third, we create a taxonomy of state-of-the-art methods
and review deep FAE and FAM algorithms in detail. Furthermore, several
additional facial attribute related issues are introduced, as well as relevant
real-world applications. Finally, we discuss possible challenges and promising
future research directions.
Comment: submitted to International Journal of Computer Vision (IJCV)
Unified Adversarial Invariance
We present a unified invariance framework for supervised neural networks that
can induce independence to nuisance factors of data without using any nuisance
annotations, but can additionally use labeled information about biasing factors
to force their removal from the latent embedding for making fair predictions.
Invariance to nuisance is achieved by learning a split representation of data
through competitive training between the prediction task and a reconstruction
task coupled with disentanglement, whereas that to biasing factors is brought
about by penalizing the network if the latent embedding contains any
information about them. We describe an adversarial instantiation of this
framework and provide analysis of its working. Our model outperforms previous
works at inducing invariance to nuisance factors without using any labeled
information about such variables, and achieves state-of-the-art performance at
learning independence to biasing factors in fairness settings.
Comment: In submission to T-PAMI. Some results updated. arXiv admin note:
substantial text overlap with arXiv:1809.1008
Toward Learning a Unified Many-to-Many Mapping for Diverse Image Translation
Image-to-image translation, which translates input images to a different
domain with a learned one-to-one mapping, has achieved impressive success in
recent years. The success of translation mainly relies on the network
architecture to preserve the structural information while modifying the appearance
slightly at the pixel level through adversarial training. Although these
networks are able to learn the mapping, the translated images are predictable
without exception. It is more desirable to diversify them using image-to-image
translation by introducing uncertainties, i.e., the generated images hold
potential for variations in colors and textures in addition to the general
similarity to the input images, and this happens in both the target and source
domains. To this end, we propose a novel generative adversarial network (GAN)
based model, InjectionGAN, to learn a many-to-many mapping. In this model, the
input image is combined with latent variables, which comprise a
domain-specific attribute and unspecific random variations. The domain-specific
attribute indicates the target domain of the translation, while the unspecific
random variations introduce uncertainty into the model. A unified framework is
proposed to regroup these two parts and obtain diverse generations in each
domain. Extensive experiments demonstrate that the diverse generations have
high quality for the challenging image-to-image translation tasks where no
pairing information of the training dataset exists. Both quantitative and
qualitative results prove the superior performance of InjectionGAN over the
state-of-the-art approaches.
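The latent variable described in this abstract, a domain-specific attribute combined with unspecific random variations, can be sketched as follows. The one-hot domain code, the Gaussian noise, and the simple concatenation are all assumptions made for illustration; the abstract does not say how InjectionGAN encodes or combines the two parts, and `build_latent` is a hypothetical name.

```python
import random

def build_latent(domain_index, num_domains, noise_dim, rng):
    """Concatenate an assumed one-hot domain code with assumed Gaussian noise."""
    if not 0 <= domain_index < num_domains:
        raise ValueError("domain_index out of range")
    # Domain-specific part: selects the target domain of the translation.
    domain_code = [1.0 if i == domain_index else 0.0 for i in range(num_domains)]
    # Unspecific part: random variation that makes the output non-deterministic.
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return domain_code + noise

rng = random.Random(0)
z_a = build_latent(domain_index=1, num_domains=3, noise_dim=4, rng=rng)
z_b = build_latent(domain_index=1, num_domains=3, noise_dim=4, rng=rng)
```

The two vectors share the same domain code but differ in their noise part, which is what would allow one input image to map to several outputs in the same target domain.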
Attribute-Guided Sketch Generation
Facial attributes are important since they provide a detailed description and
determine the visual appearance of human faces. In this paper, we aim at
converting a face image to a sketch while simultaneously generating facial
attributes. To this end, we propose a novel Attribute-Guided Sketch Generative
Adversarial Network (ASGAN) which is an end-to-end framework and contains two
pairs of generators and discriminators, one of which is used to generate faces
with attributes while the other one is employed for image-to-sketch
translation. The two generators form a W-shaped network (W-net) and they are
trained jointly with a weight-sharing constraint. Additionally, we also propose
two novel discriminators, the residual one focusing on attribute generation and
the triplex one helping to generate realistic looking sketches. To validate our
model, we have created a new large dataset with 8,804 images, named the
Attribute Face Photo & Sketch (AFPS) dataset which is the first dataset
containing attributes associated to face sketch images. The experimental
results demonstrate that the proposed network (i) generates more
photo-realistic faces with sharper facial attributes than baselines and (ii)
has good generalization capability on different generative tasks.
Comment: 7 pages, 6 figures, accepted to FG 201
Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers
Deep neural networks have been shown to exhibit an intriguing vulnerability
to adversarial input images corrupted with imperceptible perturbations.
However, the majority of adversarial attacks assume global, fine-grained
control over the image pixel space. In this paper, we consider a different
setting: what happens if the adversary could only alter specific attributes of
the input image? These would generate inputs that might be perceptibly
different, but still natural-looking and enough to fool a classifier. We
propose a novel approach to generate such 'semantic' adversarial examples by
optimizing a particular adversarial loss over the range-space of a parametric
conditional generative model. We demonstrate implementations of our attacks on
binary classifiers trained on face images, and show that such natural-looking
semantic adversarial examples exist. We evaluate the effectiveness of our
attack on synthetic and real data, and present detailed comparisons with
existing attack methods. We supplement our empirical results with theoretical
bounds that demonstrate the existence of such parametric adversarial examples.
Comment: Accepted to International Conference on Computer Vision (ICCV) 201
Geometry-Contrastive GAN for Facial Expression Transfer
In this paper, we propose a Geometry-Contrastive Generative Adversarial
Network (GC-GAN) for transferring continuous emotions across different
subjects. Given an input face with a certain emotion and a target facial
expression from another subject, GC-GAN can generate an identity-preserving
face with the target expression. Geometry information is introduced into cGANs
as continuous conditions to guide the generation of facial expressions. In
order to handle the misalignment across different subjects or emotions,
contrastive learning is used to transform the geometry manifold into an embedded
semantic manifold of facial expressions. The embedded geometry is then
injected into the latent space of GANs and controls the emotion generation
effectively. Experimental results demonstrate that our proposed method can be
applied in facial expression transfer even when there are large differences in
facial shapes and expressions between different subjects.
Deep adversarial neural decoding
Here, we present a novel approach to solve the problem of reconstructing
perceived stimuli from brain responses by combining probabilistic inference
with deep learning. Our approach first inverts the linear transformation from
latent features to brain responses with maximum a posteriori estimation and
then inverts the nonlinear transformation from perceived stimuli to latent
features with adversarial training of convolutional neural networks. We test
our approach with a functional magnetic resonance imaging experiment and show
that it can generate state-of-the-art reconstructions of perceived faces from
brain activations.
Comment: Added appendix and updated figure
Physical Adversarial Textures that Fool Visual Object Tracking
We present a system for generating inconspicuous-looking textures that, when
displayed in the physical world as digital or printed posters, cause visual
object tracking systems to become confused. For instance, as a target being
tracked by a robot's camera moves in front of such a poster, our generated
texture makes the tracker lock onto it and allows the target to evade. This
work aims to fool seldom-targeted regression tasks, and in particular compares
diverse optimization strategies: non-targeted, targeted, and a new family of
guided adversarial losses. While we use the Expectation Over Transformation
(EOT) algorithm to generate physical adversaries that fool tracking models when
imaged under diverse conditions, we compare the impacts of different
conditioning variables, including viewpoint, lighting, and appearances, to find
practical attack setups with high resulting adversarial strength and
convergence speed. We further show that textures optimized solely using
simulated scenes can confuse real-world tracking systems.
Comment: Accepted to the International Conference on Computer Vision (ICCV) 201
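The Expectation Over Transformation (EOT) idea named in this abstract is to optimize the expected loss over a distribution of imaging conditions rather than the loss under a single condition. A minimal Monte Carlo sketch follows. The brightness-only transformation, the toy loss, and all function names are assumptions for illustration and are not taken from the paper, which conditions on viewpoint, lighting, and appearance.

```python
import random

def eot_loss(texture, loss_fn, sample_transform, n_samples=100, seed=0):
    """Monte Carlo estimate of E_t[loss_fn(t(texture))] over random transformations t."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        transform = sample_transform(rng)  # draw one imaging condition
        total += loss_fn(transform(texture))
    return total / n_samples

def sample_brightness(rng):
    """Assumed toy transformation: a random brightness gain, clipped at 1.0."""
    gain = rng.uniform(0.5, 1.5)
    return lambda pixels: [min(1.0, gain * p) for p in pixels]

def toy_loss(pixels):
    """Assumed stand-in for an adversarial loss: mean squared distance to 0.8."""
    return sum((p - 0.8) ** 2 for p in pixels) / len(pixels)

estimate = eot_loss([0.2, 0.6, 0.9], toy_loss, sample_brightness)
```

An attack would then update the texture to reduce this averaged loss, so that the result stays effective across the sampled conditions instead of overfitting to one of them.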
Learning Continuous Face Age Progression: A Pyramid of GANs
The two underlying requirements of face age progression, i.e. aging accuracy
and identity permanence, are not well studied in the literature. This paper
presents a novel generative adversarial network based approach to address the
issues in a coupled manner. It separately models the constraints for the
intrinsic subject-specific characteristics and the age-specific facial changes
with respect to the elapsed time, ensuring that the generated faces present
desired aging effects while simultaneously keeping personalized properties
stable. To ensure photo-realistic facial details, high-level age-specific
features conveyed by the synthesized face are estimated by a pyramidal
adversarial discriminator at multiple scales, which simulates the aging effects
with finer details. Further, an adversarial learning scheme is introduced to
simultaneously train a single generator and multiple parallel discriminators,
resulting in smooth continuous face aging sequences. The proposed method is
applicable even in the presence of variations in pose, expression, makeup,
etc., achieving remarkably vivid aging effects. Quantitative evaluations by a
COTS face recognition system demonstrate that the target age distributions are
accurately recovered, and 99.88% and 99.98% of age-progressed faces can be
correctly verified at 0.001% FAR after age transformations of approximately 28
and 23 years elapsed time on the MORPH and CACD databases, respectively. Both
visual and quantitative assessments show that the approach advances the
state-of-the-art.
Comment: arXiv admin note: substantial text overlap with arXiv:1711.1035
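The verification figures quoted at a fixed false accept rate (FAR) can be read as follows: choose the score threshold so that at most the given fraction of impostor pairs is accepted, then report the fraction of genuine pairs above that threshold. The sketch below assumes similarity scores where higher means more similar. The abstract does not say how the COTS system sets its threshold, so the function is a hypothetical illustration, not the paper's evaluation code.

```python
def verification_rate_at_far(genuine_scores, impostor_scores, far):
    """Fraction of genuine pairs accepted when at most `far` of impostor pairs are accepted."""
    ranked = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs allowed to score above the threshold.
    allowed = int(far * len(ranked))
    # Accept only scores strictly above the (allowed + 1)-th highest impostor score.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine_scores if score > threshold)
    return accepted / len(genuine_scores)

genuine = [0.95, 0.5, 0.2, 0.31]
impostor = [0.1, 0.2, 0.3, 0.9]
rate = verification_rate_at_far(genuine, impostor, far=0.25)
```

With these toy scores, one impostor pair is allowed through, so the threshold sits at 0.3 and three of the four genuine pairs are accepted.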