Exploring the Connection between Robust and Generative Models
We offer a study that connects robust discriminative classifiers trained with
adversarial training (AT) with generative modeling in the form of Energy-based
Models (EBM). We do so by decomposing the loss of a discriminative classifier
and showing that the discriminative model is also aware of the input data
density. Though a common assumption is that adversarial points leave the
manifold of the input data, our study finds that, surprisingly, untargeted
adversarial points in the input space are very likely under the generative
model hidden inside the discriminative classifier -- that is, they have low
energy under the EBM. We present two pieces of evidence: untargeted attacks are
even more likely than the natural data, and their likelihood increases as the
attack strength increases.
This allows us to easily detect them and to craft a novel attack, called
High-Energy PGD, that fools the classifier yet has energy similar to the data
set.
Comment: technical report, 6 pages, 6 figures
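The energy view in this abstract can be sketched in a few lines. A minimal
illustration, assuming the standard energy-based reading of a classifier where
the energy is the negative log-sum-exp of the logits (low energy means high
density under the hidden generative model); the logit values here are toy
numbers, not results from the paper:

```python
import numpy as np

def energy(logits):
    # E(x) = -log sum_y exp(f(x)[y]), computed stably by
    # subtracting the max logit before exponentiating.
    m = logits.max(axis=-1, keepdims=True)
    return -(m.squeeze(-1) + np.log(np.exp(logits - m).sum(axis=-1)))

# Toy logits: untargeted attacks tend to inflate logits overall,
# which pushes the energy down (higher likelihood under the EBM).
logits_natural = np.array([2.0, 0.1, -1.0])
logits_attack = np.array([4.0, 3.5, 3.0])
assert energy(logits_attack) < energy(logits_natural)
```

Under this reading, thresholding the energy gives a simple detector for such
attacks, which is what motivates an attack constrained to keep its energy near
that of the natural data.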
On Face Segmentation, Face Swapping, and Face Perception
We show that even when face images are unconstrained and arbitrarily paired,
face swapping between them is actually quite simple. To this end, we make the
following contributions. (a) Instead of tailoring systems for face
segmentation, as others previously proposed, we show that a standard fully
convolutional network (FCN) can achieve remarkably fast and accurate
segmentations, provided that it is trained on a rich enough example set. For
this purpose, we describe novel data collection and generation routines which
provide challenging segmented face examples. (b) We use our segmentations to
enable robust face swapping under unprecedented conditions. (c) Unlike previous
work, our swapping is robust enough to allow for extensive quantitative tests.
To this end, we use the Labeled Faces in the Wild (LFW) benchmark and measure
the effect of intra- and inter-subject face swapping on recognition. We show
that our intra-subject swapped faces remain as recognizable as their sources,
testifying to the effectiveness of our method. In line with well-known
perceptual studies, we show that better face swapping produces less
recognizable inter-subject results. This is the first time this effect has
been quantitatively demonstrated for machine vision systems.
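The compositing step that such a segmentation enables can be sketched very
simply. A hypothetical illustration, not the paper's actual pipeline: given a
soft face mask (values in [0, 1]) predicted by an FCN for a source face
already aligned to the target, the swap reduces to per-pixel blending:

```python
import numpy as np

def swap_faces(source, target, mask):
    # Blend the segmented source face region into the target image;
    # mask is (H, W) in [0, 1], images are (H, W, 3).
    mask = mask[..., None]  # broadcast the mask over color channels
    return mask * source + (1.0 - mask) * target

src = np.full((4, 4, 3), 200.0)  # toy "source" face image
tgt = np.zeros((4, 4, 3))        # toy "target" image
m = np.zeros((4, 4))
m[1:3, 1:3] = 1.0                # toy segmentation: face occupies the center
out = swap_faces(src, tgt, m)
```

The quality of the mask is exactly what the abstract argues matters: a sharper,
more accurate segmentation confines the blend to the face region.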
Extreme 3D Face Reconstruction: Seeing Through Occlusions
Existing single view, 3D face reconstruction methods can produce beautifully
detailed 3D results, but typically only for near frontal, unobstructed
viewpoints. We describe a system designed to provide detailed 3D
reconstructions of faces viewed under extreme conditions, including out-of-plane
rotations and occlusions. Motivated by the concept of bump mapping, we propose
a layered approach which decouples estimation of a global shape from its
mid-level details (e.g., wrinkles). We estimate a coarse 3D face shape which
acts as a foundation and then separately layer this foundation with details
represented by a bump map. We show how a deep convolutional encoder-decoder can
be used to estimate such bump maps. We further show how this approach naturally
extends to generate plausible details for occluded facial regions. We test our
approach and its components extensively, quantitatively demonstrating the
invariance of our estimated facial details. We further provide numerous
qualitative examples showing that our method produces detailed 3D face shapes
in viewing conditions where existing state of the art often breaks down.
Comment: Accepted to CVPR'18. Previously titled: "Extreme 3D Face
Reconstruction: Looking Past Occlusions"
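The layered decomposition the abstract describes can be sketched in its
simplest form. A hypothetical illustration with toy arrays (the paper's actual
representation and estimation networks are more involved): a coarse depth map
acts as the foundation, and a bump map layers mid-level details such as
wrinkles on top of it:

```python
import numpy as np

def layer_details(coarse_depth, bump_map, scale=1.0):
    # Layer mid-level detail (bump map) over the coarse face shape.
    # Because the two are estimated separately, occluded regions can
    # be filled with plausible detail without disturbing the global shape.
    return coarse_depth + scale * bump_map

coarse = np.zeros((3, 3))              # toy coarse face depth map
bump = np.array([[0.0, 0.1, 0.0]] * 3)  # toy wrinkle-like detail pattern
detailed = layer_details(coarse, bump)
```

The decoupling is the point: the global shape and the detail layer can be
estimated, edited, or inpainted independently before being recombined.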
Pooling Faces: Template based Face Recognition with Pooled Face Images
We propose a novel approach to template based face recognition. Our dual goal
is to both increase recognition accuracy and reduce the computational and
storage costs of template matching. To do this, we leverage an approach which
has proven effective in many other domains but, to our knowledge, has never
been fully explored for face images: average pooling of face photos. We show
how (and why!) the space of a template's images can be partitioned and then
pooled based on image quality and head pose, and the effect this has on
accuracy and
template size. We perform extensive tests on the IJB-A and Janus CS2 template
based face identification and verification benchmarks. These show that not only
does our approach outperform the published state of the art despite requiring
far fewer cross-template comparisons, but also, surprisingly, that image
pooling performs on par with deep feature pooling.
Comment: Appeared in the IEEE Computer Society Workshop on Biometrics, IEEE
Conf. on Computer Vision and Pattern Recognition (CVPR), June, 201
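The partition-then-pool idea can be sketched directly. A hypothetical
illustration (the bin labels and pooling rule here are toy stand-ins for the
paper's quality- and pose-based partitioning): group a template's images by
bin, then average-pool each group into a single representative image:

```python
import numpy as np

def pool_template(images, bins):
    # Partition a template's face images by a bin label (e.g. head pose
    # or image quality), then average-pool each bin into one image.
    # This shrinks the template and so reduces cross-template comparisons.
    pooled = {}
    for b in set(bins):
        group = [img for img, lbl in zip(images, bins) if lbl == b]
        pooled[b] = np.mean(group, axis=0)
    return pooled

# Toy template: three 2x2 "images" with pose-bin labels.
imgs = [np.ones((2, 2)) * v for v in (1.0, 3.0, 10.0)]
bins = ["frontal", "frontal", "profile"]
pooled = pool_template(imgs, bins)
```

A template of N images collapses to at most one image per bin, which is where
the storage and matching-cost savings the abstract claims come from.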