Empirically Analyzing the Effect of Dataset Biases on Deep Face Recognition Systems
It is unknown what kinds of biases modern in-the-wild face datasets have
because of their lack of annotation. A direct consequence of this is that total
recognition rates alone provide only limited insight into the generalization
ability of Deep Convolutional Neural Networks (DCNNs). We propose to
empirically study the effect of different types of dataset biases on the
generalization ability of DCNNs. Using synthetically generated face images, we
study the face recognition rate as a function of interpretable parameters such
as face pose and light. The proposed method allows valuable details about the
generalization performance of different DCNN architectures to be observed and
compared. In our experiments, we find that: 1) Indeed, dataset bias has a
significant influence on the generalization performance of DCNNs. 2) DCNNs can
generalize surprisingly well to unseen illumination conditions and large
sampling gaps in the pose variation. 3) Using the presented methodology we
reveal that the VGG-16 architecture outperforms the AlexNet architecture at
face recognition tasks because it generalizes much better to unseen face
poses, although it has significantly more parameters. 4) We uncover a main
limitation of current DCNN architectures, which is the difficulty of
generalizing when different identities do not share the same pose variation.
5) We demonstrate that our findings on synthetic data also apply when learning
from real-world data. Our face image generator is publicly available to enable
the community to benchmark other DCNN architectures.
Comment: Accepted to CVPR 2018 Workshop on Analysis and Modeling of Faces and Gestures (AMFG)
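The abstract's core measurement — recognition rate as a function of an interpretable parameter such as face pose — can be sketched in a few lines. This is an illustrative assumption, not the authors' code: the function name, the use of yaw as the pose parameter, and the binning scheme are all hypothetical.

```python
import numpy as np

def recognition_rate_by_pose(correct, yaw_deg, bin_edges):
    """Hypothetical sketch: bin per-image recognition outcomes by head yaw
    and report the recognition rate within each pose bin.

    correct:   boolean array, True where the identity was recognized
    yaw_deg:   yaw angle of each test face, in degrees
    bin_edges: boundaries separating the pose bins"""
    bin_idx = np.digitize(yaw_deg, bin_edges)  # assign each face to a pose bin
    return {int(i): float(correct[bin_idx == i].mean())
            for i in np.unique(bin_idx)}
```

Plotting these per-bin rates against pose is what exposes sampling gaps in the training distribution.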
Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks
In this paper we propose and investigate a novel nonlinear unit, called the
L_p unit, for deep neural networks. The proposed L_p unit receives signals from
several projections of a subset of units in the layer below and computes a
normalized L_p norm. We notice two interesting interpretations of the L_p
unit. First, the proposed unit can be understood as a generalization of a
number of conventional pooling operators such as average, root-mean-square and
max pooling widely used in, for instance, convolutional neural networks (CNN),
HMAX models and neocognitrons. Furthermore, the L_p unit is, to a certain
degree, similar to the recently proposed maxout unit (Goodfellow et al., 2013)
which achieved the state-of-the-art object recognition results on a number of
benchmark datasets. Secondly, we provide a geometrical interpretation of the
L_p activation function, based on which we argue that the L_p unit is more
efficient at representing complex, nonlinear separating boundaries. Each L_p
unit defines a superelliptic boundary, with its exact shape defined by the
order p. We claim that this makes it possible to model arbitrarily shaped,
curved boundaries more efficiently by combining a few L_p units of different
orders. This insight justifies the need for learning a different order p for
each L_p unit in the model. We empirically evaluate the proposed L_p units on
a number of datasets and show that multilayer perceptrons (MLP) consisting of
L_p units achieve the state-of-the-art results on a number of benchmark
datasets. Furthermore, we evaluate the proposed L_p unit on the recently
proposed deep recurrent neural networks (RNN).
Comment: ECML/PKDD 201
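As a concrete illustration, a single normalized L_p pooling unit of the kind the abstract describes can be sketched as follows. This is a minimal sketch, not the authors' implementation; in particular, the parameterization p = 1 + exp(log_p), which keeps the learnable order p above 1, is an assumption for illustration.

```python
import numpy as np

def lp_unit(x, W, log_p):
    """Sketch of a normalized L_p pooling unit: project the input, then take
    the normalized L_p norm of the projection magnitudes.

    x:     input vector
    W:     (num_projections, dim) projection matrix
    log_p: scalar; p = 1 + exp(log_p) keeps the order p > 1 and learnable"""
    p = 1.0 + np.exp(log_p)
    z = np.abs(W @ x)                       # magnitudes of the projections
    return float(np.mean(z ** p) ** (1.0 / p))  # normalized L_p norm
```

With p = 2 this reduces to root-mean-square pooling, with p near 1 to average (absolute-value) pooling, and as p grows it approaches max pooling — the conventional operators the abstract names as special cases.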
Analyzing and Reducing the Damage of Dataset Bias to Face Recognition With Synthetic Data
It is well known that deep learning approaches to face recognition suffer from various biases in the available training data. In this work, we demonstrate the large potential of synthetic data for analyzing and reducing the negative effects of dataset bias on deep face recognition systems. In particular we explore two complementary application areas for synthetic face images: 1) Using fully annotated synthetic face images we can study the face recognition rate as a function of interpretable parameters such as face pose. This enables us to systematically analyze the effect of different types of dataset biases on the generalization ability of neural network architectures. Our analysis reveals that deeper neural network architectures can generalize better to unseen face poses. Furthermore, our study shows that current neural network architectures cannot disentangle face pose and facial identity, which limits their generalization ability. 2) We pre-train neural networks with large-scale synthetic data that is highly variable in face pose and the number of facial identities. After a subsequent fine-tuning with real-world data, we observe that the damage of dataset bias in the real-world data is largely reduced. Furthermore, we demonstrate that the size of real-world datasets can be reduced by 75% while maintaining competitive face recognition performance. The data and software used in this work are publicly available.
First impressions: A survey on vision-based apparent personality trait analysis
© 2019 IEEE. Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. Over the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have by far been the most considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of methods could have in society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.
SEAN: Image Synthesis with Semantic Region-Adaptive Normalization
We propose semantic region-adaptive normalization (SEAN), a simple but
effective building block for Generative Adversarial Networks conditioned on
segmentation masks that describe the semantic regions in the desired output
image. Using SEAN normalization, we can build a network architecture that can
control the style of each semantic region individually, e.g., we can specify
one style reference image per region. SEAN is better suited to encode,
transfer, and synthesize style than the best previous method in terms of
reconstruction quality, variability, and visual quality. We evaluate SEAN on
multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than
the current state of the art. SEAN also pushes the frontier of interactive
image editing. We can interactively edit images by changing segmentation masks
or the style for any given region. We can also interpolate styles from two
reference images per region.
Comment: Accepted as a CVPR 2020 oral paper. The interactive demo is available at https://youtu.be/0Vbj9xFgoU
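The core idea — modulating normalized features with per-region scale and shift parameters broadcast through the segmentation mask — can be sketched as below. This is a simplified single-feature-map illustration under stated assumptions, not the SEAN authors' implementation: in the actual method the per-region parameters are produced by a style encoder from reference images, whereas here they are passed in directly.

```python
import numpy as np

def region_adaptive_normalize(x, seg, gammas, betas, eps=1e-5):
    """Simplified sketch of region-adaptive normalization: normalize the
    feature map, then modulate every pixel with the scale/shift of its
    semantic region.

    x:      (C, H, W) feature map
    seg:    (H, W) integer region labels from the segmentation mask
    gammas: (num_regions, C) per-region style scales (hypothetical input;
            SEAN derives these from style reference images)
    betas:  (num_regions, C) per-region style shifts"""
    mu = x.mean(axis=(1, 2), keepdims=True)       # per-channel statistics
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    gamma_map = gammas[seg].transpose(2, 0, 1)    # broadcast styles to (C, H, W)
    beta_map = betas[seg].transpose(2, 0, 1)
    return gamma_map * x_norm + beta_map
```

Because each region indexes its own row of `gammas`/`betas`, changing one row restyles exactly one semantic region — which is what enables the per-region interactive editing the abstract describes.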