L2-constrained Softmax Loss for Discriminative Face Verification
In recent years, the performance of face verification systems has
significantly improved using deep convolutional neural networks (DCNNs). A
typical pipeline for face verification includes training a deep network for
subject classification with softmax loss, using the penultimate layer output as
the feature descriptor, and generating a cosine similarity score given a pair
of face images. The softmax loss function does not optimize the features to
have higher similarity score for positive pairs and lower similarity score for
negative pairs, which leads to a performance gap. In this paper, we add an
L2-constraint to the feature descriptors which restricts them to lie on a
hypersphere of a fixed radius. This module can be easily implemented using
existing deep learning frameworks. We show that integrating this simple step in
the training pipeline significantly boosts the performance of face
verification. Specifically, we achieve state-of-the-art results on the
challenging IJB-A dataset, achieving True Accept Rate of 0.909 at False Accept
Rate 0.0001 on the face verification protocol. Additionally, we achieve
state-of-the-art performance on the LFW dataset with an accuracy of 99.78%, and
competitive performance on the YTF dataset with an accuracy of 96.08%.
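The L2-constraint described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the radius name `alpha`, its value of 16.0, and the helper names are assumptions introduced here, since the abstract only states that features are restricted to a hypersphere of fixed radius.

```python
import numpy as np

def l2_constrain(features, alpha=16.0, eps=1e-12):
    """Rescale each row so it lies on a hypersphere of radius alpha.

    alpha is a hypothetical fixed radius; the abstract does not give a value.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return alpha * features / np.maximum(norms, eps)

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def cosine_similarity(a, b):
    """Verification score for a pair of feature descriptors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In training, the constrained features would be passed to the classification layer before the softmax loss; at test time the cosine score is computed on the descriptors directly.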
CosFace: Large Margin Cosine Loss for Deep Face Recognition
Face recognition has made extraordinary progress owing to the advancement of
deep convolutional neural networks (CNNs). The central task of face
recognition, including face verification and identification, involves face
feature discrimination. However, the traditional softmax loss of deep CNNs
usually lacks the power of discrimination. To address this problem, recently
several loss functions such as center loss, large margin softmax loss, and
angular softmax loss have been proposed. All these improved losses share the
same idea: maximizing inter-class variance and minimizing intra-class variance.
In this paper, we propose a novel loss function, namely large margin cosine
loss (LMCL), to realize this idea from a different perspective. More
specifically, we reformulate the softmax loss as a cosine loss by
normalizing both features and weight vectors to remove radial variations, based
on which a cosine margin term is introduced to further maximize the decision
margin in the angular space. As a result, minimum intra-class variance and
maximum inter-class variance are achieved by virtue of normalization and cosine
decision margin maximization. We refer to our model trained with LMCL as
CosFace. Extensive experimental evaluations are conducted on the most popular
public-domain face recognition datasets such as MegaFace Challenge, YouTube
Faces (YTF) and Labeled Faces in the Wild (LFW). We achieve state-of-the-art
performance on these benchmarks, which confirms the effectiveness of our
proposed approach.
Comment: Accepted by CVPR 201
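The reformulation described in this abstract (normalize features and weights, then subtract a margin from the target-class cosine) can be sketched as follows. This is a hedged NumPy illustration under stated assumptions: the function name `lmcl_loss`, the scale `s`, the margin `m`, and their default values are illustrative choices made here, not values taken from the abstract.

```python
import numpy as np

def lmcl_loss(features, weights, labels, s=30.0, m=0.35):
    """Sketch of a large margin cosine loss.

    features: (N, D) array, weights: (D, C) array, labels: (N,) int array.
    s (scale) and m (cosine margin) are assumed, illustrative defaults.
    """
    # Normalizing both sides removes radial variation, so logits are cosines.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = f @ w  # (N, C) cosine between each feature and each class weight

    # Subtract the margin only from the target-class cosine.
    margin = np.zeros_like(cos)
    margin[np.arange(len(labels)), labels] = m
    logits = s * (cos - margin)

    # Standard softmax cross-entropy on the modified logits.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

Because of the normalization, rescaling the input features leaves the loss unchanged, and a positive margin makes the target class harder to satisfy than plain cosine softmax.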
Large Margin Softmax Loss for Speaker Verification
In neural network based speaker verification, speaker embedding is expected
to be discriminative between speakers while the intra-speaker distance should
remain small. A variety of loss functions have been proposed to achieve this
goal. In this paper, we investigate the large margin softmax loss with
different configurations in speaker verification. Ring loss and minimum
hyperspherical energy criterion are introduced to further improve the
performance. Results on VoxCeleb show that our best system outperforms the
baseline approach by 15% in EER, and by 13% and 33% in minDCF08 and minDCF10,
respectively.
Comment: submitted to Interspeech 2019. The code and models have been release
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations
The cosine-based softmax losses and their variants achieve great success in
deep learning based face recognition. However, hyperparameter settings in these
losses have significant influences on the optimization path as well as the
final recognition performance. Manually tuning those hyperparameters heavily
relies on user experience and requires many training tricks. In this paper, we
investigate in depth the effects of two important hyperparameters of
cosine-based softmax losses, the scale parameter and angular margin parameter,
by analyzing how they modulate the predicted classification probability. Based
on this analysis, we propose a novel cosine-based softmax loss, AdaCos, which
is hyperparameter-free and leverages an adaptive scale parameter to
automatically strengthen the training supervisions during the training process.
We apply the proposed AdaCos loss to large-scale face verification and
identification datasets, including LFW, MegaFace, and IJB-C 1:1 Verification.
Our results show that training deep neural networks with the AdaCos loss is
stable and able to achieve high face recognition accuracy. Our method
outperforms state-of-the-art softmax losses on all three datasets.
Comment: CVPR 2019 Ora
von Mises-Fisher Mixture Model-based Deep learning: Application to Face Verification
A number of pattern recognition tasks, e.g., face verification, can
be boiled down to classification or clustering of unit-length directional
feature vectors whose distance can be simply computed by their angle. In this
paper, we propose the von Mises-Fisher (vMF) mixture model as the theoretical
foundation for an effective deep-learning of such directional features and
derive a novel vMF Mixture Loss and its corresponding vMF deep features. The
proposed vMF feature learning achieves the characteristics of discriminative
learning, i.e., compacting the instances of the same class while
increasing the distance of instances from different classes. Moreover, it
subsumes a number of popular loss functions as well as an effective method in
deep learning, namely normalization. We conduct extensive experiments on face
verification using 4 different challenging face datasets, i.e., LFW,
YouTube faces, CACD and IJB-A. Results show the effectiveness and excellent
generalization ability of the proposed approach as it achieves state-of-the-art
results on the LFW, YouTube faces and CACD datasets and competitive results on
the IJB-A dataset.
Comment: Under revie
Feature Incay for Representation Regularization
Softmax loss is widely used in deep neural networks for multi-class
classification, where each class is represented by a weight vector, a sample is
represented as a feature vector, and the feature vector has the largest
projection on the weight vector of the correct category when the model
correctly classifies a sample. To ensure generalization, weight decay that
shrinks the weight norm is often used as a regularizer. Unlike
traditional learning algorithms, where features are fixed and only weights are
tunable, deep learning also tunes the features through representation learning.
Thus, we propose feature incay to also regularize representation learning,
which favors feature vectors with large norm when the samples can be correctly
classified. With the feature incay, feature vectors are further pushed away
from the origin along the direction of their corresponding weight vectors,
which achieves better inter-class separability. In addition, the proposed
feature incay encourages intra-class compactness along the directions of weight
vectors by increasing small feature norms faster than large ones.
Empirical results on MNIST, CIFAR10 and CIFAR100 demonstrate feature incay can
improve the generalization ability
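One regularizer with the properties described in this abstract is a reciprocal squared-norm penalty applied only to correctly classified samples. The sketch below is an assumption made for illustration: the abstract does not state the functional form, and the name `feature_incay_penalty` is hypothetical. Minimizing 1/||f||^2 pushes norms up, and its gradient magnitude is larger for small norms, so small norms grow faster than large ones.

```python
import numpy as np

def feature_incay_penalty(features, logits, labels, eps=1e-12):
    """Assumed reciprocal-norm penalty, averaged over the batch.

    Only samples whose argmax prediction matches the label contribute;
    misclassified samples add zero, so their norms are not inflated.
    """
    correct = logits.argmax(axis=1) == labels
    sq_norms = (features ** 2).sum(axis=1)
    per_sample = np.where(correct, 1.0 / (sq_norms + eps), 0.0)
    return per_sample.mean()
```

In a training loop this term would be added to the softmax loss with a small weight, alongside the usual weight decay on the classifier weights.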
Neural Aggregation Network for Video Face Recognition
This paper presents a Neural Aggregation Network (NAN) for video face
recognition. The network takes a face video or face image set of a person with
a variable number of face images as its input, and produces a compact,
fixed-dimension feature representation for recognition. The whole network is
composed of two modules. The feature embedding module is a deep Convolutional
Neural Network (CNN) which maps each face image to a feature vector. The
aggregation module consists of two attention blocks which adaptively aggregate
the feature vectors to form a single feature inside the convex hull spanned by
them. Due to the attention mechanism, the aggregation is invariant to the image
order. Our NAN is trained with a standard classification or verification loss
without any extra supervision signal, and we found that it automatically learns
to advocate high-quality face images while repelling low-quality ones such as
blurred, occluded and improperly exposed faces. The experiments on IJB-A,
YouTube Face, Celebrity-1000 video face recognition benchmarks show that it
consistently outperforms naive aggregation methods and achieves the
state-of-the-art accuracy.
Comment: Post CVPR2017 version with minor typo fi
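The aggregation step described in this abstract can be illustrated with a single attention block. This is a simplified sketch, not the NAN architecture itself: the abstract describes two attention blocks, while the code below uses one, and the `query` vector and function name are hypothetical. It shows why the output stays inside the convex hull of the inputs and why image order does not matter.

```python
import numpy as np

def attention_aggregate(features, query):
    """Aggregate a variable-size set of feature vectors into one vector.

    features: (K, D) array of per-image features, query: (D,) array.
    query is a hypothetical learned kernel; here it is simply an input.
    Returns the aggregated (D,) feature and the (K,) attention weights.
    """
    scores = features @ query          # one scalar score per image
    scores = scores - scores.max()     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax: >= 0, sums to 1
    # A convex combination of the inputs, hence inside their convex hull.
    return weights @ features, weights
```

Since the softmax weights are attached to individual images and then summed, permuting the rows of `features` permutes the weights with them and leaves the aggregated vector unchanged.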
Representation Learning by Rotating Your Faces
The large pose discrepancy between two face images is one of the fundamental
challenges in automatic face recognition. Conventional approaches to
pose-invariant face recognition either perform face frontalization on, or learn
a pose-invariant representation from, a non-frontal face image. We argue that
it is more desirable to perform both tasks jointly to allow them to leverage
each other. To this end, this paper proposes a Disentangled Representation
learning-Generative Adversarial Network (DR-GAN) with three distinct novelties.
First, the encoder-decoder structure of the generator enables DR-GAN to learn a
representation that is both generative and discriminative, which can be used
for face image synthesis and pose-invariant face recognition. Second, this
representation is explicitly disentangled from other face variations such as
pose, through the pose code provided to the decoder and pose estimation in the
discriminator. Third, DR-GAN can take one or multiple images as the input, and
generate one unified identity representation along with an arbitrary number of
synthetic face images. Extensive quantitative and qualitative evaluations on a
number of controlled and in-the-wild databases demonstrate the superiority of
DR-GAN over the state of the art in both learning representations and rotating
large-pose face images.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI
Face Recognition: From Traditional to Deep Learning Methods
Starting in the seventies, face recognition has become one of the most
researched topics in computer vision and biometrics. Traditional methods based
on hand-crafted features and traditional machine learning techniques have
recently been superseded by deep neural networks trained with very large
datasets. In this paper we provide a comprehensive and up-to-date literature
review of popular face recognition methods including both traditional
(geometry-based, holistic, feature-based and hybrid methods) and deep learning
methods
Noise-Tolerant Paradigm for Training Face Recognition CNNs
Benefiting from large-scale training datasets, deep Convolutional Neural
Networks (CNNs) have achieved impressive results in face recognition (FR).
However, the tremendous scale of these datasets inevitably leads to noisy data,
which reduces the performance of the trained CNN models. Removing wrong
labels from large-scale FR datasets is still very expensive, although some
cleaning approaches have been proposed. By analyzing the whole
process of training CNN models supervised by angular margin based loss (AM-Loss)
functions, we find that the distribution of training samples
implicitly reflects their probability of being clean. Thus, we propose a novel
training paradigm that employs the idea of weighting samples based on the above
probability. Without any prior knowledge of noise, we can train high
performance CNN models with large-scale FR datasets. Experiments demonstrate
the effectiveness of our training paradigm. The code is available at
https://github.com/huangyangyu/NoiseFace