Deep Learning Face Representation by Joint Identification-Verification
The key challenge of face recognition is to develop effective feature
representations for reducing intra-personal variations while enlarging
inter-personal differences. In this paper, we show that it can be well solved
with deep learning and using both face identification and verification signals
as supervision. The Deep IDentification-verification features (DeepID2) are
learned with carefully designed deep convolutional networks. The face
identification task increases the inter-personal variations by drawing DeepID2
extracted from different identities apart, while the face verification task
reduces the intra-personal variations by pulling DeepID2 extracted from the
same identity together, both of which are essential to face recognition. The
learned DeepID2 features can be well generalized to new identities unseen in
the training data. On the challenging LFW dataset, 99.15% face verification
accuracy is achieved. Compared with the best deep learning result on LFW, the
error rate has been significantly reduced by 67%.
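The two supervisory signals described in this abstract can be illustrated with a small sketch. It assumes a softmax cross-entropy for the identification signal and a margin-based contrastive loss on feature pairs for the verification signal. The function names, the weight `lam`, and the `margin` value are illustrative choices, not taken from the abstract.

```python
import numpy as np

def identification_loss(logits, label):
    """Softmax cross-entropy over identity classes for one face."""
    shifted = logits - np.max(logits)              # for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[label]

def verification_loss(f_i, f_j, same_identity, margin=1.0):
    """Contrastive loss on a feature pair (assumed form).

    Same identity: penalize distance, pulling the features together.
    Different identity: penalize only pairs closer than `margin`.
    """
    dist = np.linalg.norm(f_i - f_j)
    if same_identity:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2

def joint_loss(logits_i, label_i, f_i, f_j, same_identity, lam=0.05, margin=1.0):
    """Identification loss plus a weighted verification loss (lam is hypothetical)."""
    return identification_loss(logits_i, label_i) + lam * verification_loss(
        f_i, f_j, same_identity, margin
    )
```

In this sketch the verification term is zero for different-identity pairs that are already farther apart than the margin, so only the hard pairs contribute.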
Unconstrained Face Verification using Deep CNN Features
In this paper, we present an algorithm for unconstrained face verification
based on deep convolutional features and evaluate it on the newly released
IARPA Janus Benchmark A (IJB-A) dataset. The IJB-A dataset includes real-world
unconstrained faces from 500 subjects with full pose and illumination
variations, which are much harder than the traditional Labeled Faces in the Wild
(LFW) and YouTube Faces (YTF) datasets. The deep convolutional neural network
(DCNN) is trained using the CASIA-WebFace dataset. Extensive experiments on the
IJB-A dataset are provided.
Joint Bayesian Gaussian discriminant analysis for speaker verification
State-of-the-art i-vector based speaker verification relies on variants of
Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We
are mainly motivated by the recent work of the joint Bayesian (JB) method,
which was originally proposed for discriminant analysis in face verification. We
apply JB to speaker verification and make three contributions beyond the
original JB. 1) In contrast to the EM iterations with approximated statistics
in the original JB, the EM iterations with exact statistics are employed and
give better performance. 2) We propose to do simultaneous diagonalization (SD)
of the within-class and between-class covariance matrices to achieve efficient
testing, which has broader application scope than the SVD-based efficient
testing method in the original JB. 3) We scrutinize similarities and
differences between various Gaussian PLDAs and JB, complementing the previous
analysis of comparing JB only with Prince-Elder PLDA. Extensive experiments are
conducted on NIST SRE10 core condition 5, empirically validating the
superiority of JB with faster convergence rate and 9-13% EER reduction compared
with state-of-the-art PLDA. Comment: accepted by ICASSP201
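Simultaneous diagonalization of the within-class and between-class covariance matrices, mentioned in contribution 2, can be sketched generically as whitening followed by an eigendecomposition. This is a textbook construction, not necessarily the authors' procedure. It assumes `S_w` is symmetric positive definite and `S_b` is symmetric; the function and variable names are illustrative.

```python
import numpy as np

def simultaneous_diagonalization(S_w, S_b):
    """Return (W, d) with W.T @ S_w @ W = I and W.T @ S_b @ W = diag(d).

    Assumes S_w is symmetric positive definite and S_b is symmetric.
    """
    # Step 1: whiten S_w using its eigendecomposition S_w = U diag(e) U^T.
    evals_w, evecs_w = np.linalg.eigh(S_w)
    whiten = evecs_w / np.sqrt(evals_w)      # equals U @ diag(e ** -0.5)

    # Step 2: express S_b in the whitened space and diagonalize it.
    S_b_white = whiten.T @ S_b @ whiten
    S_b_white = 0.5 * (S_b_white + S_b_white.T)   # remove round-off asymmetry
    evals_b, evecs_b = np.linalg.eigh(S_b_white)

    # The orthogonal rotation in step 2 keeps the whitened S_w equal to I.
    W = whiten @ evecs_b
    return W, evals_b
```

After projecting with `W`, both covariances are diagonal, so any scoring function built from them reduces to per-dimension operations.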
Deep learning architectures for Computer Vision
Deep learning has become part of many state-of-the-art systems in multiple disciplines (especially in computer vision and speech processing). In this thesis, Convolutional Neural Networks are used to solve the problem of recognizing people in images, both for verification and identification. Two different architectures, AlexNet and VGG19, both winners of the ILSVRC, have been fine-tuned and tested with four datasets: Labeled Faces in the Wild, FaceScrub, YouTubeFaces and Google UPC, a dataset generated at the UPC. Finally, with the features extracted from these fine-tuned networks, several verification algorithms have been tested, including Support Vector Machines, Joint Bayesian and the Advanced Joint Bayesian formulation. The results of this work show that an Area Under the Receiver Operating Characteristic curve of 99.6% can be obtained, close to the state-of-the-art performance.
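For reference, the Area Under the ROC curve reported in this abstract can be computed from verification scores as the probability that a genuine pair scores higher than an impostor pair. The sketch below uses that rank-based definition, with ties counted as half. The function name and the score convention (higher means more likely the same person) are assumptions for illustration.

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative), ties count half.

    scores: similarity scores, higher = more likely a genuine pair (assumed).
    labels: 1/True for genuine pairs, 0/False for impostor pairs.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

The pairwise comparison is quadratic in the number of pairs, which is fine for a small check; a sort-based version would be preferable for large evaluation sets.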
When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition
Deep learning, in particular Convolutional Neural Network (CNN), has achieved
promising results in face recognition recently. However, it remains an open
question: why CNNs work well and how to design a 'good' architecture. The
existing works tend to focus on reporting CNN architectures that work well for
face recognition rather than investigating the reasons. In this work, we conduct
an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a
common ground to make our work easily reproducible. Specifically, we use the public
database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing
CNNs trained on private databases. We propose three CNN architectures which are
the first reported architectures trained using LFW data. This paper
quantitatively compares the architectures of CNNs and evaluates the effect of
different implementation choices. We identify several useful properties of
CNN-FRS. For instance, the dimensionality of the learned features can be
significantly reduced without adverse effect on face recognition accuracy. In
addition, a traditional metric learning method exploiting CNN-learned features is
evaluated. Experiments show that two factors crucial to good CNN-FRS performance are
the fusion of multiple CNNs and metric learning. To make our work reproducible,
source code and models will be made publicly available. Comment: 7 pages, 4 figures, 7 tables