Neural Class-Specific Regression for face verification
Face verification is a problem approached in the literature mainly using
nonlinear class-specific subspace learning techniques. While it has been shown
that kernel-based Class-Specific Discriminant Analysis is able to provide
excellent performance in small- and medium-scale face verification problems,
its application in today's large-scale problems is difficult due to its
training space and computational requirements. In this paper, generalizing our
previous work on kernel-based class-specific discriminant analysis, we show
that class-specific subspace learning can be cast as a regression problem. This
allows us to derive linear, (reduced) kernel and neural network-based
class-specific discriminant analysis methods using efficient batch and/or
iterative training schemes, suited for large-scale learning problems. We test
the performance of these methods on two datasets describing medium- and
large-scale face verification problems.
Comment: 9 pages, 4 figures
Toward Open-Set Face Recognition
Much research has been conducted on both face identification and face
verification, with greater focus on the latter. Research on face identification
has mostly focused on using closed-set protocols, which assume that all probe
images used in evaluation contain identities of subjects that are enrolled in
the gallery. Real systems, however, where only a fraction of probe sample
identities are enrolled in the gallery, cannot make this closed-set assumption.
Instead, they must assume an open set of probe samples and be able to
reject/ignore those that correspond to unknown identities. In this paper, we
address the widespread misconception that thresholding verification-like scores
is a good way to solve the open-set face identification problem, by formulating
an open-set face identification protocol and evaluating different strategies
for assessing similarity. Our open-set identification protocol is based on the
canonical Labeled Faces in the Wild (LFW) dataset. In addition to the known
identities, we introduce the concepts of known unknowns (known, but
uninteresting persons) and unknown unknowns (people never seen before) to the
biometric community. We compare three algorithms for assessing similarity in a
deep feature space under an open-set protocol: thresholded verification-like
scores, linear discriminant analysis (LDA) scores, and extreme value machine
(EVM) probabilities. Our findings suggest that thresholding EVM probabilities,
which are open-set by design, outperforms thresholding verification-like
scores.
Comment: Accepted for publication in CVPR 2017 Biometrics Workshop
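The baseline strategy named in this abstract, thresholding verification-like scores, can be illustrated with a minimal sketch. It assumes cosine similarity as the score and one template vector per enrolled identity; the function names, the toy gallery, and the threshold of 0.5 are hypothetical and not taken from the paper. It does not implement LDA or the EVM.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_open_set(probe, gallery, threshold):
    """Return (identity, score) for the best gallery match.

    The identity is None when the best score falls below the
    threshold, i.e. the probe is rejected as an unknown identity.
    """
    best_id, best_score = None, -1.0
    for identity, template in gallery.items():
        score = cosine(probe, template)
        if score > best_score:
            best_id, best_score = identity, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Hypothetical 3-dimensional "deep features" for two enrolled subjects.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
known_probe = [0.9, 0.1, 0.0]    # close to alice's template
unknown_probe = [0.0, 0.0, 1.0]  # orthogonal to every template

print(identify_open_set(known_probe, gallery, threshold=0.5))
print(identify_open_set(unknown_probe, gallery, threshold=0.5))
```

In this toy setup the first probe is accepted as "alice" and the second is rejected. The abstract's finding is that such a fixed threshold on verification-like scores is outperformed by thresholding EVM probabilities.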
Kinship Verification from Videos using Spatio-Temporal Texture Features and Deep Learning
Automatic kinship verification using facial images is a relatively new and
challenging research problem in computer vision. It consists of automatically
predicting whether two persons have a biological kin relation by examining
their facial attributes. While most of the existing works extract shallow
handcrafted features from still face images, we approach this problem from
a spatio-temporal point of view and explore the use of both shallow texture
features and deep features for characterizing faces. Promising results,
especially those of deep features, are obtained on the benchmark UvA-NEMO Smile
database. Our extensive experiments also show the superiority of using videos
over still images, hence pointing out the important role of facial dynamics in
kinship verification. Furthermore, the fusion of the two types of features
(i.e. shallow spatio-temporal texture features and deep features) shows
significant performance improvements compared to state-of-the-art methods.
Comment: 7 pages
VoxCeleb2: Deep Speaker Recognition
The objective of this paper is speaker recognition under noisy and
unconstrained conditions.
We make two key contributions. First, we introduce a very large-scale
audio-visual speaker recognition dataset collected from open-source media.
Using a fully automated pipeline, we curate VoxCeleb2 which contains over a
million utterances from over 6,000 speakers. This is several times larger than
any publicly available speaker recognition dataset.
Second, we develop and compare Convolutional Neural Network (CNN) models and
training strategies that can effectively recognise identities from voice under
various conditions. The models trained on the VoxCeleb2 dataset surpass the
performance of previous works on a benchmark dataset by a significant margin.
Comment: To appear in Interspeech 2018. The audio-visual dataset can be
downloaded from http://www.robots.ox.ac.uk/~vgg/data/voxceleb2 .
1806.05622v2: minor fixes; 5 pages