19,411 research outputs found
FACE RECOGNITION AND VERIFICATION IN UNCONSTRAINED ENVIRIONMENTS
Face recognition has been a long standing problem in computer vision. General
face recognition is challenging because of large appearance variability due to
factors including pose, ambient lighting, expression, size of the face, age, and distance
from the camera, etc. There are very accurate techniques to perform face
recognition in controlled environments, especially when large numbers of samples
are available for each face (individual). However, face identification under uncontrolled(
unconstrained) environments or with limited training data is still an unsolved
problem. There are two face recognition tasks: face identification (who is who in
a probe face set, given a gallery face set) and face verification (same or not, given
two faces). In this work, we study both face identification and verification in unconstrained
environments.
Firstly, we propose a face verification framework that combines Partial Least
Squares (PLS) and the One-Shot similarity model[1]. The idea is to describe a
face with a large feature set combining shape, texture and color information. PLS
regression is applied to perform multi-channel feature weighting on this large feature
set. Finally the PLS regression is used to compute the similarity score of an image
pair by One-Shot learning (using a fixed negative set).
Secondly, we study face identification with image sets, where the gallery and
probe are sets of face images of an individual. We model a face set by its covariance
matrix (COV) which is a natural 2nd-order statistic of a sample set.By exploring an
efficient metric for the SPD matrices, i.e., Log-Euclidean Distance (LED), we derive
a kernel function that explicitly maps the covariance matrix from the Riemannian
manifold to Euclidean space. Then, discriminative learning is performed on the
COV manifold: the learning aims to maximize the between-class COV distance and
minimize the within-class COV distance.
Sparse representation and dictionary learning have been widely used in face
recognition, especially when large numbers of samples are available for each face
(individual). Sparse coding is promising since it provides a more stable and discriminative
face representation. In the last part of our work, we explore sparse
coding and dictionary learning for face verification application. More specifically,
in one approach, we apply sparse representations to face verification in two ways
via a fix reference set as dictionary. In the other approach, we propose a dictionary
learning framework with explicit pairwise constraints, which unifies the discriminative
dictionary learning for pair matching (face verification) and classification (face
recognition) problems
A Discriminatively Learned CNN Embedding for Person Re-identification
We revisit two popular convolutional neural networks (CNN) in person
re-identification (re-ID), i.e, verification and classification models. The two
models have their respective advantages and limitations due to different loss
functions. In this paper, we shed light on how to combine the two models to
learn more discriminative pedestrian descriptors. Specifically, we propose a
new siamese network that simultaneously computes identification loss and
verification loss. Given a pair of training images, the network predicts the
identities of the two images and whether they belong to the same identity. Our
network learns a discriminative embedding and a similarity measurement at the
same time, thus making full usage of the annotations. Albeit simple, the
learned embedding improves the state-of-the-art performance on two public
person re-ID benchmarks. Further, we show our architecture can also be applied
in image retrieval
Comparator Networks
The objective of this work is set-based verification, e.g. to decide if two
sets of images of a face are of the same person or not. The traditional
approach to this problem is to learn to generate a feature vector per image,
aggregate them into one vector to represent the set, and then compute the
cosine similarity between sets. Instead, we design a neural network
architecture that can directly learn set-wise verification. Our contributions
are: (i) We propose a Deep Comparator Network (DCN) that can ingest a pair of
sets (each may contain a variable number of images) as inputs, and compute a
similarity between the pair--this involves attending to multiple discriminative
local regions (landmarks), and comparing local descriptors between pairs of
faces; (ii) To encourage high-quality representations for each set, internal
competition is introduced for recalibration based on the landmark score; (iii)
Inspired by image retrieval, a novel hard sample mining regime is proposed to
control the sampling process, such that the DCN is complementary to the
standard image classification models. Evaluations on the IARPA Janus face
recognition benchmarks show that the comparator networks outperform the
previous state-of-the-art results by a large margin.Comment: To appear in ECCV 201
Template Adaptation for Face Verification and Identification
Face recognition performance evaluation has traditionally focused on
one-to-one verification, popularized by the Labeled Faces in the Wild dataset
for imagery and the YouTubeFaces dataset for videos. In contrast, the newly
released IJB-A face recognition dataset unifies evaluation of one-to-many face
identification with one-to-one face verification over templates, or sets of
imagery and videos for a subject. In this paper, we study the problem of
template adaptation, a form of transfer learning to the set of media in a
template. Extensive performance evaluations on IJB-A show a surprising result,
that perhaps the simplest method of template adaptation, combining deep
convolutional network features with template specific linear SVMs, outperforms
the state-of-the-art by a wide margin. We study the effects of template size,
negative set construction and classifier fusion on performance, then compare
template adaptation to convolutional networks with metric learning, 2D and 3D
alignment. Our unexpected conclusion is that these other methods, when combined
with template adaptation, all achieve nearly the same top performance on IJB-A
for template-based face verification and identification
A Proximity-Aware Hierarchical Clustering of Faces
In this paper, we propose an unsupervised face clustering algorithm called
"Proximity-Aware Hierarchical Clustering" (PAHC) that exploits the local
structure of deep representations. In the proposed method, a similarity measure
between deep features is computed by evaluating linear SVM margins. SVMs are
trained using nearest neighbors of sample data, and thus do not require any
external training data. Clusters are then formed by thresholding the similarity
scores. We evaluate the clustering performance using three challenging
unconstrained face datasets, including Celebrity in Frontal-Profile (CFP),
IARPA JANUS Benchmark A (IJB-A), and JANUS Challenge Set 3 (JANUS CS3)
datasets. Experimental results demonstrate that the proposed approach can
achieve significant improvements over state-of-the-art methods. Moreover, we
also show that the proposed clustering algorithm can be applied to curate a set
of large-scale and noisy training dataset while maintaining sufficient amount
of images and their variations due to nuisance factors. The face verification
performance on JANUS CS3 improves significantly by finetuning a DCNN model with
the curated MS-Celeb-1M dataset which contains over three million face images
Face Identification and Clustering
In this thesis, we study two problems based on clustering algorithms. In the
first problem, we study the role of visual attributes using an agglomerative
clustering algorithm to whittle down the search area where the number of
classes is high to improve the performance of clustering. We observe that as we
add more attributes, the clustering performance increases overall. In the
second problem, we study the role of clustering in aggregating templates in a
1:N open set protocol using multi-shot video as a probe. We observe that by
increasing the number of clusters, the performance increases with respect to
the baseline and reaches a peak, after which increasing the number of clusters
causes the performance to degrade. Experiments are conducted using recently
introduced unconstrained IARPA Janus IJB-A, CS2, and CS3 face recognition
datasets
Multi-shot Pedestrian Re-identification via Sequential Decision Making
Multi-shot pedestrian re-identification problem is at the core of
surveillance video analysis. It matches two tracks of pedestrians from
different cameras. In contrary to existing works that aggregate single frames
features by time series model such as recurrent neural network, in this paper,
we propose an interpretable reinforcement learning based approach to this
problem. Particularly, we train an agent to verify a pair of images at each
time. The agent could choose to output the result (same or different) or
request another pair of images to verify (unsure). By this way, our model
implicitly learns the difficulty of image pairs, and postpone the decision when
the model does not accumulate enough evidence. Moreover, by adjusting the
reward for unsure action, we can easily trade off between speed and accuracy.
In three open benchmarks, our method are competitive with the state-of-the-art
methods while only using 3% to 6% images. These promising results demonstrate
that our method is favorable in both efficiency and performance
- …