GhostVLAD for set-based face recognition
The objective of this paper is to learn a compact representation of image
sets for template-based face recognition. We make the following contributions:
first, we propose a network architecture which aggregates and embeds the face
descriptors produced by deep convolutional neural networks into a compact
fixed-length representation. This compact representation requires minimal
memory storage and enables efficient similarity computation. Second, we propose
a novel GhostVLAD layer that includes ghost clusters, which do not
contribute to the aggregation. We show that a quality weighting on the input
faces emerges automatically such that informative images contribute more than
those with low quality, and that the ghost clusters enhance the network's
ability to deal with poor quality images. Third, we explore how input feature
dimension, number of clusters and different training techniques affect the
recognition performance. Given this analysis, we train a network that far
exceeds the state-of-the-art on the IJB-B face recognition dataset. This is
currently one of the most challenging public benchmarks, and we surpass the
state-of-the-art on both the identification and verification protocols.
Comment: Accepted by ACCV 201
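The ghost-cluster aggregation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' network: soft assignment is taken here as a softmax over negative squared distances to fixed cluster centres (in the paper the layer is learned end-to-end), the final step is a single L2 normalisation, and the names `ghostvlad`, `centers`, `num_ghost` and `alpha` are hypothetical.

```python
import math


def ghostvlad(descriptors, centers, num_ghost, alpha=1.0):
    """Aggregate a set of descriptors into one fixed-length vector.

    `centers` lists the real cluster centres first, followed by
    `num_ghost` ghost centres. Assumption: assignment weights come from
    a softmax over negative squared distances, not a learned layer.
    """
    num_real = len(centers) - num_ghost
    dim = len(centers[0])
    agg = [[0.0] * dim for _ in range(num_real)]

    for x in descriptors:
        # Soft-assign the descriptor over ALL clusters (real and ghost).
        logits = [
            -alpha * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            for c in centers
        ]
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        weights = [e / total for e in exps]

        # Accumulate weighted residuals for the real clusters only;
        # whatever weight went to ghost clusters is simply discarded.
        for k in range(num_real):
            for d in range(dim):
                agg[k][d] += weights[k] * (x[d] - centers[k][d])

    # Flatten and L2-normalise to obtain the fixed-length representation.
    flat = [v for row in agg for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

With one real centre at `[0, 0]` and one ghost centre at `[10, 10]`, a descriptor near the ghost centre receives almost no weight on the real cluster, so the output is dominated by descriptors that lie near the real centre. The output length is always the number of real clusters times the descriptor dimension, regardless of how many descriptors are in the set.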
Solving general elliptical mixture models through an approximate Wasserstein manifold
We address the estimation problem for general finite mixture models, with a
particular focus on the elliptical mixture models (EMMs). Compared to the
widely adopted Kullback-Leibler divergence, we show that the Wasserstein
distance provides a more desirable optimisation space. We thus provide a stable
solution to the EMMs that is both robust to initialisations and reaches a
superior optimum by adaptively optimising along a manifold of an approximate
Wasserstein distance. To this end, we first provide a unifying account of
computable and identifiable EMMs, which serves as a basis to rigorously address
the underpinning optimisation problem. Due to a probability constraint, solving
this problem is extremely cumbersome and unstable, especially under the
Wasserstein distance. To relieve this issue, we introduce an efficient
optimisation method on a statistical manifold defined under an approximate
Wasserstein distance, which allows for explicit metrics and computable
operations, thus significantly stabilising and improving the EMM estimation. We
further propose an adaptive method to accelerate the convergence. Experimental
results demonstrate the excellent performance of the proposed EMM solver.
Comment: This work has been accepted to AAAI 2020. Note that this version also
corrects a small error in Equation (16) in proo
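As a small illustration of why the choice of distance matters, the sketch below compares the two quantities for one-dimensional Gaussians, where both have well-known closed forms. It is not the paper's solver and does not cover general elliptical distributions; the function names are hypothetical, and the behaviour shown is only one commonly cited contrast between the two distances, not necessarily the argument the authors make.

```python
import math


def w2_squared_gaussian_1d(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2)."""
    return (m1 - m2) ** 2 + (s1 - s2) ** 2


def kl_gaussian_1d(m1, s1, m2, s2):
    """KL divergence KL(N(m1, s1^2) || N(m2, s2^2)); requires s1, s2 > 0."""
    return (
        math.log(s2 / s1)
        + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2)
        - 0.5
    )
```

When the second Gaussian becomes very narrow (for example `s2 = 1e-3` with `s1 = 1`), the KL divergence grows without bound while the squared Wasserstein distance stays below 1, which gives a rough sense of how the two can produce very different optimisation landscapes.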
Deep Heterogeneous Hashing for Face Video Retrieval
Retrieving videos of a particular person with a face image as the query via
hashing techniques has many important applications. While face images are
typically represented as vectors in Euclidean space, characterizing face videos
with robust set modeling techniques (e.g. covariance matrices as exploited
in this study, which reside on a Riemannian manifold) has recently shown
appealing advantages. This results in a thorny heterogeneous-space
matching problem. Moreover, hashing with handcrafted features as done in many
existing works is clearly inadequate to achieve desirable performance for this
task. To address such problems, we present an end-to-end Deep Heterogeneous
Hashing (DHH) method that integrates three stages including image feature
learning, video modeling, and heterogeneous hashing in a single framework, to
learn unified binary codes for both face images and videos. To tackle the key
challenge of hashing on the manifold, a well-studied Riemannian kernel mapping
is employed to project data (i.e. covariance matrices) into Euclidean space,
which makes it possible to embed the two heterogeneous representations into a common
Hamming space, where both intra-space discriminability and inter-space
compatibility are considered. To perform network optimization, the gradient of
the kernel mapping is innovatively derived via structured matrix
backpropagation in a theoretically principled way. Experiments on three
challenging datasets show that our method achieves quite competitive
performance compared with existing hashing methods.
Comment: 14 pages, 17 figures, 4 tables, accepted by IEEE Transactions on
Image Processing (TIP) 201
Discriminant analysis on Riemannian manifold of Gaussian distributions for face recognition with image sets
This paper presents a method named Discriminant Analysis on Riemannian manifold of Gaussian distributions (DARG) to solve the problem of face recognition with image sets. Our goal is to capture the underlying data distribution in each set and thus facilitate more robust classification. To this end, we represent each image set as a Gaussian Mixture Model (GMM) comprising a number of Gaussian components with prior probabilities and seek to discriminate Gaussian components from different classes. In light of information geometry, the Gaussians lie on a specific Riemannian manifold. To encode such Riemannian geometry properly, we investigate several distances between Gaussians and further derive a series of provably positive definite probabilistic kernels. Through these kernels, a weighted Kernel Discriminant Analysis is finally devised which treats the Gaussians in GMMs as samples and their prior probabilities as sample weights. The proposed method is evaluated by face identification and verification tasks on four of the most challenging and largest databases, YouTube Celebrities, COX, YouTube Face DB and Point-and-Shoot Challenge, to demonstrate it