3 research outputs found
Deep Heterogeneous Hashing for Face Video Retrieval
Retrieving videos of a particular person with face image as a query via
hashing technique has many important applications. While face images are
typically represented as vectors in Euclidean space, characterizing face videos
with some robust set modeling techniques (e.g. covariance matrices as exploited
in this study, which reside on Riemannian manifold), has recently shown
appealing advantages. This hence results in a thorny heterogeneous spaces
matching problem. Moreover, hashing with handcrafted features as done in many
existing works is clearly inadequate to achieve desirable performance for this
task. To address such problems, we present an end-to-end Deep Heterogeneous
Hashing (DHH) method that integrates three stages including image feature
learning, video modeling, and heterogeneous hashing in a single framework, to
learn unified binary codes for both face images and videos. To tackle the key
challenge of hashing on the manifold, a well-studied Riemannian kernel mapping
is employed to project data (i.e. covariance matrices) into Euclidean space and
thus enables to embed the two heterogeneous representations into a common
Hamming space, where both intra-space discriminability and inter-space
compatibility are considered. To perform network optimization, the gradient of
the kernel mapping is innovatively derived via structured matrix
backpropagation in a theoretically principled way. Experiments on three
challenging datasets show that our method achieves quite competitive
performance compared with existing hashing methods.Comment: 14 pages, 17 figures, 4 tables, accepted by IEEE Transactions on
Image Processing (TIP) 201
Face Video Retrieval via Deep Learning of Binary Hash Representations
Retrieving faces from large mess of videos is an attractive research topic with wide range of applications. Its challenging problems are large intra-class variations, and tremendous time and space complexity. In this paper, we develop a new deep convolutional neural network (deep CNN) to learn discriminative and compact binary representations of faces for face video retrieval. The network integrates feature extraction and hash learning into a unified optimization framework for the optimal compatibility of feature extractor and hash functions. In order to better initialize the network, the low-rank discriminative binary hashing is proposed to pre-learn hash functions during the training procedure. Our method achieves excellent performances on two challenging TV-Series datasets