5 research outputs found

    Deep Heterogeneous Hashing for Face Video Retrieval

    Retrieving videos of a particular person with a face image as the query via hashing techniques has many important applications. While face images are typically represented as vectors in Euclidean space, characterizing face videos with robust set-modeling techniques (e.g., the covariance matrices exploited in this study, which reside on a Riemannian manifold) has recently shown appealing advantages. This, however, results in a thorny matching problem between heterogeneous spaces. Moreover, hashing with handcrafted features, as done in many existing works, is clearly inadequate to achieve desirable performance for this task. To address these problems, we present an end-to-end Deep Heterogeneous Hashing (DHH) method that integrates three stages, i.e., image feature learning, video modeling, and heterogeneous hashing, in a single framework to learn unified binary codes for both face images and videos. To tackle the key challenge of hashing on the manifold, a well-studied Riemannian kernel mapping is employed to project the data (i.e., covariance matrices) into Euclidean space, which makes it possible to embed the two heterogeneous representations into a common Hamming space where both intra-space discriminability and inter-space compatibility are considered. To perform network optimization, the gradient of the kernel mapping is derived via structured matrix backpropagation in a theoretically principled way. Experiments on three challenging datasets show that our method achieves quite competitive performance compared with existing hashing methods.
    Comment: 14 pages, 17 figures, 4 tables, accepted by IEEE Transactions on Image Processing (TIP) 201
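    Below is a minimal sketch (Python with NumPy/SciPy, not the authors' released code) of the set-modeling and kernel-mapping steps the abstract describes: a face video is summarized by the covariance matrix of its frame features, which lies on the manifold of symmetric positive definite matrices, and a matrix-logarithm (log-Euclidean) mapping projects it into Euclidean space so it can be hashed alongside single-image feature vectors. The feature dimension, the regularizer eps, and the upper-triangle vectorization are illustrative assumptions, not the paper's exact configuration.
```python
# Hypothetical sketch of covariance set modeling + log-Euclidean projection,
# not the DHH implementation.
import numpy as np
from scipy.linalg import logm

def video_covariance(frame_features: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    """frame_features: (num_frames, d) array of per-frame features."""
    cov = np.cov(frame_features, rowvar=False)   # (d, d) covariance of the frame set
    return cov + eps * np.eye(cov.shape[0])      # regularize to keep it strictly positive definite

def log_euclidean_embedding(cov: np.ndarray) -> np.ndarray:
    """Project the SPD matrix into Euclidean space via the matrix logarithm."""
    log_cov = logm(cov).real                     # symmetric (d, d) matrix
    iu = np.triu_indices(log_cov.shape[0])       # vectorize the upper triangle
    return log_cov[iu]

# Example: a 50-frame video with 128-d frame features becomes one Euclidean vector.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 128))
video_vec = log_euclidean_embedding(video_covariance(frames))
print(video_vec.shape)   # (8256,) = (128 * 129 / 2,)
```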

    Face video retrieval with image query via hashing across Euclidean space and Riemannian manifold

    Retrieving videos of a specific person given his/her face image as the query is becoming increasingly appealing for applications like smart movie fast-forwarding and suspect searching. It also forms an interesting but challenging computer vision task, as the visual data to be matched, i.e., a still image and a video clip, are usually represented quite differently. Typically, a face image is represented as a point (i.e., a vector) in Euclidean space, while a video clip is modeled as a point (e.g., a covariance matrix) on some particular Riemannian manifold, in light of the recent promising success of such modeling. This incurs a new hashing-based retrieval problem of matching two heterogeneous representations, residing respectively in Euclidean space and on a Riemannian manifold. This work makes the first attempt to embed the two heterogeneous spaces into a common discriminant Hamming space. Specifically, we propose Hashing across Euclidean space and Riemannian manifold (HER), a unified framework that first embeds the two spaces into corresponding reproducing kernel Hilbert spaces and then iteratively optimizes the intra- and inter-space Hamming distances in a max-margin framework to learn the hash functions for the two spaces. Extensive experiments demonstrate the impressive superiority of our method over state-of-the-art competitive hash learning methods.
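    The sketch below (again Python, hypothetical rather than the HER implementation) illustrates the retrieval side once both spaces have been embedded into a common Hamming space: sign-thresholded linear hash functions produce binary codes for an image query and for kernel-embedded video representations, and ranking is by Hamming distance. The projection matrices W_img and W_vid stand in for the learned hash functions and are random placeholders here.
```python
# Hypothetical sketch of cross-space hashing retrieval; the hash functions
# would be learned (e.g., in a max-margin framework), not random.
import numpy as np

def hash_codes(features: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Binarize linearly projected features into {0, 1} codes."""
    return (features @ W > 0).astype(np.uint8)

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Return database indices sorted by Hamming distance to the query."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists)

rng = np.random.default_rng(1)
d_img, d_vid, bits = 128, 8256, 64
W_img = rng.normal(size=(d_img, bits))   # placeholder for the learned image-space hash function
W_vid = rng.normal(size=(d_vid, bits))   # placeholder for the learned video-space hash function

query = hash_codes(rng.normal(size=(1, d_img)), W_img)      # one face-image query code
videos = hash_codes(rng.normal(size=(100, d_vid)), W_vid)   # 100 face-video codes
print(hamming_rank(query[0], videos)[:5])                   # indices of the top-5 retrieved videos
```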