Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
In this paper, we propose a novel deep generative approach to cross-modal
retrieval to learn hash functions in the absence of paired training samples
through the cycle consistency loss. Our proposed approach employs an adversarial
training scheme to learn a pair of hash functions enabling translation between
modalities while assuming the underlying semantic relationship. To induce the
hash codes with the semantics of each input-output pair, a cycle consistency
loss is further imposed on top of the adversarial training to strengthen the
correlations between inputs and corresponding outputs. Our approach is
generative: it learns hash functions such that the learned hash codes maximally
correlate each input-output correspondence and can also regenerate the inputs
so as to minimize the information loss. The learning to hash embedding is thus performed
to jointly optimize the parameters of the hash functions across modalities as
well as the associated generative models. Extensive experiments on a variety of
large-scale cross-modal data sets demonstrate that our proposed method achieves
better retrieval results than state-of-the-art methods.
Comment: To appear in IEEE Trans. Image Processing. arXiv admin note: text
overlap with arXiv:1703.10593 by other authors
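The cycle consistency idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's model: the linear maps `F`, `G_exact`, and `G_poor` and the L1 form of the loss are assumptions chosen only to show that translating an input to the other modality and back should reproduce the input.

```python
# Toy cycle-consistency loss. The maps below are hypothetical stand-ins
# for the learned networks, which the abstract does not specify.

def apply_linear(weights, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def cycle_consistency_loss(forward, backward, batch):
    """Mean L1 distance between each input x and backward(forward(x))."""
    total = 0.0
    for x in batch:
        reconstructed = apply_linear(backward, apply_linear(forward, x))
        total += sum(abs(a - b) for a, b in zip(x, reconstructed))
    return total / len(batch)

# Hypothetical "modality A -> modality B" map and two candidate reverse maps.
F = [[2.0, 0.0], [0.0, 4.0]]
G_exact = [[0.5, 0.0], [0.0, 0.25]]  # exact inverse of F
G_poor = [[1.0, 0.0], [0.0, 1.0]]    # identity, does not undo F

batch = [[1.0, 2.0], [3.0, -1.0]]
loss_exact = cycle_consistency_loss(F, G_exact, batch)
loss_poor = cycle_consistency_loss(F, G_poor, batch)
```

In this sketch the reverse map that undoes `F` incurs no loss, while the mismatched one is penalized, which is the pressure that ties each output back to its input.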
Zero-Shot Hashing via Transferring Supervised Knowledge
Hashing has shown its efficiency and effectiveness in facilitating
large-scale multimedia applications. Supervised knowledge (e.g., semantic labels
or pair-wise relationships) associated with data is capable of significantly
improving the quality of hash codes and hash functions. However, confronted
with the rapid growth of newly-emerging concepts and multimedia data on the
Web, existing supervised hashing approaches may easily suffer from the scarcity
and validity of supervised information due to the expensive cost of manual
labelling. In this paper, we propose a novel hashing scheme, termed
zero-shot hashing (ZSH), which compresses images of "unseen" categories
to binary codes with hash functions learned from limited training data of
"seen" categories. Specifically, we project independent data labels (i.e.,
0/1-form label vectors) into a semantic embedding space, where semantic
relationships among all the labels can be precisely characterized and thus seen
supervised knowledge can be transferred to unseen classes. Moreover, in order
to cope with the semantic shift problem, we rotate the embedded space to more
suitably align the embedded semantics with the low-level visual feature space,
thereby alleviating the influence of the semantic gap. In the meantime, to exert
positive effects on learning high-quality hash functions, we further propose to
preserve the local structural property and the discrete nature of binary codes.
Besides, we develop an efficient alternating algorithm to solve the ZSH model.
Extensive experiments conducted on various real-life datasets show the superior
zero-shot image retrieval performance of ZSH as compared to several
state-of-the-art hashing methods.
Comment: 11 pages
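The reason for moving from 0/1-form label vectors to a semantic embedding space can be shown with a small sketch. The class names and the 2-D embedding values below are made up for illustration and are not taken from the paper. One-hot labels are mutually orthogonal, so they say nothing about how an unseen class relates to the seen ones, whereas embedded labels do.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# 0/1-form label vectors: every pair of distinct classes is orthogonal.
one_hot = {"cat": [1, 0, 0], "dog": [0, 1, 0], "truck": [0, 0, 1]}

# Hypothetical semantic embeddings (illustrative values only).
embedding = {"cat": [0.9, 0.1], "dog": [0.8, 0.2], "truck": [0.1, 0.9]}

def nearest_seen_class(unseen, seen_names, table):
    """Return the seen class whose vector is most similar to the unseen one."""
    return max(seen_names, key=lambda name: cosine(table[unseen], table[name]))

# Treat "dog" as the unseen class and the other two as seen classes.
one_hot_similarity = cosine(one_hot["dog"], one_hot["cat"])
closest = nearest_seen_class("dog", ["cat", "truck"], embedding)
```

Under these assumed embeddings the unseen class lands next to its semantically closest seen class, which is the kind of relationship that allows supervised knowledge to be transferred.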
Deep Lifelong Cross-modal Hashing
Hashing methods have made significant progress in cross-modal retrieval tasks
with fast query speed and low storage cost. Among them, deep learning-based
hashing achieves better performance on large-scale data due to its excellent
extraction and representation ability for nonlinear heterogeneous features.
However, two main challenges remain: catastrophic forgetting when data with
new categories arrive continuously, and the time-consuming retraining that
non-continuous hashing retrieval requires for updating. To this end, we
propose a novel deep lifelong cross-modal hashing method to achieve
lifelong hashing retrieval instead of re-training hash functions repeatedly when
new data arrive. Specifically, we design a lifelong learning strategy that
updates hash functions by training directly on the incremental data instead of
retraining new hash functions on all the accumulated data, which significantly
reduces training time. Then, we propose a lifelong hashing loss that enables the
original hash codes to participate in lifelong learning while remaining
invariant, and further preserve the similarity and dissimilarity among original
and incremental hash codes to maintain performance. Additionally, considering distribution
heterogeneity when new data arrive continuously, we introduce multi-label
semantic similarity to supervise hash learning, and a detailed analysis shows
that this similarity improves performance. Experimental results on
benchmark datasets show that the proposed method achieves performance
comparable to recent state-of-the-art cross-modal hashing methods,
and that it yields substantial average improvements of over 20% in retrieval
accuracy and reduces training time by over 80% when new data arrive continuously
- …
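The incremental update described in the abstract above can be sketched in a few lines. This is only a toy under stated assumptions: the sign-of-projection hash function, the projection values, and the helper names (`sign_code`, `extend_database`, `rank`) are hypothetical, and the paper's lifelong hashing loss is not reproduced. The sketch shows only the bookkeeping: codes of the original data stay untouched, codes are computed for the incremental data alone, and retrieval ranks the combined database by Hamming distance.

```python
def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def sign_code(features, projections):
    """Binarize linear projections of the features into +1/-1 bits."""
    return tuple(
        1 if sum(p * f for p, f in zip(row, features)) >= 0 else -1
        for row in projections
    )

def extend_database(database, new_items, projections):
    """Return a database with codes added only for the incremental items."""
    extended = dict(database)  # original codes are copied, never recomputed
    for name, features in new_items.items():
        extended[name] = sign_code(features, projections)
    return extended

def rank(database, query_code):
    """Sort item names by Hamming distance to the query (ties by name)."""
    return sorted(database, key=lambda n: (hamming(database[n], query_code), n))

# Original data hashed with the initial (hypothetical) projections.
initial_projections = [[1.0, 0.0], [0.0, 1.0]]
original = {
    "a": sign_code([1.0, 1.0], initial_projections),
    "b": sign_code([-1.0, 1.0], initial_projections),
}

# Assumed projections after training on the incremental data only.
updated_projections = [[1.0, 0.1], [0.1, 1.0]]
extended = extend_database(original, {"c": [-1.0, -1.0]}, updated_projections)
ranking = rank(extended, (-1, -1))
```

Because `extend_database` never touches existing entries, the cost of an update in this sketch scales with the size of the incremental data rather than with the accumulated database.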