1,747 research outputs found
Composite Correlation Quantization for Efficient Multimodal Retrieval
Efficient similarity retrieval from large-scale multimodal database is
pervasive in modern search engines and social networks. To support queries
across content modalities, the system should enable cross-modal correlation and
computation-efficient indexing. While hashing methods have shown great
potential in achieving this goal, current attempts generally fail to learn
isomorphic hash codes in a seamless scheme, that is, they embed multiple
modalities in a continuous isomorphic space and separately threshold embeddings
into binary codes, which incurs substantial loss of retrieval accuracy. In this
paper, we approach seamless multimodal hashing by proposing a novel Composite
Correlation Quantization (CCQ) model. Specifically, CCQ jointly finds
correlation-maximal mappings that transform different modalities into
isomorphic latent space, and learns composite quantizers that convert the
isomorphic latent features into compact binary codes. An optimization framework
is devised to preserve both intra-modal similarity and inter-modal correlation
through minimizing both reconstruction and quantization errors, which can be
trained from both paired and partially paired data in linear time. A
comprehensive set of experiments clearly show the superior effectiveness and
efficiency of CCQ against the state of the art hashing methods for both
unimodal and cross-modal retrieval
Ranking-based Deep Cross-modal Hashing
Cross-modal hashing has been receiving increasing interests for its low
storage cost and fast query speed in multi-modal data retrievals. However, most
existing hashing methods are based on hand-crafted or raw level features of
objects, which may not be optimally compatible with the coding process.
Besides, these hashing methods are mainly designed to handle simple pairwise
similarity. The complex multilevel ranking semantic structure of instances
associated with multiple labels has not been well explored yet. In this paper,
we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH
firstly uses the feature and label information of data to derive a
semi-supervised semantic ranking list. Next, to expand the semantic
representation power of hand-crafted features, RDCMH integrates the semantic
ranking information into deep cross-modal hashing and jointly optimizes the
compatible parameters of deep feature representations and of hashing functions.
Experiments on real multi-modal datasets show that RDCMH outperforms other
competitive baselines and achieves the state-of-the-art performance in
cross-modal retrieval applications
Deep Lifelong Cross-modal Hashing
Hashing methods have made significant progress in cross-modal retrieval tasks
with fast query speed and low storage cost. Among them, deep learning-based
hashing achieves better performance on large-scale data due to its excellent
extraction and representation ability for nonlinear heterogeneous features.
However, there are still two main challenges in catastrophic forgetting when
data with new categories arrive continuously, and time-consuming for
non-continuous hashing retrieval to retrain for updating. To this end, we, in
this paper, propose a novel deep lifelong cross-modal hashing to achieve
lifelong hashing retrieval instead of re-training hash function repeatedly when
new data arrive. Specifically, we design lifelong learning strategy to update
hash functions by directly training the incremental data instead of retraining
new hash functions using all the accumulated data, which significantly reduce
training time. Then, we propose lifelong hashing loss to enable original hash
codes participate in lifelong learning but remain invariant, and further
preserve the similarity and dis-similarity among original and incremental hash
codes to maintain performance. Additionally, considering distribution
heterogeneity when new data arriving continuously, we introduce multi-label
semantic similarity to supervise hash learning, and it has been proven that the
similarity improves performance with detailed analysis. Experimental results on
benchmark datasets show that the proposed methods achieves comparative
performance comparing with recent state-of-the-art cross-modal hashing methods,
and it yields substantial average increments over 20\% in retrieval accuracy
and almost reduces over 80\% training time when new data arrives continuously
- …