Ranking-based Deep Cross-modal Hashing
Cross-modal hashing has been receiving increasing interest for its low
storage cost and fast query speed in multi-modal data retrieval. However, most
existing hashing methods are based on hand-crafted or raw-level features of
objects, which may not be optimally compatible with the coding process.
Besides, these hashing methods are mainly designed to handle simple pairwise
similarity. The complex multilevel ranking semantic structure of instances
associated with multiple labels has not been well explored yet. In this paper,
we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH
first uses the feature and label information of data to derive a
semi-supervised semantic ranking list. Next, to expand the semantic
representation power of hand-crafted features, RDCMH integrates the semantic
ranking information into deep cross-modal hashing and jointly optimizes the
compatible parameters of deep feature representations and of hashing functions.
Experiments on real multi-modal datasets show that RDCMH outperforms other
competitive baselines and achieves state-of-the-art performance in
cross-modal retrieval applications.
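To make the contrast between pairwise similarity and a multilevel ranking concrete, here is a minimal sketch. It is not the RDCMH procedure: the abstract does not specify how the semi-supervised ranking list is built, so this example assumes a simple label-overlap score, and the names `shared_label_count` and `semantic_ranking` are hypothetical.

```python
def shared_label_count(labels_a, labels_b):
    """Number of labels two instances have in common (assumed similarity score)."""
    return len(set(labels_a) & set(labels_b))


def semantic_ranking(query_labels, database_labels):
    """Rank database indices by descending label overlap with the query.

    A pairwise scheme would only record 'similar' or 'dissimilar';
    the overlap count yields several similarity levels instead.
    Ties are broken by database index to keep the result deterministic.
    """
    scored = [
        (shared_label_count(query_labels, labels), idx)
        for idx, labels in enumerate(database_labels)
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in ranked]


# Toy multi-label data (illustrative only, not from any benchmark).
query = {"sky", "sea", "boat"}
database = [{"sky"}, {"sea", "boat", "sky"}, {"dog"}, {"sea", "sky"}]
ranking = semantic_ranking(query, database)
```

In this toy data the instance sharing all three labels ranks first and the one sharing none ranks last, which is the kind of graded ordering a binary similar/dissimilar label cannot express.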
Deep Binary Reconstruction for Cross-modal Hashing
With the increasing demand for massive multimodal data storage and
organization, hashing-based cross-modal retrieval has drawn much
attention. It takes the binary codes of one modality as the query to
retrieve the relevant hashing codes of another modality. However, the existing
binary constraint makes it difficult to find the optimal cross-modal hashing
function. Most approaches relax the constraint and apply a thresholding
strategy to the real-valued representation instead of directly
solving the original objective. In this paper, we first provide a concrete
analysis of the effectiveness of multimodal networks in preserving the
inter- and intra-modal consistency. Based on the analysis, we propose a
Deep Binary Reconstruction (DBRC) network that can directly learn the
binary hashing codes in an unsupervised fashion. Its advantage comes from a
simple but efficient activation function, named Adaptive Tanh
(ATanh). The ATanh function can adaptively learn the binary codes and be
trained via back-propagation. Extensive experiments on three benchmark datasets
demonstrate that DBRC outperforms several state-of-the-art methods in both
image2text and text2image retrieval tasks.
Comment: 8 pages, 5 figures, accepted by ACM Multimedia 201
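The relax-then-threshold strategy described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the DBRC network: the abstract does not give the ATanh formula, so the `scaled_tanh` function below, with a fixed scale `alpha` rather than a learned one, is only a stand-in showing how a steeper tanh approaches binary outputs. All names and embedding values are hypothetical.

```python
import math


def binarize(values):
    """Threshold a real-valued representation into a {-1, +1} code."""
    return [1 if v >= 0 else -1 for v in values]


def hamming(code_a, code_b):
    """Number of positions at which two codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def scaled_tanh(values, alpha):
    """tanh(alpha * v); larger alpha pushes outputs toward -1 or +1.

    Assumption: alpha is a fixed constant here, chosen for illustration.
    """
    return [math.tanh(alpha * v) for v in values]


# Toy real-valued embeddings (made up for this example).
image_embedding = [0.8, -0.1, 0.3, -0.9]
text_embeddings = [[0.5, -0.4, 0.2, -0.7], [-0.6, 0.9, -0.2, 0.1]]

# image2text retrieval: the image code is the query, text codes are the database.
query_code = binarize(image_embedding)
text_codes = [binarize(t) for t in text_embeddings]
distances = [hamming(query_code, code) for code in text_codes]
best_match = distances.index(min(distances))

# A steep tanh is nearly binary while still being differentiable.
soft = scaled_tanh(image_embedding, alpha=1.0)
sharp = scaled_tanh(image_embedding, alpha=50.0)
```

With `alpha=1.0` the outputs stay far from binary, so a separate thresholding step is needed; with a large `alpha` the activations are already close to -1 or +1, which hints at why an activation of this family can help learn binary codes through back-propagation.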