Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning
Due to their high retrieval efficiency and low storage cost, cross-modal
hashing methods have attracted considerable attention. Generally, compared with
shallow cross-modal hashing methods, deep cross-modal hashing methods can
achieve more satisfactory performance by integrating feature learning and
hash code optimization into a single framework. However, most existing deep
cross-modal hashing methods either cannot learn a unified hash code for the two
correlated data points of different modalities in a database instance, or cannot
use feedback from the hashing function learning procedure to guide the learning
of unified hash codes and thereby enhance retrieval accuracy. To address these
issues, in this paper we propose a novel end-to-end method, Deep Cross-Modal Hashing
with Hashing Functions and Unified Hash Codes Jointly Learning (DCHUC).
Specifically, using an iterative optimization algorithm, DCHUC jointly learns
unified hash codes for image-text pairs in a database and a pair of hash
functions for unseen query image-text pairs. With the iterative optimization
algorithm, the learned unified hash codes can be used to guide the hashing
function learning procedure; meanwhile, the learned hashing functions can
feed back to guide the unified hash code optimization procedure. Extensive
experiments on three public datasets demonstrate that the proposed method
outperforms state-of-the-art cross-modal hashing methods.
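
The alternating scheme described in the abstract above (codes guide the hash functions, and the hash functions feed back into the codes) can be illustrated with a toy sketch. This is not the DCHUC method: it assumes linear hash functions fitted by ridge regression instead of deep networks, omits any similarity-preserving loss, and all names (`alternate_hashing`, `hash_query`, `lam`, `n_bits`) are hypothetical.

```python
import numpy as np


def to_codes(Z):
    """Binarize real values to {-1, +1} (zero maps to +1)."""
    return np.where(Z >= 0, 1.0, -1.0)


def alternate_hashing(X, Y, n_bits=8, n_iters=5, lam=1.0, seed=0):
    """Toy alternating optimization, NOT the paper's algorithm.

    X: (n, dx) image features, Y: (n, dy) text features; row i of X and
    row i of Y form one database pair and share one unified code B[i].
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Start from random unified codes, one row per image-text pair.
    B = to_codes(rng.standard_normal((n, n_bits)))
    for _ in range(n_iters):
        # Step 1: fix the codes, fit one linear hash function per modality
        # by ridge regression onto the current codes.
        Wx = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ B)
        Wy = np.linalg.solve(Y.T @ Y + lam * np.eye(Y.shape[1]), Y.T @ B)
        # Step 2: fix the hash functions, update the unified codes from the
        # summed projections of both modalities.
        B = to_codes(X @ Wx + Y @ Wy)
    return B, Wx, Wy


def hash_query(features, W):
    """Hash unseen query features of one modality with its learned function."""
    return to_codes(features @ W)
```

In this sketch a database pair is represented by a single row of `B`, while an unseen query from either modality is hashed with its own function, for example `hash_query(new_image_features, Wx)`.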
Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry and Fusion
With the development of web technology, multi-modal or multi-view data has
emerged as a major stream of big data, where each modality/view encodes an
individual property of the data objects. Often, different modalities are
complementary to each other. This fact has motivated considerable research on
fusing multi-modal feature spaces to comprehensively characterize the data objects.
Most existing state-of-the-art methods focus on how to fuse the energy or
information from multi-modal spaces to deliver superior performance over
their single-modal counterparts. Recently, deep neural networks have
proven to be a powerful architecture for capturing the nonlinear distribution
of high-dimensional multimedia data, and naturally of multi-modal data as well.
Substantial empirical studies have been carried out to demonstrate the
advantages of deep multi-modal methods, which can essentially deepen
the fusion of multi-modal deep feature spaces. In this paper, we provide a
substantial overview of the existing state of the art in the field of
multi-modal data analytics, from shallow to deep spaces. Throughout this survey,
we further indicate that the critical components of this field are
collaboration, adversarial competition, and fusion over multi-modal spaces.
Finally, we share our viewpoints regarding some future directions in this
field.
Comment: Appearing at ACM TOMM, 26 pages