11,995 research outputs found
Unsupervised Generative Adversarial Cross-modal Hashing
Cross-modal hashing aims to map heterogeneous multimedia data into a common
Hamming space, which can realize fast and flexible retrieval across different
modalities. Unsupervised cross-modal hashing is more flexible and applicable
than supervised methods, since no intensive labeling work is involved. However,
existing unsupervised methods learn hashing functions by preserving inter and
intra correlations, while ignoring the underlying manifold structure across
different modalities, which is extremely helpful to capture meaningful nearest
neighbors of different modalities for cross-modal retrieval. To address the
above problem, in this paper we propose an Unsupervised Generative Adversarial
Cross-modal Hashing approach (UGACH), which makes full use of GAN's ability for
unsupervised representation learning to exploit the underlying manifold
structure of cross-modal data. The main contributions can be summarized as
follows: (1) We propose a generative adversarial network to model cross-modal
hashing in an unsupervised fashion. In the proposed UGACH, given a data of one
modality, the generative model tries to fit the distribution over the manifold
structure, and select informative data of another modality to challenge the
discriminative model. The discriminative model learns to distinguish the
generated data and the true positive data sampled from correlation graph to
achieve better retrieval accuracy. These two models are trained in an
adversarial way to improve each other and promote hashing function learning.
(2) We propose a correlation graph based approach to capture the underlying
manifold structure across different modalities, so that data of different
modalities but within the same manifold can have smaller Hamming distance and
promote retrieval accuracy. Extensive experiments compared with 6
state-of-the-art methods verify the effectiveness of our proposed approach.Comment: 8 pages, accepted by 32th AAAI Conference on Artificial Intelligence
(AAAI), 201
The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.\ud
\ud
Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.\ud
\ud
Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.\ud
\ud
First, we introduce a concept layer to enable reasoning over low-level concepts in the database.\ud
\ud
Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.\ud
\ud
Third, we add the functionality to process the users' relevance feedback.\ud
\ud
We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.\ud
\ud
We conclude with an outline for implementation of miRRor on top of the Monet extensible database system
Unsupervised Triplet Hashing for Fast Image Retrieval
Hashing has played a pivotal role in large-scale image retrieval. With the
development of Convolutional Neural Network (CNN), hashing learning has shown
great promise. But existing methods are mostly tuned for classification, which
are not optimized for retrieval tasks, especially for instance-level retrieval.
In this study, we propose a novel hashing method for large-scale image
retrieval. Considering the difficulty in obtaining labeled datasets for image
retrieval task in large scale, we propose a novel CNN-based unsupervised
hashing method, namely Unsupervised Triplet Hashing (UTH). The unsupervised
hashing network is designed under the following three principles: 1) more
discriminative representations for image retrieval; 2) minimum quantization
loss between the original real-valued feature descriptors and the learned hash
codes; 3) maximum information entropy for the learned hash codes. Extensive
experiments on CIFAR-10, MNIST and In-shop datasets have shown that UTH
outperforms several state-of-the-art unsupervised hashing methods in terms of
retrieval accuracy
- …