SADIH: Semantic-Aware DIscrete Hashing
Due to its low storage cost and fast query speed, hashing has been widely adopted
for similarity search in large-scale multimedia retrieval
applications. In particular, supervised hashing has recently received
considerable research attention by leveraging the label information to preserve
the pairwise similarities of data points in the Hamming space. However, two
crucial bottlenecks remain: 1) learning to preserve the full pairwise
similarities is computationally unaffordable and does not scale to big data;
2) the available category information of the data is not well explored for
learning discriminative hash functions. To overcome these
challenges, we propose a unified Semantic-Aware DIscrete Hashing (SADIH)
framework, which aims to directly embed the transformed semantic information
into the asymmetric similarity approximation and discriminative hashing
function learning. Specifically, a semantic-aware latent embedding is
introduced to asymmetrically preserve the full pairwise similarities while
skillfully handling the cumbersome n × n pairwise similarity matrix.
Meanwhile, a semantic-aware autoencoder is developed to jointly preserve the
data structures in the discriminative latent semantic space and perform data
reconstruction. Moreover, an efficient alternating optimization algorithm is
proposed to solve the resulting discrete optimization problem. Extensive
experimental results on multiple large-scale datasets demonstrate that our
SADIH can clearly outperform the state-of-the-art baselines with the additional
benefit of lower computational costs.
Comment: Accepted by The Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
Deep Hashing with Triplet Quantization Loss
With the explosive growth of image databases, deep hashing, which learns
compact binary descriptors for images, has become critical for fast image
retrieval. Many existing deep hashing methods leverage quantization loss,
defined as the distance between the features before and after quantization, to
reduce the error from binarizing features. While minimizing the quantization
loss guarantees that quantization has minimal effect on retrieval accuracy, it
unfortunately significantly reduces the expressiveness of features even before
the quantization. In this paper, we show that the above definition of
quantization loss is too restricted and in fact not necessary for maintaining
high retrieval accuracy. We therefore propose a new form of quantization loss
measured in triplets. The core idea of the triplet quantization loss is to
learn discriminative real-valued descriptors that lead to minimal loss of
retrieval accuracy after quantization. Extensive experiments on two widely used
benchmark data sets of different scales, CIFAR-10 and In-shop, demonstrate that
the proposed method outperforms the state-of-the-art deep hashing methods.
Moreover, we show that the compact binary descriptors obtained with triplet
quantization loss lead to a very small performance drop after quantization.
Comment: 4 pages, to be presented at IEEE VCIP 201
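For reference, the conventional quantization loss that the abstract describes (the distance between the features before and after quantization) can be sketched as follows. This is a minimal illustration that assumes sign binarization to {-1, +1} and squared Euclidean distance; the function names are hypothetical, and the triplet quantization loss proposed in the paper is not reproduced here.

```python
def binarize(features):
    """Quantize a real-valued descriptor to {-1, +1} by sign (assumed scheme)."""
    return [1.0 if x >= 0 else -1.0 for x in features]


def quantization_loss(features):
    """Squared Euclidean distance between a descriptor and its binarized copy."""
    codes = binarize(features)
    return sum((x - b) ** 2 for x, b in zip(features, codes))


# A descriptor that already sits at the binary corners incurs no loss;
# one that lies far from them incurs a larger loss.
saturated = [1.0, -1.0, 1.0, -1.0]
unsaturated = [0.5, -0.5, 0.5, -0.5]
print(quantization_loss(saturated), quantization_loss(unsaturated))
```

Minimizing this term pushes every feature value toward -1 or +1, which is the restriction on feature expressiveness that the abstract argues is unnecessary.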
Deep Spherical Quantization for Image Search
Hashing methods, which encode high-dimensional images with compact discrete
codes, have been widely applied to enhance large-scale image retrieval. In this
paper, we put forward Deep Spherical Quantization (DSQ), a novel method to make
deep convolutional neural networks generate supervised and compact binary codes
for efficient image search. Our approach simultaneously learns a mapping that
transforms the input images into a low-dimensional discriminative space, and
quantizes the transformed data points using multi-codebook quantization. To
eliminate the negative effect of norm variance on codebook learning, we force
the network to L2-normalize the extracted features and then quantize the
resulting vectors using a new supervised quantization technique specifically
designed for points lying on a unit hypersphere. Furthermore, we introduce an
easy-to-implement extension of our quantization technique that enforces
sparsity on the codebooks. Extensive experiments demonstrate that DSQ and its
sparse variant can generate semantically separable compact binary codes,
outperforming many state-of-the-art image retrieval methods on three
benchmarks.
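The two steps named in the abstract, L2 normalization followed by multi-codebook quantization, can be illustrated with a small sketch. It assumes fixed, hand-written codebooks and a greedy residual encoding, which is a simple stand-in and not the supervised quantization technique that DSQ learns; all names and values are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean norm so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def residual_encode(v, codebooks):
    """Greedy encoding: pick one codeword per codebook, quantizing the residual each time."""
    residual = list(v)
    indices = []
    for codebook in codebooks:
        best = min(range(len(codebook)),
                   key=lambda i: squared_distance(residual, codebook[i]))
        indices.append(best)
        residual = [r - c for r, c in zip(residual, codebook[best])]
    return indices


def decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical 2-D codebooks: a coarse one and a fine one for the residual.
codebooks = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
    [[0.1, 0.1], [0.1, -0.1], [-0.1, 0.1], [-0.1, -0.1]],
]
unit = l2_normalize([3.0, 0.5])
indices = residual_encode(unit, codebooks)
approx = decode(indices, codebooks)
print(indices, approx)
```

Each vector is stored as one small index per codebook, and normalizing first means the codebooks only have to model direction, not norm.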
Deep Ordinal Hashing with Spatial Attention
Hashing has attracted increasing research attention in recent years due to
its high efficiency of computation and storage in image retrieval. Recent works
have demonstrated the superiority of learning feature representations and
hash functions simultaneously with deep neural networks. However, most existing deep
hashing methods directly learn the hash functions by encoding the global
semantic information, while ignoring the local spatial information of images.
The loss of local spatial structure creates a performance bottleneck for hash
functions, thereby limiting their application to accurate similarity
retrieval. In this work, we propose a novel Deep Ordinal Hashing (DOH) method,
which learns ordinal representations by leveraging the ranking structure of
feature space from both local and global views. In particular, to effectively
build the ranking structure, we propose to learn the rank correlation space by
exploiting the local spatial information from a Fully Convolutional Network (FCN)
and the global semantic information from a Convolutional Neural Network (CNN)
simultaneously. More specifically, an effective spatial attention model is
designed to capture the local spatial information by selectively learning
well-specified locations closely related to target objects. In this hashing
framework, the local spatial and global semantic nature of images is captured
in an end-to-end ranking-to-hashing manner. Experimental results on
three widely used datasets demonstrate that the proposed DOH method
significantly outperforms the state-of-the-art hashing methods.
Unsupervised Triplet Hashing for Fast Image Retrieval
Hashing has played a pivotal role in large-scale image retrieval. With the
development of Convolutional Neural Networks (CNNs), hash learning has shown
great promise. However, existing methods are mostly tuned for classification and
are not optimized for retrieval tasks, especially instance-level retrieval.
In this study, we propose a novel hashing method for large-scale image
retrieval. Considering the difficulty of obtaining labeled datasets for
large-scale image retrieval, we propose a novel CNN-based unsupervised
hashing method, namely Unsupervised Triplet Hashing (UTH). The unsupervised
hashing network is designed under the following three principles: 1) more
discriminative representations for image retrieval; 2) minimum quantization
loss between the original real-valued feature descriptors and the learned hash
codes; 3) maximum information entropy for the learned hash codes. Extensive
experiments on the CIFAR-10, MNIST and In-shop datasets show that UTH
outperforms several state-of-the-art unsupervised hashing methods in terms of
retrieval accuracy.
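The third principle above, maximum information entropy for the learned hash codes, can be made concrete with a small sketch. It assumes codes stored as lists of 0/1 bits and uses the mean per-bit Shannon entropy as a proxy; the abstract does not state the exact objective UTH uses, so the function names and the choice of measure are assumptions.

```python
import math


def bit_entropy(codes, bit):
    """Shannon entropy (in bits) of one code position across a set of hash codes."""
    p = sum(code[bit] for code in codes) / len(codes)
    if p in (0.0, 1.0):
        return 0.0  # a constant bit carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def mean_bit_entropy(codes):
    """Average per-bit entropy; 1.0 means every bit is on for exactly half the items."""
    n_bits = len(codes[0])
    return sum(bit_entropy(codes, k) for k in range(n_bits)) / n_bits


balanced = [[0, 1], [1, 0], [0, 0], [1, 1]]
collapsed = [[1, 1], [1, 1], [1, 1], [1, 0]]
print(mean_bit_entropy(balanced), mean_bit_entropy(collapsed))
```

Codes whose bits are evenly split across the dataset score highest, while codes that collapse toward a constant pattern score near zero.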
SCH-GAN: Semi-supervised Cross-modal Hashing by Generative Adversarial Network
Cross-modal hashing aims to map heterogeneous multimedia data into a common
Hamming space, which can realize fast and flexible retrieval across different
modalities. Supervised cross-modal hashing methods have achieved considerable
progress by incorporating semantic side information. However, they have two main
limitations: (1) they rely heavily on large-scale labeled cross-modal training
data, which are labor-intensive and hard to obtain; (2) they ignore the rich
information contained in the large amount of unlabeled data across different
modalities, especially the margin examples that are easily retrieved
incorrectly, which can help to model the correlations. To address these problems,
in this paper we propose a novel Semi-supervised Cross-Modal Hashing approach
by Generative Adversarial Network (SCH-GAN). We aim to take advantage of GAN's
ability to model data distributions to promote cross-modal hashing learning
in an adversarial way. The main contributions can be summarized as follows: (1)
We propose a novel generative adversarial network for cross-modal hashing. In
our proposed SCH-GAN, the generative model tries to select margin examples of
one modality from unlabeled data when given a query of another modality, while
the discriminative model tries to distinguish the selected examples from the true
positive examples of the query. These two models play a minimax game so that
the generative model can promote the hashing performance of the discriminative
model. (2) We propose a reinforcement learning based algorithm to drive the
training of the proposed SCH-GAN. The generative model takes the correlation score
predicted by the discriminative model as a reward, and tries to select the examples
close to the margin to promote the discriminative model by maximizing the margin
between positive and negative data. Experiments on three widely used datasets
verify the effectiveness of our proposed approach.
Comment: 12 pages, submitted to IEEE Transactions on Cybernetics
Evaluation of Hashing Methods Performance on Binary Feature Descriptors
In this paper we evaluate the performance of data-dependent hashing methods on
binary data. The goal is to find a hashing method that can effectively produce
a lower-dimensional binary representation of 512-bit FREAK descriptors. A
representative sample of recent unsupervised, semi-supervised and supervised
hashing methods was experimentally evaluated on large datasets of labelled
binary FREAK feature descriptors.
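Binary descriptors such as these are typically compared by Hamming distance, which is also the metric used on the shorter codes a hashing method produces. The sketch below assumes descriptors stored as Python integers and uses random bit strings in place of real FREAK data; the helper names are hypothetical, and no hashing method from the paper is implemented.

```python
import random

DESCRIPTOR_BITS = 512  # FREAK descriptor length stated in the abstract


def hamming_distance(a, b):
    """Number of differing bits between two descriptors stored as Python ints."""
    return bin(a ^ b).count("1")


def nearest(query, database):
    """Index of the database descriptor closest to the query in Hamming distance."""
    return min(range(len(database)),
               key=lambda i: hamming_distance(query, database[i]))


# Random stand-ins for real descriptors (assumption: no FREAK data is available here).
rng = random.Random(0)
database = [rng.getrandbits(DESCRIPTOR_BITS) for _ in range(100)]

# Flip the three lowest bits of entry 42 to simulate a noisy observation of it.
query = database[42] ^ 0b111
print(nearest(query, database), hamming_distance(query, database[42]))
```

A linear scan like this costs one XOR and one bit count per database entry, so shortening the codes reduces both storage and comparison time.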
Unsupervised Multi-modal Hashing for Cross-modal retrieval
With the advantages of low storage cost and high efficiency, hash learning
has received much attention in the domain of Big Data. In this paper, we
propose a novel unsupervised hash learning method to address the open
problem of directly preserving the manifold structure by hashing. To this
end, both the semantic correlation in the textual space and the local
geometric structure in the visual space are explored simultaneously in our
framework. Besides, an ℓ2,1-norm constraint is imposed on the projection
matrices to learn the discriminative hash function for each modality. Extensive
experiments are performed to evaluate the proposed method on three publicly
available datasets, and the experimental results show that our method can
achieve superior performance over the state-of-the-art methods.
Comment: 4 pages, 4 figures
Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval
Unsupervised hashing can desirably support scalable content-based image
retrieval (SCBIR) for its appealing advantages of semantic label independence,
memory and search efficiency. However, the learned hash codes are embedded with
limited discriminative semantics due to the intrinsic limitation of image
representation. To address this problem, we propose a novel
hashing approach, dubbed Discrete Semantic Transfer Hashing (DSTH).
The key idea is to directly augment the semantics of discrete image hash
codes by exploring auxiliary contextual modalities. To this end, a unified
hashing framework is formulated to simultaneously preserve visual similarities
of images and perform semantic transfer from contextual modalities. Further, to
guarantee direct semantic transfer and avoid information loss, we explicitly
impose the discrete constraint, bit-uncorrelation constraint and bit-balance
constraint on hash codes. A novel and effective discrete optimization method
based on the augmented Lagrangian multiplier is developed to iteratively solve the
optimization problem. The whole learning process has linear computational
complexity and desirable scalability. Experiments on three benchmark datasets
demonstrate the superiority of DSTH compared with several state-of-the-art
approaches.
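The three constraints named in the abstract can be checked on a toy code matrix. The sketch below assumes the formulation common in the hashing literature: codes in {-1, +1}, bit balance meaning each bit sums to zero over the dataset, and bit uncorrelation meaning the off-diagonal entries of (1/n) BᵀB are zero. The abstract does not spell out DSTH's exact definitions, so these are assumptions and the function names are hypothetical.

```python
def is_discrete(codes):
    """Discrete constraint: every entry is exactly -1 or +1."""
    return all(b in (-1, 1) for code in codes for b in code)


def bit_balance_violation(codes):
    """Largest absolute per-bit mean; 0 means each bit is +1 for half the items."""
    n, r = len(codes), len(codes[0])
    return max(abs(sum(code[k] for code in codes)) / n for k in range(r))


def bit_correlation_violation(codes):
    """Largest absolute off-diagonal entry of (1/n) * B^T B; 0 means uncorrelated bits."""
    n, r = len(codes), len(codes[0])
    worst = 0.0
    for j in range(r):
        for k in range(j + 1, r):
            corr = sum(code[j] * code[k] for code in codes) / n
            worst = max(worst, abs(corr))
    return worst


good = [[1, 1], [1, -1], [-1, 1], [-1, -1]]       # balanced and uncorrelated
redundant = [[1, 1], [1, 1], [-1, -1], [-1, -1]]  # balanced, but bit 2 copies bit 1
for codes in (good, redundant):
    print(is_discrete(codes), bit_balance_violation(codes),
          bit_correlation_violation(codes))
```

The second example shows why balance alone is not enough: both bits are evenly split, yet the second bit adds no information beyond the first.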
Simultaneous Feature Aggregating and Hashing for Large-scale Image Search
In most state-of-the-art hashing-based visual search systems, local image
descriptors of an image are first aggregated as a single feature vector. This
feature vector is then subjected to a hashing function that produces a binary
hash code. In previous work, the aggregating and the hashing processes are
designed independently. In this paper, we propose a novel framework where
feature aggregating and hashing are designed simultaneously and optimized
jointly. Specifically, our joint optimization produces aggregated
representations that can be better reconstructed by some binary codes. This
leads to more discriminative binary hash codes and improved retrieval accuracy.
In addition, we also propose a fast version of the recently-proposed Binary
Autoencoder to be used in our proposed framework. We perform extensive
retrieval experiments on several benchmark datasets with both SIFT and
convolutional features. Our results suggest that the proposed framework
achieves significant improvements over the state of the art.
Comment: Accepted to CVPR 201