A Survey on Learning to Hash
Nearest neighbor search is the problem of finding the data points in a
database whose distances to the query point are smallest.
Learning to hash is one of the major solutions to this problem and has been
widely studied recently. In this paper, we present a comprehensive survey of
the learning to hash algorithms, categorize them according to how they
preserve similarity into pairwise similarity preserving, multiwise
similarity preserving, implicit similarity preserving, and quantization,
and discuss their relations. We treat quantization separately from pairwise
similarity preserving because its objective function is very different, even
though quantization, as we show, can be derived from preserving pairwise
similarities. In addition,
we present the evaluation protocols and a general performance analysis, and
point out that quantization algorithms perform best in terms of search
accuracy, search time cost, and space cost. Finally, we introduce a few
emerging topics.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI)
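To make the setting concrete, here is a minimal sketch (not from the survey itself) of the operation that learning to hash accelerates: brute-force nearest neighbor search over binary codes under Hamming distance. The function name and toy data are illustrative.

```python
import numpy as np

def hamming_search(db_codes, query_code, k=3):
    """Return indices of the k database codes nearest to the query
    in Hamming distance (brute force, for illustration)."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")[:k]

db = np.array([[0, 0, 0, 0],
               [1, 1, 1, 1],
               [0, 0, 0, 1],
               [1, 0, 0, 0]])
q = np.array([0, 0, 0, 0])
nearest = hamming_search(db, q, k=2)  # index 0 (distance 0), then 2 (distance 1)
```

Learned hash functions aim to make this cheap comparison meaningful, so that small Hamming distances correspond to semantically similar items.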
SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval
Hashing methods have been widely used for efficient similarity retrieval on
large-scale image databases. Traditional hashing methods learn hash functions to
generate binary codes from hand-crafted features, which achieve limited
accuracy since the hand-crafted features cannot optimally represent the image
content and preserve the semantic similarity. Recently, several deep hashing
methods have shown better performance because the deep architectures generate
more discriminative feature representations. However, these deep hashing
methods are mainly designed for supervised scenarios, which only exploit the
semantic similarity information, but ignore the underlying data structures. In
this paper, we propose the semi-supervised deep hashing (SSDH) approach, to
perform more effective hash function learning by simultaneously preserving
semantic similarity and underlying data structures. The main contributions are
as follows: (1) We propose a semi-supervised loss to jointly minimize the
empirical error on labeled data, as well as the embedding error on both labeled
and unlabeled data, which can preserve the semantic similarity and capture the
meaningful neighbors on the underlying data structures for effective hashing.
(2) A semi-supervised deep hashing network is designed to extensively exploit
both labeled and unlabeled data, in which we propose an online graph
construction method to benefit from the evolving deep features during training
to better capture semantic neighbors. To the best of our knowledge, the
proposed deep network is the first deep hashing method that can perform hash
code learning and feature learning simultaneously in a semi-supervised fashion.
Experimental results on five widely used datasets show that our proposed
approach outperforms the state-of-the-art hashing methods.
Comment: 14 pages, accepted by IEEE Transactions on Circuits and Systems for
Video Technology
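The online graph construction step can be illustrated with a minimal sketch, assuming a plain cosine-similarity kNN graph that is rebuilt from the current features each time it is called; the function name and toy features are hypothetical, not the paper's implementation.

```python
import numpy as np

def knn_graph(features, k=1):
    """Build a kNN adjacency matrix from the current feature vectors
    using cosine similarity. Re-running this as features evolve during
    training is the essence of online graph construction."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)           # exclude self-similarity
    n = len(features)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        adj[i, np.argsort(-sim[i])[:k]] = 1  # link each point to its k nearest
    return adj

feats = np.array([[1.0, 0.0],
                  [0.9, 0.1],
                  [0.0, 1.0]])
G = knn_graph(feats, k=1)  # points 0 and 1 pick each other; point 2 picks 1
```

The graph edges then supply the "meaningful neighbors" over which an embedding error can be minimized for unlabeled data.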
Learning to Hash for Indexing Big Data - A Survey
The explosive growth in big data has attracted much attention in designing
efficient indexing and search methods recently. In many critical applications
such as large-scale search and pattern matching, finding the nearest neighbors
to a query is a fundamental research problem. However, the straightforward
solution using exhaustive comparison is infeasible due to the prohibitive
computational complexity and memory requirement. In response, Approximate
Nearest Neighbor (ANN) search based on hashing techniques has become popular
due to its promising performance in both efficiency and accuracy. Prior
randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore
data-independent hash functions with random projections or permutations.
Although they have elegant theoretical guarantees on search quality in certain
metric spaces, randomized hashing methods have shown insufficient performance in
many real-world applications. As a remedy, new approaches incorporating
data-driven learning methods in development of advanced hash functions have
emerged. Such learning to hash methods exploit information such as data
distributions or class labels when optimizing the hash codes or functions.
Importantly, the learned hash codes preserve, in the hash code space, the
proximity of neighboring data in the original feature space. The
goal of this paper is to provide readers with systematic understanding of
insights, pros and cons of the emerging techniques. We provide a comprehensive
survey of the learning to hash framework and representative techniques of
various types, including unsupervised, semi-supervised, and supervised. In
addition, we also summarize recent hashing approaches utilizing the deep
learning models. Finally, we discuss future directions and trends of research
in this area.
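As a concrete reference point for the data-independent baseline the survey contrasts with, here is a minimal sketch of random-projection LSH: the sign pattern of a point under a set of hyperplanes is its binary code, and nearby points tend to agree on more bits. The hyperplanes are fixed here for reproducibility; in real LSH they are drawn at random.

```python
import numpy as np

def lsh_hash(x, planes):
    """Binary code = sign pattern of the point under each hyperplane."""
    return (planes @ x > 0).astype(int)

def hamming(u, v):
    return int(np.count_nonzero(u != v))

# Six fixed hyperplanes in 2-D (randomly drawn in practice)
planes = np.array([[1.0,  0.0], [0.0,  1.0], [ 1.0, 1.0],
                   [1.0, -1.0], [-1.0, 2.0], [ 2.0, -1.0]])

a = np.array([1.0, 0.9])
b = np.array([0.9, 1.0])    # close to a
c = np.array([-1.0, -1.0])  # roughly opposite a
```

Here a and b disagree on only one bit, while a and c disagree on all six, showing how code distance tracks geometric proximity.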
Improved Search in Hamming Space using Deep Multi-Index Hashing
Similarity-preserving hashing is a widely-used method for nearest neighbour
search in large-scale image retrieval tasks. There has been considerable
research on generating efficient image representation via the
deep-network-based hashing methods. However, the issue of efficient searching
in the deep representation space remains largely unsolved. To this end, we
propose a simple yet efficient deep-network-based multi-index hashing method
that simultaneously learns a powerful image representation and enables
efficient search. To achieve these two goals, we introduce the multi-index hashing
(MIH) mechanism into the proposed deep architecture, which divides the binary
codes into multiple substrings. Because non-uniformly distributed codes result
in inefficient searching, we add two balance constraints at the feature level
and the instance level, respectively. Extensive evaluations on
several benchmark image retrieval datasets show that the learned balanced
binary codes bring dramatic speedups and achieve performance comparable to
the existing baselines.
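The substring mechanism can be sketched as follows (a minimal illustration, not the paper's code): split each b-bit code into m substrings and index each in its own hash table. By the pigeonhole principle, any code within Hamming distance r < m of the query matches the query exactly in at least one substring, so exact substring lookups produce a complete candidate set.

```python
import numpy as np
from collections import defaultdict

def build_index(codes, m):
    """One hash table per substring: table j maps a substring value
    to the list of code indices containing it."""
    n, b = codes.shape
    step = b // m
    tables = [defaultdict(list) for _ in range(m)]
    for i, c in enumerate(codes):
        for j in range(m):
            tables[j][tuple(c[j*step:(j+1)*step])].append(i)
    return tables, step

def mih_query(tables, step, q, codes, r):
    """All codes within Hamming distance r of q, assuming r < m:
    at least one substring of any such code matches q exactly."""
    cand = set()
    for j, table in enumerate(tables):
        cand.update(table.get(tuple(q[j*step:(j+1)*step]), []))
    return sorted(i for i in cand
                  if np.count_nonzero(codes[i] != q) <= r)

codes = np.array([[0, 0, 0, 0, 0, 0, 0, 0],
                  [1, 1, 1, 1, 1, 1, 1, 1],
                  [0, 0, 0, 0, 1, 0, 0, 0]])
tables, step = build_index(codes, m=2)
result = mih_query(tables, step, np.zeros(8, dtype=int), codes, r=1)
```

The balance constraints in the paper matter precisely because this lookup degrades when many codes share the same substring bucket.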
Hashing with Mutual Information
Binary vector embeddings enable fast nearest neighbor retrieval in large
databases of high-dimensional objects, and play an important role in many
practical applications, such as image and video retrieval. We study the problem
of learning binary vector embeddings under a supervised setting, also known as
hashing. We propose a novel supervised hashing method based on optimizing an
information-theoretic quantity: mutual information. We show that optimizing
mutual information can reduce ambiguity in the induced neighborhood structure
in the learned Hamming space, which is essential in obtaining high retrieval
performance. To this end, we optimize mutual information in deep neural
networks with minibatch stochastic gradient descent, with a formulation that
maximally and efficiently utilizes available supervision. Experiments on four
image retrieval benchmarks, including ImageNet, confirm the effectiveness of
our method in learning high-quality binary embeddings for nearest neighbor
retrieval.
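As an illustration of the quantity being optimized (a simplified sketch, not the paper's differentiable estimator), the mutual information between a pair's neighbor label and its quantized Hamming distance can be computed from the joint histogram:

```python
import numpy as np

def mutual_information(labels, dists, n_bins):
    """Discrete MI (in bits) between neighbor labels (0/1) and
    quantized Hamming distances. Higher MI means the distance
    distributions of neighbors and non-neighbors overlap less."""
    joint = np.zeros((2, n_bins))
    for y, d in zip(labels, dists):
        joint[y, d] += 1
    joint /= joint.sum()
    py = joint.sum(axis=1, keepdims=True)   # marginal over labels
    pd = joint.sum(axis=0, keepdims=True)   # marginal over distances
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] / (py @ pd)[mask])).sum())

# Perfectly separated: neighbors at distance 0, non-neighbors at 2 -> 1 bit
mi_sep = mutual_information([1, 1, 0, 0], [0, 0, 2, 2], n_bins=3)
# Fully overlapping distance distributions -> 0 bits
mi_mix = mutual_information([1, 0, 1, 0], [0, 0, 2, 2], n_bins=3)
```

Driving this quantity up during training pushes neighbors and non-neighbors toward disjoint Hamming-distance ranges, which is the ambiguity reduction the abstract refers to.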
Deep Policy Hashing Network with Listwise Supervision
Deep-networks-based hashing has become a leading approach for large-scale
image retrieval, which learns a similarity-preserving network to map similar
images to nearby hash codes. The pairwise and triplet losses are two widely
used similarity preserving manners for deep hashing. These manners ignore the
fact that hashing is a prediction task on the list of binary codes. However,
learning deep hashing with listwise supervision is challenging in 1) how to
obtain the rank list of whole training set when the batch size of the deep
network is always small and 2) how to utilize the listwise supervision. In this
paper, we present a novel deep policy hashing architecture with two systems are
learned in parallel: a query network and a shared and slowly changing database
network. The following three steps are repeated until convergence: 1) the
database network encodes all training samples into binary codes to obtain a
whole rank list, 2) the query network is trained based on policy learning to
maximize a reward that indicates the performance of the whole ranking list of
binary codes, e.g., mean average precision (MAP), and 3) the database network
is updated as the query network. Extensive evaluations on several benchmark
datasets show that the proposed method brings substantial improvements over
state-of-the-art hashing methods.
Comment: 8 pages, accepted by ACM ICM
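The MAP reward used in step 2 can be computed as in the following sketch; the function name and toy inputs are illustrative.

```python
import numpy as np

def mean_average_precision(rankings, relevances):
    """MAP over queries: the scalar reward the policy maximizes.

    rankings:   list of ranked database indices, one list per query
    relevances: list of sets of relevant indices, one set per query
    """
    aps = []
    for rank, rel in zip(rankings, relevances):
        hits, precisions = 0, []
        for pos, idx in enumerate(rank, start=1):
            if idx in rel:
                hits += 1
                precisions.append(hits / pos)  # precision at each hit
        aps.append(np.mean(precisions) if precisions else 0.0)
    return float(np.mean(aps))

# One query: relevant items {0, 2} ranked at positions 1 and 3
map_score = mean_average_precision([[0, 1, 2]], [{0, 2}])  # (1/1 + 2/3) / 2
```

Because MAP is non-differentiable in the ranks, treating it as a policy-learning reward sidesteps the need for a smooth surrogate.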
Transductive Zero-Shot Hashing via Coarse-to-Fine Similarity Mining
Zero-shot Hashing (ZSH) learns hashing models for novel/target classes without
training data for them, which is an important and challenging problem. Most
existing ZSH approaches exploit transfer learning via intermediate shared
semantic representations between the seen/source classes and the novel/target
classes. However, because the source and target classes are disjoint, the hash
functions learned from the source dataset are biased when applied directly to
the target classes. In this
paper, we study the transductive ZSH, i.e., we have unlabeled data for novel
classes. We put forward a simple yet efficient joint learning approach via
coarse-to-fine similarity mining, which transfers knowledge from source data to
target data. It mainly consists of two building blocks in the proposed deep
architecture: 1) a shared two-stream network, in which the first stream
operates on the source data and the second on the unlabeled target data, to
learn effective common image representations, and 2) a coarse-to-fine
module, which begins by finding the most representative images from the target
classes and then detects similarities among these images, to transfer
the similarities of the source data to the target data in a greedy fashion.
Extensive evaluation results on several benchmark datasets demonstrate that the
proposed hashing method achieves significant improvement over the
state-of-the-art methods.
Feature Learning based Deep Supervised Hashing with Pairwise Labels
Recent years have witnessed wide application of hashing for large-scale image
retrieval. However, most existing hashing methods are based on hand-crafted
features which might not be optimally compatible with the hashing procedure.
Recently, deep hashing methods have been proposed to perform simultaneous
feature learning and hash-code learning with deep neural networks, which have
shown better performance than traditional hashing methods with hand-crafted
features. Most of these deep hashing methods are supervised, with the
supervision given as triplet labels. For the other common application
scenario with pairwise labels, no existing method performs simultaneous
feature learning and hash-code learning. In this paper, we propose a novel deep
hashing method, called deep pairwise-supervised hashing (DPSH), to perform
simultaneous feature learning and hash-code learning for applications with
pairwise labels. Experiments on real datasets show that our DPSH method can
outperform other methods to achieve the state-of-the-art performance in image
retrieval applications.
Comment: IJCAI 201
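A pairwise likelihood loss in this spirit can be sketched as follows, assuming a common formulation where p(s_ij = 1) = sigmoid(0.5 * u_i . u_j) over relaxed real-valued codes; this is an illustration of the idea, not DPSH's exact objective.

```python
import numpy as np

def pairwise_loss(U, S):
    """Negative log-likelihood of pairwise labels given code inner products.

    U: (n, b) real-valued relaxed codes
    S: (n, n) 0/1 pairwise similarity labels
    """
    theta = 0.5 * U @ U.T
    # -log p = log(1 + e^theta) - s * theta, computed stably
    return float(np.sum(np.logaddexp(0.0, theta) - S * theta))

S = np.ones((2, 2))                          # both items labeled similar
aligned = np.array([[1.0, 1.0], [ 1.0,  1.0]])
opposed = np.array([[1.0, 1.0], [-1.0, -1.0]])
# Aligned codes for a similar pair incur a lower loss than opposed ones
```

Minimizing this loss pushes similar pairs toward large code inner products (small Hamming distance after binarization) and dissimilar pairs toward small ones.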
A Comprehensive Survey on Cross-modal Retrieval
In recent years, cross-modal retrieval has drawn much attention due to the
rapid growth of multimodal data. It takes one type of data as the query to
retrieve relevant data of another type. For example, a user can use a text to
retrieve relevant pictures or videos. Since the query and its retrieved results
can be of different modalities, how to measure the content similarity between
different modalities of data remains a challenge. Various methods have been
proposed to deal with such a problem. In this paper, we first review a number
of representative methods for cross-modal retrieval and classify them into two
main groups: 1) real-valued representation learning, and 2) binary
representation learning. Real-valued representation learning methods aim to
learn real-valued common representations for different modalities of data. To
speed up the cross-modal retrieval, a number of binary representation learning
methods are proposed to map different modalities of data into a common Hamming
space. Then, we introduce several multimodal datasets in the community, and
show the experimental results on two commonly used multimodal datasets. The
comparison reveals the characteristic of different kinds of cross-modal
retrieval methods, which is expected to benefit both practical applications and
future research. Finally, we discuss open problems and future research
directions.
Comment: 20 pages, 11 figures, 9 tables
Deep Ordinal Hashing with Spatial Attention
Hashing has attracted increasing research attention in recent years due to
its high efficiency of computation and storage in image retrieval. Recent works
have demonstrated the superiority of simultaneous feature representations and
hash functions learning with deep neural networks. However, most existing deep
hashing methods directly learn the hash functions by encoding the global
semantic information, while ignoring the local spatial information of images.
The loss of local spatial structure creates a performance bottleneck for the
hash functions, limiting their application to accurate similarity
retrieval. In this work, we propose a novel Deep Ordinal Hashing (DOH) method,
which learns ordinal representations by leveraging the ranking structure of
feature space from both local and global views. In particular, to effectively
build the ranking structure, we propose to learn the rank correlation space by
exploiting the local spatial information from Fully Convolutional Network (FCN)
and the global semantic information from the Convolutional Neural Network (CNN)
simultaneously. More specifically, an effective spatial attention model is
designed to capture the local spatial information by selectively learning
well-specified locations closely related to target objects. In this hashing
framework, the local spatial and global semantic natures of images are captured
in an end-to-end ranking-to-hashing manner. Experimental results conducted on
three widely-used datasets demonstrate that the proposed DOH method
significantly outperforms the state-of-the-art hashing methods.
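The spatial attention idea can be illustrated with a minimal sketch (not the paper's learned model), where per-location scores are softmax-normalized and used to pool the feature map, so strongly activated locations dominate the pooled representation:

```python
import numpy as np

def attention_pool(fmap):
    """Spatial attention pooling over an (h, w, c) feature map.

    Score per location = mean channel activation (a simplified
    stand-in for a learned attention model); pooled feature =
    softmax-weighted sum over locations.
    """
    h, w, c = fmap.shape
    scores = fmap.mean(axis=2).ravel()            # one score per location
    weights = np.exp(scores - scores.max())       # stable softmax
    weights /= weights.sum()
    return (fmap.reshape(h * w, c) * weights[:, None]).sum(axis=0)

# Uniform map: attention reduces to plain average pooling
pooled = attention_pool(np.ones((2, 2, 3)))
```

In the paper's setting such weights would be predicted by the FCN stream, letting the hash code focus on object-relevant regions rather than the whole image uniformly.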