Learning to Hash for Indexing Big Data - A Survey
The explosive growth in big data has attracted much attention in designing
efficient indexing and search methods recently. In many critical applications
such as large-scale search and pattern matching, finding the nearest neighbors
to a query is a fundamental research problem. However, the straightforward
solution using exhaustive comparison is infeasible due to the prohibitive
computational complexity and memory requirement. In response, Approximate
Nearest Neighbor (ANN) search based on hashing techniques has become popular
due to its promising performance in both efficiency and accuracy. Prior
randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore
data-independent hash functions with random projections or permutations.
Although these methods offer elegant theoretical guarantees on search quality in certain
metric spaces, their performance has been shown to be insufficient in
many real-world applications. As a remedy, new approaches incorporating
data-driven learning methods in development of advanced hash functions have
emerged. Such learning to hash methods exploit information such as data
distributions or class labels when optimizing the hash codes or functions.
Importantly, the learned hash codes preserve, in the hash code space, the
proximity of data that are neighbors in the original feature space. The
goal of this paper is to provide readers with systematic understanding of
insights, pros and cons of the emerging techniques. We provide a comprehensive
survey of the learning to hash framework and representative techniques of
various types, including unsupervised, semi-supervised, and supervised. In
addition, we also summarize recent hashing approaches utilizing the deep
learning models. Finally, we discuss the future direction and trends of
research in this area.
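
As a rough illustration of the data-independent baseline described in this abstract, the sketch below hashes vectors with random hyperplane projections (one common LSH family) and compares codes by Hamming distance. The function names, the 16-bit code length, and the toy vectors are assumptions made for this example; nothing here is taken from the survey itself.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane normal per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(x, hyperplanes):
    """Each bit records which side of a hyperplane the vector x falls on."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else 0
        for w in hyperplanes
    ]


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(u != v for u, v in zip(a, b))


planes = make_hyperplanes(num_bits=16, dim=3)
query = [1.0, 0.2, -0.5]
near = [0.9, 0.25, -0.45]   # small angle to the query
far = [-1.0, -0.2, 0.5]     # exact negation of the query

code_q = lsh_code(query, planes)
print("near:", hamming(code_q, lsh_code(near, planes)))
print("far: ", hamming(code_q, lsh_code(far, planes)))  # negation flips every bit
```

The hash functions here never look at the data, which is the limitation that learning-to-hash methods aim to address by fitting the projections to the data distribution or labels.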
Targeted Attack for Deep Hashing based Retrieval
The deep hashing based retrieval method is widely adopted in large-scale
image and video retrieval. However, there is little investigation on its
security. In this paper, we propose a novel method, dubbed deep hashing
targeted attack (DHTA), to study the targeted attack on such retrieval.
Specifically, we first formulate the targeted attack as a point-to-set
optimization, which minimizes the average distance between the hash code of an
adversarial example and those of a set of objects with the target label. Then
we design a novel component-voting scheme to obtain an anchor code as the
representative of the set of hash codes of objects with the target label, whose
optimality guarantee is also theoretically derived. To balance the performance
and perceptibility, we propose to minimize the Hamming distance between the
hash code of the adversarial example and the anchor code under the
restriction on the perturbation. Extensive experiments verify
that DHTA is effective in attacking both deep hashing based image retrieval and
video retrieval. Comment: Accepted by ECCV 2020 as Ora
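
One plausible reading of the component-voting scheme is a bitwise majority vote over the hash codes of the target-label objects. That reading is an assumption of this sketch and is not confirmed by the abstract. A bitwise majority does minimize the total (and hence average) Hamming distance to the set, because each bit can be chosen independently. The names `anchor_code` and `target_codes` and the toy codes are hypothetical.

```python
def anchor_code(codes):
    """Bitwise majority vote over equal-length codes with entries in {-1, +1}."""
    num_bits = len(codes[0])
    anchor = []
    for j in range(num_bits):
        column_sum = sum(code[j] for code in codes)
        # Ties are broken toward +1; this tie rule is an arbitrary choice.
        anchor.append(1 if column_sum >= 0 else -1)
    return anchor


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(u != v for u, v in zip(a, b))


# Toy hash codes of three objects that share the target label.
target_codes = [
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, 1, -1, 1],
]

anchor = anchor_code(target_codes)
total = sum(hamming(anchor, c) for c in target_codes)
print(anchor, total)  # [1, 1, -1, 1] 2
```

An attack in the spirit of the abstract would then perturb the query, within a bounded budget, so that its hash code moves toward `anchor`; that optimization step is not shown here.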
Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval
Unsupervised hashing can desirably support scalable content-based image
retrieval (SCBIR) for its appealing advantages of semantic label independence,
memory and search efficiency. However, the learned hash codes are embedded with
limited discriminative semantics due to the intrinsic limitation of image
representation. To address the problem, in this paper, we propose a novel
hashing approach, dubbed as \emph{Discrete Semantic Transfer Hashing} (DSTH).
The key idea is to \emph{directly} augment the semantics of discrete image hash
codes by exploring auxiliary contextual modalities. To this end, a unified
hashing framework is formulated to simultaneously preserve visual similarities
of images and perform semantic transfer from contextual modalities. Further, to
guarantee direct semantic transfer and avoid information loss, we explicitly
impose the discrete constraint, bit-uncorrelation constraint and bit-balance
constraint on hash codes. A novel and effective discrete optimization method
based on augmented Lagrangian multiplier is developed to iteratively solve the
optimization problem. The whole learning process has linear computation
complexity and desirable scalability. Experiments on three benchmark datasets
demonstrate the superiority of DSTH compared with several state-of-the-art
approaches.
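
The bit-balance and bit-uncorrelation constraints named in this abstract are commonly written, for a code matrix B with n rows and entries in {-1, +1}, as "every column sums to zero" and "B^T B = n I". The sketch below only measures how well a given code matrix satisfies those two conditions. It assumes this standard formulation and does not reproduce the DSTH optimization; the function names are hypothetical.

```python
def bit_balance(codes):
    """Mean of each bit over all samples; 0.0 means the bit is perfectly balanced."""
    n = len(codes)
    num_bits = len(codes[0])
    return [sum(row[j] for row in codes) / n for j in range(num_bits)]


def bit_correlation(codes):
    """Normalized Gram matrix B^T B / n; the identity means the bits are uncorrelated."""
    n = len(codes)
    num_bits = len(codes[0])
    return [
        [sum(row[i] * row[j] for row in codes) / n for j in range(num_bits)]
        for i in range(num_bits)
    ]


# Four samples with 2-bit codes in {-1, +1} that satisfy both constraints.
codes = [[1, 1], [1, -1], [-1, 1], [-1, -1]]

print(bit_balance(codes))      # [0.0, 0.0]
print(bit_correlation(codes))  # [[1.0, 0.0], [0.0, 1.0]]
```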
Triplet-Based Deep Hashing Network for Cross-Modal Retrieval
Given the benefits of its low storage requirements and high retrieval
efficiency, hashing has recently received increasing attention. In
particular, cross-modal hashing has been widely and successfully used in
multimedia similarity search applications. However, almost all existing methods
employing cross-modal hashing cannot obtain powerful hash codes because they
ignore the relative similarity between heterogeneous data, which contains
richer semantic information. This leads to unsatisfactory retrieval performance.
In this paper, we propose a triplet-based deep hashing (TDH) network for
cross-modal retrieval. First, we utilize triplet labels, which describe
the relative relationships among three instances, as supervision in order to
capture more general semantic correlations between cross-modal instances. We
then establish a loss function from the inter-modal view and the intra-modal
view to boost the discriminative abilities of the hash codes. Finally, graph
regularization is introduced into our proposed TDH method to preserve the
original semantic similarity between hash codes in Hamming space. Experimental
results show that our proposed method outperforms several state-of-the-art
approaches on two popular cross-modal datasets.
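
The abstract does not give the form of its loss, so the sketch below uses a generic hinge-style triplet loss on Hamming distances as a stand-in: the anchor should be closer to the positive than to the negative by at least a margin. The margin value, the function names, and the toy codes are assumptions of this example. A trainable version would operate on real-valued relaxations of the codes rather than on hard bits.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=2):
    """Hinge loss: zero once the negative is at least `margin` bits farther than the positive."""
    return max(0, margin + hamming(anchor, positive) - hamming(anchor, negative))


# Inter-modal toy case: an image code as anchor, text codes as positive and negatives.
image_anchor = [1, 0, 1, 1]
text_positive = [1, 0, 1, 0]       # distance 1 from the anchor
text_hard_negative = [0, 1, 1, 1]  # distance 2 from the anchor
text_easy_negative = [0, 1, 0, 0]  # distance 4 from the anchor

print(triplet_loss(image_anchor, text_positive, text_hard_negative))  # 1
print(triplet_loss(image_anchor, text_positive, text_easy_negative))  # 0
```

The same function covers the intra-modal view when all three codes come from one modality.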
Unsupervised Multi-modal Hashing for Cross-modal retrieval
With its advantages of low storage cost and high efficiency, hash learning
has received much attention in the domain of Big Data. In this paper, we
propose a novel unsupervised hash learning method to cope with the open
problem of directly preserving the manifold structure by hashing. To address this
problem, both the semantic correlation in textual space and the locally
geometric structure in the visual space are explored simultaneously in our
framework. Besides, the $\ell_{2,1}$-norm constraint is imposed on the projection
matrices to learn the discriminative hash function for each modality. Extensive
experiments are performed to evaluate the proposed method on the three publicly
available datasets and the experimental results show that our method can
achieve superior performance over the state-of-the-art methods. Comment: 4 pages, 4 figure
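
For reference, the norm constraint on the projection matrices can be illustrated with a short computation. The sketch assumes the common row-wise convention, in which the norm is the sum of the Euclidean norms of the rows; some authors define it over columns instead, and the abstract does not say which is used. The matrix `W` is a made-up example.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean (l2) norms of the rows of a matrix."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


# Toy projection matrix: the all-zero row contributes nothing to the norm.
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [5.0, 12.0],
]

print(l21_norm(W))  # 5 + 0 + 13 = 18.0
```

Penalizing this quantity tends to drive whole rows to zero, which is why it is often used to select a small set of discriminative features.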
Supervised Discrete Hashing with Relaxation
Data-dependent hashing has recently attracted attention due to being able to
support efficient retrieval and storage of high-dimensional data such as
documents, images, and videos. In this paper, we propose a novel learning-based
hashing method called "Supervised Discrete Hashing with Relaxation" (SDHR)
based on "Supervised Discrete Hashing" (SDH). SDH uses ordinary least squares
regression and traditional zero-one matrix encoding of class label information
as the regression target (code words), thus fixing the regression target. In
SDHR, the regression target is instead optimized. The optimized regression
target matrix satisfies a large margin constraint for correct classification of
each example. Compared with SDH, which uses the traditional zero-one matrix,
SDHR utilizes the learned regression target matrix and, therefore, more
accurately measures the classification error of the regression model and is
more flexible. As expected, SDHR generally outperforms SDH. Experimental
results on two large-scale image datasets (CIFAR-10 and MNIST) and a
large-scale and challenging face dataset (FRGC) demonstrate the effectiveness
and efficiency of SDHR.
Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging
Image retagging aims to improve tag quality of social images by refining
their original tags or assigning new high-quality tags. Recent approaches
simultaneously explore visual, user and tag information to improve the
performance of image retagging by constructing and exploring an image-tag-user
graph. However, such methods will become computationally infeasible with the
rapidly increasing number of images, tags and users. It has been proven that
Anchor Graph Regularization (AGR) can significantly accelerate large-scale
graph learning models by exploring only a small number of anchor points.
Inspired by this, we propose a novel Social anchor-Unit GrAph Regularized
Tensor Completion (SUGAR-TC) method to effectively refine the tags of social
images, which is insensitive to the scale of the applied data. First, we
construct an anchor-unit graph across multiple domains (e.g., image and user
domains) rather than a traditional anchor graph in a single domain. Second, a
tensor completion based on SUGAR is implemented on the original image-tag-user
tensor to refine the tags of the anchor images. Third, we efficiently assign
tags to non-anchor images by leveraging the relationship between the non-anchor
images and the anchor units. Experimental results on a real-world social image
database demonstrate the effectiveness of SUGAR-TC, which outperforms several
related methods.
Semi-supervised Multimodal Hashing
Retrieving nearest neighbors across correlated data in multiple modalities,
such as image-text pairs on Facebook and video-tag pairs on YouTube, has become
a challenging task due to the huge amount of data. Multimodal hashing methods
that embed data into binary codes can boost retrieval speed and reduce
storage requirements. Unsupervised multimodal hashing methods are usually
inferior to supervised ones, while supervised ones require too much
manually labeled data, so the method proposed in this paper uses only a part of
the labels to design a semi-supervised multimodal hashing method. It first computes
the transformation matrices for data matrices and label matrix. Then, with
these transformation matrices, fuzzy logic is introduced to estimate a label
matrix for unlabeled data. Finally, it uses the estimated label matrix to learn
hashing functions for data in each modality to generate a unified binary code
matrix. Experiments show that, with 50% of the labels, the proposed
semi-supervised method reaches mid-range performance among the compared
supervised methods and, with 90% of the labels, approaches the performance of
the best supervised method. With only 10% of the labels, it can still compete
with the worst of the compared supervised methods.
SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval
Hashing methods have been widely used for efficient similarity retrieval on
large-scale image databases. Traditional hashing methods learn hash functions to
generate binary codes from hand-crafted features, which achieve limited
accuracy since the hand-crafted features cannot optimally represent the image
content and preserve the semantic similarity. Recently, several deep hashing
methods have shown better performance because the deep architectures generate
more discriminative feature representations. However, these deep hashing
methods are mainly designed for supervised scenarios, which only exploit the
semantic similarity information, but ignore the underlying data structures. In
this paper, we propose the semi-supervised deep hashing (SSDH) approach, to
perform more effective hash function learning by simultaneously preserving
semantic similarity and underlying data structures. The main contributions are
as follows: (1) We propose a semi-supervised loss to jointly minimize the
empirical error on labeled data, as well as the embedding error on both labeled
and unlabeled data, which can preserve the semantic similarity and capture the
meaningful neighbors on the underlying data structures for effective hashing.
(2) A semi-supervised deep hashing network is designed to extensively exploit
both labeled and unlabeled data, in which we propose an online graph
construction method to benefit from the evolving deep features during training
to better capture semantic neighbors. To the best of our knowledge, the
proposed deep network is the first deep hashing method that can perform hash
code learning and feature learning simultaneously in a semi-supervised fashion.
Experimental results on 5 widely-used datasets show that our proposed approach
outperforms the state-of-the-art hashing methods. Comment: 14 pages, accepted by IEEE Transactions on Circuits and Systems for
Video Technolog
DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs
Due to the high storage and search efficiency, hashing has become prevalent
for large-scale similarity search. Particularly, deep hashing methods have
greatly improved the search performance under supervised scenarios. In
contrast, unsupervised deep hashing models can hardly achieve satisfactory
performance due to the lack of reliable supervisory similarity signals. To
address this issue, we propose a novel deep unsupervised hashing model, dubbed
DistillHash, which can learn a distilled data set consisting of data pairs
that have confident similarity signals. Specifically, we investigate the
relationship between the initial noisy similarity signals learned from local
structures and the semantic similarity labels assigned by a Bayes optimal
classifier. We show that under a mild assumption, some data pairs, of which
labels are consistent with those assigned by the Bayes optimal classifier, can
be potentially distilled. Inspired by this fact, we design a simple yet
effective strategy to distill data pairs automatically and further adopt a
Bayesian learning framework to learn hash functions from the distilled data
set. Extensive experimental results on three widely used benchmark datasets
show that the proposed DistillHash consistently achieves
state-of-the-art search performance.