Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval
Unsupervised hashing is well suited to scalable content-based image retrieval
(SCBIR) thanks to its independence from semantic labels and its memory and
search efficiency. However, the learned hash codes carry limited discriminative
semantics due to the intrinsic limitations of image representation. To address
this problem, we propose a novel hashing approach, dubbed \emph{Discrete
Semantic Transfer Hashing} (DSTH).
The key idea is to \emph{directly} augment the semantics of discrete image hash
codes by exploring auxiliary contextual modalities. To this end, a unified
hashing framework is formulated to simultaneously preserve visual similarities
of images and perform semantic transfer from contextual modalities. Further, to
guarantee direct semantic transfer and avoid information loss, we explicitly
impose the discrete constraint, bit-uncorrelation constraint and bit-balance
constraint on hash codes. A novel and effective discrete optimization method
based on augmented Lagrangian multiplier is developed to iteratively solve the
optimization problem. The whole learning process has linear computation
complexity and desirable scalability. Experiments on three benchmark datasets
demonstrate the superiority of DSTH compared with several state-of-the-art
approaches.
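The bit-uncorrelation and bit-balance constraints mentioned above have a standard matrix form; a minimal sketch, assuming {-1,+1} codes, with penalty functions named here purely for illustration:

```python
import numpy as np

def balance_penalty(B):
    """Bit-balance: each bit should split the dataset evenly,
    i.e. the column sums of the {-1,+1} code matrix should be near zero."""
    return float(np.linalg.norm(B.sum(axis=0)) ** 2)

def uncorrelation_penalty(B):
    """Bit-uncorrelation: distinct bits should be uncorrelated,
    i.e. B^T B should be close to n * I for n samples."""
    n, k = B.shape
    return float(np.linalg.norm(B.T @ B - n * np.eye(k)) ** 2)

# A 4-sample, 2-bit code matrix that satisfies both constraints exactly
B = np.array([[1, 1],
              [1, -1],
              [-1, 1],
              [-1, -1]])
print(balance_penalty(B), uncorrelation_penalty(B))  # 0.0 0.0
```

In a learning framework such as the one described, terms like these would be enforced as hard constraints rather than soft penalties, but the matrix conditions they measure are the same.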
A Comprehensive Survey on Cross-modal Retrieval
In recent years, cross-modal retrieval has drawn much attention due to the
rapid growth of multimodal data. It takes one type of data as the query to
retrieve relevant data of another type. For example, a user can use a text to
retrieve relevant pictures or videos. Since the query and its retrieved results
can be of different modalities, how to measure the content similarity between
different modalities of data remains a challenge. Various methods have been
proposed to deal with such a problem. In this paper, we first review a number
of representative methods for cross-modal retrieval and classify them into two
main groups: 1) real-valued representation learning, and 2) binary
representation learning. Real-valued representation learning methods aim to
learn real-valued common representations for different modalities of data. To
speed up the cross-modal retrieval, a number of binary representation learning
methods are proposed to map different modalities of data into a common Hamming
space. Then, we introduce several multimodal datasets in the community, and
show the experimental results on two commonly used multimodal datasets. The
comparison reveals the characteristic of different kinds of cross-modal
retrieval methods, which is expected to benefit both practical applications and
future research. Finally, we discuss open problems and future research
directions.
Comment: 20 pages, 11 figures, 9 tables
Semantic Cluster Unary Loss for Efficient Deep Hashing
Hashing maps similar data to binary hash codes with small Hamming distance,
and has received broad attention due to its low storage cost and fast
retrieval speed. With the rapid development of deep learning, deep hashing
methods have achieved promising results in efficient information retrieval.
Most existing deep hashing methods adopt pairwise or triplet losses to capture
the similarities underlying the data, but training is difficult and less
efficient because data pairs and triplets are involved.
To address these issues, we propose a novel deep hashing algorithm with a
unary loss that can be trained very efficiently. We first introduce a Unary
Upper Bound on the traditional triplet loss, which reduces the training
complexity and bridges the classification-based unary loss and the triplet loss.
Second, we propose a novel Semantic Cluster Deep Hashing (SCDH) algorithm by
introducing a modified Unary Upper Bound loss, named Semantic Cluster Unary
Loss (SCUL). The resultant hashcodes form several compact clusters, which means
hashcodes in the same cluster have similar semantic information. We also
demonstrate that the proposed SCDH can easily be extended to semi-supervised
settings by incorporating state-of-the-art semi-supervised learning
algorithms. Experiments on large-scale datasets show that the proposed method
is superior to state-of-the-art hashing algorithms.
Comment: 13 pages
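The low storage cost and fast retrieval speed that motivate this line of work come from comparing compact binary codes by Hamming distance; a minimal, generic sketch (the database codes and names below are hypothetical):

```python
def hamming_dist(a, b):
    """Hamming distance between two binary codes packed as Python ints:
    XOR the codes, then count the differing bits."""
    return bin(a ^ b).count("1")

# Hypothetical 8-bit codes for a tiny image database
db = {"img_a": 0b10110010, "img_b": 0b10110011, "img_c": 0b01001100}
query = 0b10110010

# Rank database items by Hamming distance to the query code
ranked = sorted(db, key=lambda k: hamming_dist(query, db[k]))
print(ranked)  # most similar code first: ['img_a', 'img_b', 'img_c']
```

In practice the XOR-and-popcount step maps to single machine instructions, which is what makes Hamming ranking over millions of codes fast.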
DeepHash: Getting Regularization, Depth and Fine-Tuning Right
This work focuses on representing very high-dimensional global image
descriptors using very compact 64-1024 bit binary hashes for instance
retrieval. We propose DeepHash: a hashing scheme based on deep networks. Key to
making DeepHash work at extremely low bitrates are three important
considerations -- regularization, depth and fine-tuning -- each requiring
solutions specific to the hashing problem. In-depth evaluation shows that our
scheme consistently outperforms state-of-the-art methods across all data sets
for both Fisher Vectors and Deep Convolutional Neural Network features, by up
to 20 percent over other schemes. The retrieval performance with 256-bit
hashes is close to that of the uncompressed floating-point features, a
remarkable 512-times compression.
Learning to Hash for Indexing Big Data - A Survey
The explosive growth in big data has attracted much attention in designing
efficient indexing and search methods recently. In many critical applications
such as large-scale search and pattern matching, finding the nearest neighbors
to a query is a fundamental research problem. However, the straightforward
solution using exhaustive comparison is infeasible due to the prohibitive
computational complexity and memory requirement. In response, Approximate
Nearest Neighbor (ANN) search based on hashing techniques has become popular
due to its promising performance in both efficiency and accuracy. Prior
randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore
data-independent hash functions with random projections or permutations.
Although they enjoy elegant theoretical guarantees on search quality in certain
metric spaces, randomized hashing methods have been shown to be insufficient in
many real-world applications. As a remedy, new approaches incorporating
data-driven learning methods in development of advanced hash functions have
emerged. Such learning to hash methods exploit information such as data
distributions or class labels when optimizing the hash codes or functions.
Importantly, the learned hash codes are able to preserve, in the hash code
space, the proximity of data that are neighbors in the original feature space. The
goal of this paper is to provide readers with systematic understanding of
insights, pros and cons of the emerging techniques. We provide a comprehensive
survey of the learning to hash framework and representative techniques of
various types, including unsupervised, semi-supervised, and supervised. In
addition, we also summarize recent hashing approaches utilizing the deep
learning models. Finally, we discuss the future direction and trends of
research in this area.
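The data-independent LSH scheme mentioned in this survey can be sketched as sign-of-random-projection hashing; dimensions and bit counts below are chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_hash(x, W):
    """Data-independent LSH: the sign of each random projection gives one bit.
    W holds one random hyperplane normal per output bit."""
    return (x @ W > 0).astype(np.uint8)

d, n_bits = 16, 8
W = rng.standard_normal((d, n_bits))  # random hyperplanes, fixed per index
x = rng.standard_normal(d)
y = -x                                # the antipodal point of x

print(lsh_hash(x, W))
print(lsh_hash(y, W))                 # every bit flips for the opposite vector
```

Because the hyperplanes are drawn independently of the data, no training is needed, but, as the survey notes, this is also why learned (data-dependent) hash functions can outperform LSH at short code lengths.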
Collaborative Learning for Extremely Low Bit Asymmetric Hashing
Hashing techniques are in great demand for a wide range of real-world
applications such as image retrieval and network compression. Nevertheless,
existing approaches could hardly guarantee a satisfactory performance with the
extremely low-bit (e.g., 4-bit) hash codes due to the severe information loss
and the shrink of the discrete solution space. In this paper, we propose a
novel \textit{Collaborative Learning} strategy that is tailored for generating
high-quality low-bit hash codes. The core idea is to jointly distill
bit-specific and informative representations for a group of pre-defined code
lengths. The learning of short hash codes among the group can benefit from the
manifold shared with other long codes, where multiple views from different hash
codes provide the supplementary guidance and regularization, making the
convergence faster and more stable. To achieve that, an asymmetric hashing
framework with two variants of multi-head embedding structures is derived,
termed as Multi-head Asymmetric Hashing (MAH), leading to great efficiency of
training and querying. Extensive experiments on three benchmark datasets have
been conducted to verify the superiority of the proposed MAH, and have shown
that the 8-bit hash codes generated by MAH achieve a mean average precision
(MAP) score on the CIFAR-10 dataset that significantly surpasses the
performance of the 48-bit codes produced by state-of-the-art methods in image
retrieval tasks.
Supervised Discrete Hashing with Relaxation
Data-dependent hashing has recently attracted attention due to its ability to
support efficient retrieval and storage of high-dimensional data such as
documents, images, and videos. In this paper, we propose a novel learning-based
hashing method called "Supervised Discrete Hashing with Relaxation" (SDHR)
based on "Supervised Discrete Hashing" (SDH). SDH uses ordinary least squares
regression and traditional zero-one matrix encoding of class label information
as the regression target (code words), thus fixing the regression target. In
SDHR, the regression target is instead optimized. The optimized regression
target matrix satisfies a large margin constraint for correct classification of
each example. Compared with SDH, which uses the traditional zero-one matrix,
SDHR utilizes the learned regression target matrix and, therefore, more
accurately measures the classification error of the regression model and is
more flexible. As expected, SDHR generally outperforms SDH. Experimental
results on two large-scale image datasets (CIFAR-10 and MNIST) and a
large-scale and challenging face dataset (FRGC) demonstrate the effectiveness
and efficiency of SDHR.
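The regression step described above, mapping binary codes onto zero-one class targets by least squares, can be sketched on a toy instance (a small ridge term is added for numerical stability; SDHR would additionally optimize the target matrix Y rather than fixing it):

```python
import numpy as np

def fit_regression(B, Y, lam=1.0):
    """Ridge-regularized least squares from binary codes B to label targets Y.
    SDH fixes Y as a zero-one (one-hot) matrix; SDHR learns the targets."""
    k = B.shape[1]
    return np.linalg.solve(B.T @ B + lam * np.eye(k), B.T @ Y)

# Toy setup: 4 samples, 3-bit {-1,+1} codes, 2 classes as one-hot targets
B = np.array([[1., 1., -1.],
              [1., -1., -1.],
              [-1., 1., 1.],
              [-1., -1., 1.]])
Y = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])

W = fit_regression(B, Y)
preds = (B @ W).argmax(axis=1)
print(preds)  # [0 0 1 1]: the regression recovers the class structure
```

The helper name and the toy data are illustrative only; the actual SDH/SDHR formulations alternate this regression step with a discrete update of the codes themselves.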
Recent Advance in Content-based Image Retrieval: A Literature Survey
The explosive increase and ubiquitous accessibility of visual data on the Web
have led to the prosperity of research activity in image search or retrieval.
When visual content is ignored as a ranking cue, text-based search techniques
applied to visual retrieval may suffer from inconsistency between the text
words and the visual content. Content-based image retrieval (CBIR), which
makes use of representations of visual content to identify relevant images,
has attracted sustained attention over the past two decades. Such a problem is
challenging due to the intention gap and the semantic gap problems. Numerous
techniques have been developed for content-based image retrieval in the last
decade. The purpose of this paper is to categorize and evaluate those
algorithms proposed during the period of 2003 to 2016. We conclude with several
promising directions for future research.
Comment: 22 pages
Improved Deep Hashing with Soft Pairwise Similarity for Multi-label Image Retrieval
Hash coding has been widely used in the approximate nearest neighbor search
for large-scale image retrieval. Recently, many deep hashing methods have been
proposed and shown largely improved performance over traditional
feature-learning-based methods. Most of these methods examine the pairwise
similarity on the semantic-level labels, where the pairwise similarity is
generally defined in a hard-assignment way. That is, the pairwise similarity is
'1' if the two images share at least one class label and '0' if they share
none. However, such a similarity definition cannot reflect the similarity ranking
for pairwise images that hold multiple labels. In this paper, a new deep
hashing method is proposed for multi-label image retrieval by re-defining the
pairwise similarity into an instance similarity, where the instance similarity
is quantified into a percentage based on the normalized semantic labels. Based
on the instance similarity, a weighted cross-entropy loss and a minimum mean
square error loss are tailored for loss-function construction, and are
efficiently used for simultaneous feature learning and hash coding. Experiments
on three popular datasets demonstrate that, the proposed method outperforms the
competing methods and achieves the state-of-the-art performance in multi-label
image retrieval.
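The hard versus soft similarity distinction above can be illustrated with one plausible instantiation of the instance similarity, here a cosine between multi-hot label vectors; the paper's exact normalization may differ:

```python
import numpy as np

def soft_similarity(l1, l2):
    """Instance-level similarity in [0, 1] from multi-hot label vectors
    (cosine form; one plausible normalization, hypothetical here)."""
    denom = np.linalg.norm(l1) * np.linalg.norm(l2)
    return float(l1 @ l2 / denom) if denom else 0.0

a = np.array([1, 1, 0, 0])  # labeled with classes {0, 1}
b = np.array([1, 1, 1, 0])  # labeled with classes {0, 1, 2}
c = np.array([0, 0, 0, 1])  # labeled with class {3}

print(round(soft_similarity(a, b), 3))  # partial label overlap -> 0.816
print(soft_similarity(a, c))            # no shared label -> 0.0
```

A hard pairwise definition would assign both (a, b) and (a, a) a similarity of 1, discarding exactly the ranking information among multi-label pairs that the abstract argues is needed.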
Deep LDA Hashing
Conventional supervised hashing methods based on classification do not
entirely meet the requirements of the hashing technique, but Linear
Discriminant Analysis (LDA) does. In this paper, we propose to optimize a
revised LDA objective over deep networks to learn efficient hash codes in a
truly end-to-end fashion. However, naively optimizing the deep network w.r.t.
the LDA objective requires a complicated eigenvalue decomposition within each
mini-batch in every epoch. In this work, the revised LDA objective
is transformed into a simple least square problem, which naturally overcomes
the intractable problems and can be easily solved by the off-the-shelf
optimizer. Such deep extension can also overcome the weakness of LDA Hashing in
the limited linear projection and feature learning. Extensive experiments are
conducted on three benchmark datasets. The proposed Deep LDA Hashing shows
nearly 70 points of improvement over the conventional one on the CIFAR-10
dataset. It also beats several state-of-the-art methods on various metrics.
Comment: 10 pages, 3 figures