232 research outputs found
Unsupervised Multi-modal Hashing for Cross-modal retrieval
With the advantage of low storage cost and high efficiency, hashing learning
has received much attention in the domain of Big Data. In this paper, we
propose a novel unsupervised hashing learning method to cope with this open
problem to directly preserve the manifold structure by hashing. To address this
problem, both the semantic correlation in textual space and the locally
geometric structure in the visual space are explored simultaneously in our
framework. Besides, the `2;1-norm constraint is imposed on the projection
matrices to learn the discriminative hash function for each modality. Extensive
experiments are performed to evaluate the proposed method on the three publicly
available datasets and the experimental results show that our method can
achieve superior performance over the state-of-the-art methods.Comment: 4 pages, 4 figure
A Survey on Learning to Hash
Nearest neighbor search is a problem of finding the data points from the
database such that the distances from them to the query point are the smallest.
Learning to hash is one of the major solutions to this problem and has been
widely studied recently. In this paper, we present a comprehensive survey of
the learning to hash algorithms, categorize them according to the manners of
preserving the similarities into: pairwise similarity preserving, multiwise
similarity preserving, implicit similarity preserving, as well as quantization,
and discuss their relations. We separate quantization from pairwise similarity
preserving as the objective function is very different though quantization, as
we show, can be derived from preserving the pairwise similarities. In addition,
we present the evaluation protocols, and the general performance analysis, and
point out that the quantization algorithms perform superiorly in terms of
search accuracy, search time cost, and space cost. Finally, we introduce a few
emerging topics.Comment: To appear in IEEE Transactions On Pattern Analysis and Machine
Intelligence (TPAMI
Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval
Free-hand sketch-based image retrieval (SBIR) is a specific cross-view
retrieval task, in which queries are abstract and ambiguous sketches while the
retrieval database is formed with natural images. Work in this area mainly
focuses on extracting representative and shared features for sketches and
natural images. However, these can neither cope well with the geometric
distortion between sketches and images nor be feasible for large-scale SBIR due
to the heavy continuous-valued distance computation. In this paper, we speed up
SBIR by introducing a novel binary coding method, named \textbf{Deep Sketch
Hashing} (DSH), where a semi-heterogeneous deep architecture is proposed and
incorporated into an end-to-end binary coding framework. Specifically, three
convolutional neural networks are utilized to encode free-hand sketches,
natural images and, especially, the auxiliary sketch-tokens which are adopted
as bridges to mitigate the sketch-image geometric distortion. The learned DSH
codes can effectively capture the cross-view similarities as well as the
intrinsic semantic correlations between different categories. To the best of
our knowledge, DSH is the first hashing work specifically designed for
category-level SBIR with an end-to-end deep architecture. The proposed DSH is
comprehensively evaluated on two large-scale datasets of TU-Berlin Extension
and Sketchy, and the experiments consistently show DSH's superior SBIR
accuracies over several state-of-the-art methods, while achieving significantly
reduced retrieval time and memory footprint.Comment: This paper will appear as a spotlight paper in CVPR201
Shared Predictive Cross-Modal Deep Quantization
With explosive growth of data volume and ever-increasing diversity of data
modalities, cross-modal similarity search, which conducts nearest neighbor
search across different modalities, has been attracting increasing interest.
This paper presents a deep compact code learning solution for efficient
cross-modal similarity search. Many recent studies have proven that
quantization-based approaches perform generally better than hashing-based
approaches on single-modal similarity search. In this paper, we propose a deep
quantization approach, which is among the early attempts of leveraging deep
neural networks into quantization-based cross-modal similarity search. Our
approach, dubbed shared predictive deep quantization (SPDQ), explicitly
formulates a shared subspace across different modalities and two private
subspaces for individual modalities, and representations in the shared subspace
and the private subspaces are learned simultaneously by embedding them to a
reproducing kernel Hilbert space, where the mean embedding of different
modality distributions can be explicitly compared. In addition, in the shared
subspace, a quantizer is learned to produce the semantics preserving compact
codes with the help of label alignment. Thanks to this novel network
architecture in cooperation with supervised quantization training, SPDQ can
preserve intramodal and intermodal similarities as much as possible and greatly
reduce quantization error. Experiments on two popular benchmarks corroborate
that our approach outperforms state-of-the-art methods
A Decade Survey of Content Based Image Retrieval using Deep Learning
The content based image retrieval aims to find the similar images from a
large scale dataset against a query image. Generally, the similarity between
the representative features of the query image and dataset images is used to
rank the images for retrieval. In early days, various hand designed feature
descriptors have been investigated based on the visual cues such as color,
texture, shape, etc. that represent the images. However, the deep learning has
emerged as a dominating alternative of hand-designed feature engineering from a
decade. It learns the features automatically from the data. This paper presents
a comprehensive survey of deep learning based developments in the past decade
for content based image retrieval. The categorization of existing
state-of-the-art methods from different perspectives is also performed for
greater understanding of the progress. The taxonomy used in this survey covers
different supervision, different networks, different descriptor type and
different retrieval type. A performance analysis is also performed using the
state-of-the-art methods. The insights are also presented for the benefit of
the researchers to observe the progress and to make the best choices. The
survey presented in this paper will help in further research progress in image
retrieval using deep learning
Online Hashing
Although hash function learning algorithms have achieved great success in
recent years, most existing hash models are off-line, which are not suitable
for processing sequential or online data. To address this problem, this work
proposes an online hash model to accommodate data coming in stream for online
learning. Specifically, a new loss function is proposed to measure the
similarity loss between a pair of data samples in hamming space. Then, a
structured hash model is derived and optimized in a passive-aggressive way.
Theoretical analysis on the upper bound of the cumulative loss for the proposed
online hash model is provided. Furthermore, we extend our online hashing from a
single-model to a multi-model online hashing that trains multiple models so as
to retain diverse online hashing models in order to avoid biased update. The
competitive efficiency and effectiveness of the proposed online hash models are
verified through extensive experiments on several large-scale datasets as
compared to related hashing methods.Comment: To appear in IEEE Transactions on Neural Networks and Learning
Systems (DOI: 10.1109/TNNLS.2017.2689242
Improved Deep Hashing with Soft Pairwise Similarity for Multi-label Image Retrieval
Hash coding has been widely used in the approximate nearest neighbor search
for large-scale image retrieval. Recently, many deep hashing methods have been
proposed and shown largely improved performance over traditional
feature-learning-based methods. Most of these methods examine the pairwise
similarity on the semantic-level labels, where the pairwise similarity is
generally defined in a hard-assignment way. That is, the pairwise similarity is
'1' if they share no less than one class label and '0' if they do not share
any. However, such similarity definition cannot reflect the similarity ranking
for pairwise images that hold multiple labels. In this paper, a new deep
hashing method is proposed for multi-label image retrieval by re-defining the
pairwise similarity into an instance similarity, where the instance similarity
is quantified into a percentage based on the normalized semantic labels. Based
on the instance similarity, a weighted cross-entropy loss and a minimum mean
square error loss are tailored for loss-function construction, and are
efficiently used for simultaneous feature learning and hash coding. Experiments
on three popular datasets demonstrate that, the proposed method outperforms the
competing methods and achieves the state-of-the-art performance in multi-label
image retrieval
Semantic Cluster Unary Loss for Efficient Deep Hashing
Hashing method maps similar data to binary hashcodes with smaller hamming
distance, which has received a broad attention due to its low storage cost and
fast retrieval speed. With the rapid development of deep learning, deep hashing
methods have achieved promising results in efficient information retrieval.
Most of the existing deep hashing methods adopt pairwise or triplet losses to
deal with similarities underlying the data, but the training is difficult and
less efficient because data pairs and triplets are involved.
To address these issues, we propose a novel deep hashing algorithm with unary
loss which can be trained very efficiently. We first of all introduce a Unary
Upper Bound of the traditional triplet loss, thus reducing the complexity to
and bridging the classification-based unary loss and the triplet loss.
Second, we propose a novel Semantic Cluster Deep Hashing (SCDH) algorithm by
introducing a modified Unary Upper Bound loss, named Semantic Cluster Unary
Loss (SCUL). The resultant hashcodes form several compact clusters, which means
hashcodes in the same cluster have similar semantic information. We also
demonstrate that the proposed SCDH is easy to be extended to semi-supervised
settings by incorporating the state-of-the-art semi-supervised learning
algorithms. Experiments on large-scale datasets show that the proposed method
is superior to state-of-the-art hashing algorithms.Comment: 13 page
Deep Multi-View Enhancement Hashing for Image Retrieval
Hashing is an efficient method for nearest neighbor search in large-scale
data space by embedding high-dimensional feature descriptors into a similarity
preserving Hamming space with a low dimension. However, large-scale high-speed
retrieval through binary code has a certain degree of reduction in retrieval
accuracy compared to traditional retrieval methods. We have noticed that
multi-view methods can well preserve the diverse characteristics of data.
Therefore, we try to introduce the multi-view deep neural network into the hash
learning field, and design an efficient and innovative retrieval model, which
has achieved a significant improvement in retrieval performance. In this paper,
we propose a supervised multi-view hash model which can enhance the multi-view
information through neural networks. This is a completely new hash learning
method that combines multi-view and deep learning methods. The proposed method
utilizes an effective view stability evaluation method to actively explore the
relationship among views, which will affect the optimization direction of the
entire network. We have also designed a variety of multi-data fusion methods in
the Hamming space to preserve the advantages of both convolution and
multi-view. In order to avoid excessive computing resources on the enhancement
procedure during retrieval, we set up a separate structure called memory
network which participates in training together. The proposed method is
systematically evaluated on the CIFAR-10, NUS-WIDE and MS-COCO datasets, and
the results show that our method significantly outperforms the state-of-the-art
single-view and multi-view hashing methods
Rank-Consistency Deep Hashing for Scalable Multi-Label Image Search
As hashing becomes an increasingly appealing technique for large-scale image
retrieval, multi-label hashing is also attracting more attention for the
ability to exploit multi-level semantic contents. In this paper, we propose a
novel deep hashing method for scalable multi-label image search. Unlike
existing approaches with conventional objectives such as contrast and triplet
losses, we employ a rank list, rather than pairs or triplets, to provide
sufficient global supervision information for all the samples. Specifically, a
new rank-consistency objective is applied to align the similarity orders from
two spaces, the original space and the hamming space. A powerful loss function
is designed to penalize the samples whose semantic similarity and hamming
distance are mismatched in two spaces. Besides, a multi-label softmax
cross-entropy loss is presented to enhance the discriminative power with a
concise formulation of the derivative function. In order to manipulate the
neighborhood structure of the samples with different labels, we design a
multi-label clustering loss to cluster the hashing vectors of the samples with
the same labels by reducing the distances between the samples and their
multiple corresponding class centers. The state-of-the-art experimental results
achieved on three public multi-label datasets, MIRFLICKR-25K, IAPRTC12 and
NUS-WIDE, demonstrate the effectiveness of the proposed method
- …