Fusion-supervised Deep Cross-modal Hashing
Deep hashing has recently received attention in cross-modal retrieval for its
impressive retrieval efficiency and accuracy. However, existing hashing methods for
cross-modal retrieval cannot fully capture the correlation among heterogeneous
multi-modal data or exploit semantic information. In this paper, we propose a novel
\emph{Fusion-supervised Deep Cross-modal Hashing} (FDCH) approach. First,
FDCH learns unified binary codes through a fusion hash network with paired
samples as input, which effectively enhances the modeling of correlations
among heterogeneous multi-modal data. Then, these high-quality unified hash codes
further supervise the training of the modality-specific hash networks for
encoding out-of-sample queries. Meanwhile, both pair-wise similarity
information and classification information are embedded in the hash networks
under a one-stream framework, which simultaneously preserves cross-modal
similarity and maintains semantic consistency. Experimental results on two
benchmark datasets demonstrate the state-of-the-art performance of FDCH.
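As a rough illustration of the second stage described above, here is a minimal PyTorch-style sketch of our own, not the authors' code: the network sizes, the plain MSE loss, and the training step are all assumptions, and random sign codes stand in for the fusion network's unified codes.

```python
# Hypothetical sketch of FDCH's second stage: a modality-specific network
# is trained to regress the unified codes produced by the fusion network.
import torch
import torch.nn as nn

class ModalityHashNet(nn.Module):
    def __init__(self, in_dim: int, n_bits: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, n_bits), nn.Tanh())  # tanh: relaxed binary output

    def forward(self, x):
        return self.net(x)

def stage2_step(net, opt, x, b_unified):
    """One step pushing the modality net's output toward the unified codes."""
    opt.zero_grad()
    loss = ((net(x) - b_unified) ** 2).mean()  # code-regression loss
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage: random features stand in for one modality; random +/-1 codes
# stand in for the stage-1 unified codes.
net = ModalityHashNet(in_dim=4096, n_bits=64)
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
stage2_step(net, opt, torch.randn(32, 4096), torch.sign(torch.randn(32, 64)))
```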
Deep LDA Hashing
The conventional supervised hashing methods based on classification do not
entirely meet the requirements of the hashing technique, but Linear
Discriminant Analysis (LDA) does. In this paper, we propose to optimize a
revised LDA objective over deep networks to learn efficient hash codes in a
truly end-to-end fashion. However, naively optimizing the deep network w.r.t.
the LDA objective requires a complicated eigenvalue decomposition within each
mini-batch of every epoch. In this work, the revised LDA objective is
transformed into a simple least-squares problem, which naturally overcomes
this intractability and can be easily solved by off-the-shelf optimizers. Such
a deep extension also overcomes the weaknesses of LDA Hashing, namely its
limited linear projection and its lack of feature learning. Extensive
experiments are conducted on three benchmark datasets. The proposed Deep LDA
Hashing shows an improvement of nearly 70 points over the conventional method
on the CIFAR-10 dataset. It also beats several state-of-the-art methods on
various metrics.
Comment: 10 pages, 3 figures
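For context, a hedged sketch of the kind of reduction the abstract alludes to; the paper's revised objective may differ. Classical LDA maximizes a trace ratio that needs an eigendecomposition, whereas regressing onto a suitably encoded class-indicator target needs only least squares.

```latex
% Classical LDA (eigendecomposition needed in every mini-batch):
\max_{W}\ \operatorname{tr}\!\left( (W^{\top} S_w W)^{-1}\, W^{\top} S_b W \right)
% Least-squares surrogate with a class-indicator target matrix $Y$
% (a known equivalence, e.g. Ye, 2007; the paper's exact form may differ):
\min_{W}\ \lVert X W - Y \rVert_F^{2}
```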
Discrete Hashing with Deep Neural Network
This paper addresses the problem of learning binary hash codes for large-scale
image search by proposing a novel hashing method based on a deep neural
network. The advantage of our deep model over previous deep models used for
hashing is that it incorporates the criteria necessary for producing good
codes: similarity preservation, balance, and independence. Another advantage
of our method is that, instead of relaxing the binary constraint on the codes
during learning as most previous works do, we introduce an auxiliary variable
and reformulate the optimization into two sub-problems, allowing us to handle
the binary constraints efficiently without any relaxation.
The proposed method is also extended to supervised hashing by leveraging label
information such that the learned binary codes preserve the pairwise labels of
the inputs.
Experimental results on three benchmark datasets show that the proposed
methods outperform state-of-the-art hashing methods.
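A minimal sketch of the auxiliary-variable idea described above, assuming a linear map in place of the deep network: alternate a continuous step on the model with a closed-form sign step on the binary codes, so the binary constraint is never relaxed.

```python
# Toy numpy illustration of auxiliary-variable optimization for hashing.
# A linear map W stands in for the deep network (our simplification).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))           # toy features
B = np.sign(rng.standard_normal((100, 16)))  # auxiliary binary codes

for _ in range(20):
    # model step: continuous least-squares fit to the current codes
    W = np.linalg.lstsq(X, B, rcond=None)[0]
    # code step: argmin_B ||XW - B||^2 over B in {-1,+1} is just the sign
    B = np.sign(X @ W)
    B[B == 0] = 1
```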
Discriminative Supervised Hashing for Cross-Modal Similarity Search
With the advantages of low storage cost and high retrieval efficiency, hashing
techniques have recently become an emerging topic in cross-modal similarity
search. As data from multiple modalities reflect similar semantic content,
much research aims at learning unified binary codes. However, the hashing
features learned by these methods are not sufficiently discriminative, which
results in lower accuracy and robustness. We propose a novel hash-learning
framework, termed Discriminative Supervised Hashing (DSH), which jointly
performs classifier learning, subspace learning, and matrix factorization to
preserve class-specific semantic content and to learn discriminative unified
binary codes for multi-modal data. Besides, to reduce information loss and
preserve the non-linear structure of the data, DSH non-linearly projects the
different modalities into a common space in which the similarity among
heterogeneous data points can be measured. Extensive experiments conducted on
three publicly available datasets demonstrate that the proposed framework
outperforms several state-of-the-art methods.
Comment: 7 pages, 3 figures, 4 tables; the paper is under consideration at
Image and Vision Computing
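To make the abstract concrete, here is one plausible shape for such a joint objective; this is purely illustrative notation of ours, not the paper's actual formulation.

```latex
\min_{U_m,\, V,\, W,\, B}\ \sum_{m} \bigl\lVert \phi\!\left(X^{(m)}\right) U_m - V \bigr\rVert_F^{2}
  \;+\; \alpha \lVert Y - W^{\top} V \rVert_F^{2}
  \;+\; \beta \lVert V - B \rVert_F^{2},
\qquad B \in \{-1,+1\}^{n \times r}
% phi: a non-linear (e.g. kernel) map; V: common subspace shared by the
% modalities; Y: class labels (classifier term); B: unified binary codes.
```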
Learning to Hash for Indexing Big Data - A Survey
The explosive growth in big data has attracted much attention in designing
efficient indexing and search methods recently. In many critical applications
such as large-scale search and pattern matching, finding the nearest neighbors
to a query is a fundamental research problem. However, the straightforward
solution using exhaustive comparison is infeasible due to the prohibitive
computational complexity and memory requirement. In response, Approximate
Nearest Neighbor (ANN) search based on hashing techniques has become popular
due to its promising performance in both efficiency and accuracy. Prior
randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore
data-independent hash functions with random projections or permutations.
Although they enjoy elegant theoretical guarantees on search quality in
certain metric spaces, randomized hashing methods have been shown to perform
insufficiently in many real-world applications. As a remedy, new approaches
that incorporate data-driven learning methods in the development of advanced
hash functions have emerged. Such learning-to-hash methods exploit information
such as data distributions or class labels when optimizing the hash codes or
functions. Importantly, the learned hash codes are able to preserve, in the
hash code space, the proximity of neighboring data in the original feature
space. The goal of this paper is to provide readers with a systematic
understanding of the insights, pros, and cons of these emerging techniques. We
provide a comprehensive
survey of the learning to hash framework and representative techniques of
various types, including unsupervised, semi-supervised, and supervised. In
addition, we summarize recent hashing approaches that utilize deep learning
models. Finally, we discuss future directions and trends of research in this
area.
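As a concrete contrast to the learned methods the survey covers, here is a minimal sketch of the data-independent random-hyperplane LSH mentioned above; the function and variable names are ours.

```python
# Random-hyperplane LSH: data-independent sign projections.
import numpy as np

def lsh_codes(X, n_bits, seed=0):
    """Hash each row of X to an n_bits binary code via random hyperplanes."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], n_bits))  # one hyperplane per bit
    return (X @ R > 0).astype(np.uint8)

codes = lsh_codes(np.random.randn(5, 128), n_bits=16)
print(codes)  # 5 codes of 16 bits each
```

Similar inputs fall on the same side of most random hyperplanes, so their codes agree in most bits; this is the kind of metric-space guarantee the abstract credits to randomized hashing.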
Supervised Discrete Hashing with Relaxation
Data-dependent hashing has recently attracted attention due to its ability to
support efficient retrieval and storage of high-dimensional data such as
documents, images, and videos. In this paper, we propose a novel learning-based
hashing method called "Supervised Discrete Hashing with Relaxation" (SDHR)
based on "Supervised Discrete Hashing" (SDH). SDH uses ordinary least squares
regression and traditional zero-one matrix encoding of class label information
as the regression target (code words), thus fixing the regression target. In
SDHR, the regression target is instead optimized. The optimized regression
target matrix satisfies a large margin constraint for correct classification of
each example. Compared with SDH, which uses the traditional zero-one matrix,
SDHR utilizes the learned regression target matrix and, therefore, more
accurately measures the classification error of the regression model and is
more flexible. As expected, SDHR generally outperforms SDH. Experimental
results on two large-scale image datasets (CIFAR-10 and MNIST) and a
large-scale and challenging face dataset (FRGC) demonstrate the effectiveness
and efficiency of SDHR.
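To clarify what "fixing the regression target" means, a hedged sketch in our own notation; the papers' exact regularization and constraints may differ.

```latex
% SDH: regress the binary codes B onto a FIXED zero-one label matrix Y
\min_{B,\,W}\ \lVert Y - W^{\top} B \rVert_F^{2} + \lambda \lVert W \rVert_F^{2},
\qquad B \in \{-1,+1\}^{r \times n}
% SDHR: the target R is itself LEARNED, subject to large-margin
% constraints ensuring each example is correctly classified
\min_{B,\,W,\,R}\ \lVert R - W^{\top} B \rVert_F^{2} + \lambda \lVert W \rVert_F^{2}
\quad \text{s.t.}\quad R \text{ satisfies per-example margin constraints}
```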
Optimizing affinity-based binary hashing using auxiliary coordinates
In supervised binary hashing, one wants to learn a function that maps a
high-dimensional feature vector to a vector of binary codes, for application to
fast image retrieval. This typically results in a difficult optimization
problem, nonconvex and nonsmooth, because of the discrete variables involved.
Much work has simply relaxed the problem during training, solving a continuous
optimization and truncating the codes a posteriori. This gives reasonable
results but is quite suboptimal. Recent work has tried to optimize the
objective directly over the binary codes and achieved better results, but the
hash function was still learned a posteriori, which remains suboptimal. We
propose a general framework, based on auxiliary coordinates, for learning hash
functions with affinity-based loss functions. This closes the loop and
optimizes jointly over the hash functions and the binary codes so that they
gradually match each other. The resulting algorithm can be seen as a corrected,
iterated version of the procedure of optimizing first over the codes and then
learning the hash function. Compared to this, our optimization is guaranteed to
obtain better hash functions while not being much slower, as demonstrated
experimentally on various supervised datasets. In addition, our framework
facilitates the design of optimization algorithms for arbitrary types of loss
and hash functions.
Comment: 22 pages, 12 figures; added new experiments and references
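A hedged sketch of the auxiliary-coordinates construction described above, in generic notation of ours: the codes are split out as free variables Z and tied to the hash function by a quadratic penalty.

```latex
% Nested problem: the affinity-based loss L applied to the hash function
\min_{h}\ L\bigl(h(X)\bigr)
% Auxiliary-coordinates form: codes Z become free variables, coupled to h
\min_{h,\,Z}\ L(Z) + \mu \lVert Z - h(X) \rVert^{2},
\qquad Z \in \{-1,+1\}^{n \times b}
% Alternate a Z-step (binary-code optimization) with an h-step (an
% ordinary supervised fit of h to targets Z), increasing mu so that the
% codes and the hash function gradually match.
```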
Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval
Unsupervised hashing can desirably support scalable content-based image
retrieval (SCBIR) thanks to its appealing advantages: independence from
semantic labels and efficiency in memory and search. However, the learned hash
codes embed only limited discriminative semantics due to the intrinsic
limitations of the image representation. To address this problem, in this
paper we propose a novel hashing approach, dubbed \emph{Discrete Semantic
Transfer Hashing} (DSTH).
The key idea is to \emph{directly} augment the semantics of discrete image hash
codes by exploring auxiliary contextual modalities. To this end, a unified
hashing framework is formulated to simultaneously preserve visual similarities
of images and perform semantic transfer from contextual modalities. Further, to
guarantee direct semantic transfer and avoid information loss, we explicitly
impose discreteness, bit-uncorrelation, and bit-balance constraints on the
hash codes. A novel and effective discrete optimization method based on the
augmented Lagrangian multiplier method is developed to iteratively solve the
optimization problem. The whole learning process has linear computational
complexity and desirable scalability. Experiments on three benchmark datasets
demonstrate the superiority of DSTH compared with several state-of-the-art
approaches.
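The three code constraints named above, written out in their standard form (our notation):

```latex
B \in \{-1,+1\}^{n \times r} \ \text{(discreteness)}, \qquad
B^{\top} \mathbf{1}_n = \mathbf{0} \ \text{(bit balance)}, \qquad
\tfrac{1}{n} B^{\top} B = I_r \ \text{(bit uncorrelation)}
```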
Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator
Semantic hashing has become a crucial component of fast similarity search in
many large-scale information retrieval systems, in particular, for text data.
Variational auto-encoders (VAEs) with binary latent variables as hashing codes
provide state-of-the-art performance in terms of precision for document
retrieval. We propose a pairwise loss function with a discrete-latent VAE to
reward within-class similarity and between-class dissimilarity for supervised
hashing. Instead of relying on existing biased gradient estimators, an
unbiased low-variance gradient estimator is adopted to optimize the hashing
function by evaluating the non-differentiable loss function over two
correlated sets of binary hashing codes, which controls the variance of the
gradient estimates. This new semantic hashing framework achieves superior
performance compared to the state of the art, as demonstrated by our
comprehensive experiments.
Comment: To appear in UAI 2020
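One unbiased, low-variance estimator of the kind the abstract describes is ARM (Yin & Zhou, 2019), which evaluates the loss on two correlated ("antithetic") sets of binary codes; treating it as representative of the paper's self-control estimator is our assumption. For a single code bit b ~ Bernoulli(σ(φ)) and loss f:

```latex
\nabla_{\phi}\, \mathbb{E}_{b \sim \mathrm{Bern}(\sigma(\phi))}\bigl[ f(b) \bigr]
= \mathbb{E}_{u \sim \mathrm{Uniform}(0,1)}\Bigl[
    \bigl( f\bigl(\mathbf{1}[u > \sigma(-\phi)]\bigr)
         - f\bigl(\mathbf{1}[u < \sigma(\phi)]\bigr) \bigr)\,
    \bigl(u - \tfrac{1}{2}\bigr) \Bigr]
```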
Learning Structured Ordinal Measures for Video based Face Recognition
This paper presents a structured ordinal measure method for video-based face
recognition that simultaneously learns ordinal filters and structured ordinal
features. The problem is posed as a non-convex integer programming problem that
includes two parts. The first part learns stable ordinal filters to project
video data into a large-margin ordinal space. The second seeks self-correcting
and discrete codes by balancing the projected data and a rank-one ordinal
matrix in a structured low-rank way. Unsupervised and supervised structures are
considered for the ordinal matrix. In addition, as a complement to hierarchical
structures, deep feature representations are integrated into our method to
enhance coding stability. An alternating minimization method is employed to
handle the discrete and low-rank constraints, yielding high-quality codes that
capture prior structures well. Experimental results on three commonly used face
video databases show that our method with a simple voting classifier can
achieve state-of-the-art recognition rates using fewer features and samples.