A Survey on Learning to Hash
Nearest neighbor search is the problem of finding the data points in a
database whose distances to the query point are the smallest.
Learning to hash is one of the major solutions to this problem and has been
widely studied recently. In this paper, we present a comprehensive survey of
learning to hash algorithms and categorize them, according to how they
preserve similarities, into pairwise similarity preserving, multiwise
similarity preserving, implicit similarity preserving, and quantization, and
we discuss their relations. We separate quantization from pairwise similarity
preserving because its objective function is very different, although, as we
show, quantization can be derived from preserving pairwise similarities. In
addition, we present evaluation protocols and a general performance analysis,
and point out that quantization algorithms perform best in terms of search
accuracy, search time cost, and space cost. Finally, we introduce a few
emerging topics.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI).
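To make the pairwise similarity preserving category concrete, here is a minimal numpy sketch of one common objective form: scaled code inner products are regressed toward pairwise similarity targets. The function name and the quadratic loss are illustrative assumptions, not the survey's notation.

```python
import numpy as np

def pairwise_hash_loss(codes, S):
    """One common pairwise similarity-preserving objective (illustrative):
    regress scaled code inner products toward +/-1 similarity targets.

    codes : (n, b) relaxed hash codes in [-1, 1]
    S     : (n, n) binary similarity matrix (1 = similar, 0 = dissimilar)
    """
    b = codes.shape[1]
    K = codes @ codes.T / b      # inner products scaled to [-1, 1]
    T = 2.0 * S - 1.0            # map {0, 1} similarities to {-1, +1}
    return np.mean((K - T) ** 2)
```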
Learning to Hash for Indexing Big Data - A Survey
The explosive growth in big data has attracted much attention in designing
efficient indexing and search methods recently. In many critical applications
such as large-scale search and pattern matching, finding the nearest neighbors
to a query is a fundamental research problem. However, the straightforward
solution using exhaustive comparison is infeasible due to the prohibitive
computational complexity and memory requirement. In response, Approximate
Nearest Neighbor (ANN) search based on hashing techniques has become popular
due to its promising performance in both efficiency and accuracy. Prior
randomized hashing methods, e.g., Locality-Sensitive Hashing (LSH), explore
data-independent hash functions with random projections or permutations.
Despite elegant theoretical guarantees on search quality in certain metric
spaces, the performance of randomized hashing has proven insufficient in many
real-world applications. As a remedy, new approaches incorporating
data-driven learning methods in the development of advanced hash functions have
emerged. Such learning to hash methods exploit information such as data
distributions or class labels when optimizing the hash codes or functions.
Importantly, the learned hash codes preserve, in the hash code space, the
proximity of neighboring data in the original feature space. The goal of this
paper is to provide readers with a systematic understanding of the insights,
pros, and cons of the emerging techniques. We provide a comprehensive
survey of the learning to hash framework and representative techniques of
various types, including unsupervised, semi-supervised, and supervised. In
addition, we summarize recent hashing approaches utilizing deep learning
models. Finally, we discuss future directions and trends of research in this
area.
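As a concrete contrast to the learned methods this survey covers, the data-independent random-projection LSH it mentions fits in a few lines. The sketch below, including the toy usage, is illustrative rather than any particular paper's code.

```python
import numpy as np

def lsh_hash(X, n_bits, seed=0):
    """Data-independent LSH via random hyperplane projections.

    X : (n, d) data matrix; returns (n, n_bits) binary codes.
    Nearby points (small angle between them) agree on many bits with
    provable probability, with no learning involved.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_bits))  # random hyperplanes
    return (X @ W >= 0).astype(np.uint8)

# Toy usage: rank a database by Hamming distance to a query
# (same seed, hence same hyperplanes, for database and query).
X = np.random.randn(1000, 64)
codes = lsh_hash(X, n_bits=32)
query_code = lsh_hash(np.random.randn(1, 64), n_bits=32)
ranking = np.argsort(np.count_nonzero(codes != query_code, axis=1))
```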
Near-Isometric Binary Hashing for Large-scale Datasets
We develop a scalable algorithm to learn binary hash codes for indexing
large-scale datasets. Near-isometric binary hashing (NIBH) is a data-dependent
hashing scheme that quantizes the output of a learned low-dimensional embedding
to obtain a binary hash code. In contrast to conventional hashing schemes,
which typically rely on an $\ell_2$-norm (i.e., average distortion)
minimization, NIBH is based on an $\ell_\infty$-norm (i.e., worst-case
distortion) minimization that provides several benefits, including superior
distance, ranking, and near-neighbor preservation performance. We develop a
practical and efficient algorithm for NIBH based on column generation that
scales well to large datasets. A range of experimental evaluations
demonstrates the superiority of NIBH over ten state-of-the-art binary hashing
schemes.
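A minimal numpy sketch of the two criteria the abstract contrasts: an average-distortion ($\ell_2$-type) measure versus the worst-case ($\ell_\infty$-type) measure that NIBH minimizes. The helper names and the use of all pairwise Euclidean distances are assumptions for illustration; the paper's column-generation optimizer is not reproduced.

```python
import numpy as np

def pairwise_dists(Z):
    """Condensed vector of all pairwise Euclidean distances."""
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(-1))
    iu = np.triu_indices(len(Z), k=1)
    return D[iu]

def distortion_profile(X, Y):
    """Compare original points X (n, d) with embedded points Y (n, k).
    Returns (average distortion, worst-case distortion): the first is
    what an l2-type objective controls, the second is what NIBH's
    l_inf-type objective controls."""
    err = np.abs(pairwise_dists(X) - pairwise_dists(Y))
    return err.mean(), err.max()
```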
Binary Coding in Stream
Big data is becoming ever more ubiquitous, ranging over massive video
repositories, document corpora, image sets, and Internet routing history.
Proximity search and clustering are two algorithmic primitives fundamental to
data analysis, but suffer from the "curse of dimensionality" on these gigantic
datasets. A popular attack on this problem is to convert object
representations into short binary codewords, while approximately preserving
near neighbor structure. However, there has been limited research on
constructing codewords in the "streaming" or "online" settings often applicable
to this scale of data, where one may only make a single pass over data too
massive to fit in local memory.
In this paper, we apply recent advances in matrix sketching techniques to
construct binary codewords in both the streaming and online settings. Our
experimental results compete with or outperform several of the most popular
algorithms, and we prove theoretical guarantees on performance in the
streaming setting under mild assumptions on the data and the randomness of
the training set.
Comment: 9 pages, 5 figures
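One concrete matrix sketching primitive of the kind this paper builds on is Frequent Directions, which maintains a small sketch of the data in a single pass. Pairing it with sign-based binarization, as below, is an illustrative reconstruction under that assumption, not the paper's exact algorithm.

```python
import numpy as np

def frequent_directions(stream, ell):
    """One-pass Frequent Directions sketch (Liberty, 2013): keep a
    (2*ell, d) buffer; when it fills, shrink all directions by the
    ell-th squared singular value, which frees at least half the rows.

    stream : iterable of d-dimensional rows (a single pass suffices)
    """
    B = None
    for x in stream:
        x = np.asarray(x, dtype=float)
        if B is None:
            B = np.zeros((2 * ell, x.size))
        zero_rows = np.where(~B.any(axis=1))[0]
        if zero_rows.size == 0:                     # buffer full: shrink
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            s = np.sqrt(np.maximum(s ** 2 - s[ell - 1] ** 2, 0.0))
            B = np.zeros_like(B)
            B[: s.size] = s[:, None] * Vt
            zero_rows = np.where(~B.any(axis=1))[0]
        B[zero_rows[0]] = x
    return B

def codes_from_sketch(B, X, n_bits):
    """Binarize by signs of projections onto the sketch's top right
    singular vectors (this binarization choice is an assumption)."""
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    return (X @ Vt[:n_bits].T >= 0).astype(np.uint8)
```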
Semantic Cluster Unary Loss for Efficient Deep Hashing
Hashing methods map similar data to binary hash codes with small Hamming
distance; they have received broad attention due to their low storage cost
and fast retrieval speed. With the rapid development of deep learning, deep hashing
methods have achieved promising results in efficient information retrieval.
Most of the existing deep hashing methods adopt pairwise or triplet losses to
deal with the similarities underlying the data, but training is difficult and
inefficient because data pairs and triplets must be enumerated.
To address these issues, we propose a novel deep hashing algorithm with a
unary loss that can be trained very efficiently. We first introduce a Unary
Upper Bound of the traditional triplet loss, reducing the complexity to
$\mathcal{O}(N)$ and bridging the classification-based unary loss and the
triplet loss.
Second, we propose a novel Semantic Cluster Deep Hashing (SCDH) algorithm by
introducing a modified Unary Upper Bound loss, named Semantic Cluster Unary
Loss (SCUL). The resulting hash codes form several compact clusters, meaning
that hash codes in the same cluster carry similar semantic information. We
also demonstrate that the proposed SCDH is easily extended to semi-supervised
settings by incorporating state-of-the-art semi-supervised learning
algorithms. Experiments on large-scale datasets show that the proposed method
is superior to state-of-the-art hashing algorithms.
Comment: 13 pages
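The shift from triplet to unary supervision can be illustrated with a small numpy sketch: each relaxed code is pulled toward a binary center for its semantic cluster, so the loss touches every sample once rather than enumerating pairs or triplets. The quadratic form and names below are assumptions for illustration, not the paper's exact SCUL.

```python
import numpy as np

def unary_hash_loss(codes, labels, centers):
    """Unary (classification-style) hashing loss, illustrative form:
    pull each relaxed code toward the binary center of its semantic
    cluster, so the loss touches each sample once (O(N)) instead of
    enumerating O(N^2) pairs or O(N^3) triplets.

    codes   : (n, b) relaxed codes in [-1, 1]
    labels  : (n,) integer cluster/class assignments
    centers : (c, b) binary cluster centers in {-1, +1}
    """
    targets = centers[labels]              # each sample's cluster center
    return np.mean((codes - targets) ** 2)
```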
Discriminative Supervised Hashing for Cross-Modal Similarity Search
With the advantage of low storage cost and high retrieval efficiency, hashing
techniques have recently been an emerging topic in cross-modal similarity
search. Since data from multiple modalities reflect similar semantic content,
much research aims at learning unified binary codes. However, the
discriminative hashing features learned by these methods are inadequate,
which results in lower accuracy and robustness. We propose a novel hashing
learning framework
which jointly performs classifier learning, subspace learning and matrix
factorization to preserve class-specific semantic content, termed
Discriminative Supervised Hashing (DSH), to learn discriminative unified
binary codes for multi-modal data. Besides, to reduce the loss of information
and preserve the non-linear structure of the data, DSH non-linearly projects
different modalities into a common space in which the similarity among
heterogeneous data points can be measured. Extensive experiments conducted on
three publicly available datasets demonstrate that the framework proposed in
this paper outperforms several state-of-the-art methods.
Comment: 7 pages, 3 figures, 4 tables. The paper is under consideration at
Image and Vision Computing.
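A minimal sketch of the cross-modal setup the abstract describes: paired modalities are projected into a common space (a tanh feature map stands in here for DSH's non-linear projection) and binarized, so Hamming distance becomes comparable across modalities. The projection matrices would come from DSH's joint optimization, which is not reproduced; all names are illustrative.

```python
import numpy as np

def common_space_codes(X_img, X_txt, maps):
    """Project paired image/text features into a shared space and
    binarize. Unified codes should agree across modalities for paired
    samples, so cross-modal search reduces to Hamming ranking.

    maps : dict of learned matrices W1_img, W2_img, W1_txt, W2_txt
           (standing in for DSH's learned non-linear projections).
    """
    B_img = np.sign(np.tanh(X_img @ maps["W1_img"]) @ maps["W2_img"])
    B_txt = np.sign(np.tanh(X_txt @ maps["W1_txt"]) @ maps["W2_txt"])
    return B_img, B_txt
```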
Rank Subspace Learning for Compact Hash Codes
The era of Big Data has spawned unprecedented interests in developing hashing
algorithms for efficient storage and fast nearest neighbor search. Most
existing works learn hash functions that are numeric quantizations of feature
values in a projected feature space. In this work, we propose a novel hash
learning framework that encodes features' rank orders instead of numeric values
in a number of optimal low-dimensional ranking subspaces. We formulate the
ranking subspace learning problem as the optimization of a piece-wise linear
convex-concave function and present two versions of our algorithm: one with
independent optimization of each hash bit and the other exploiting a sequential
learning framework. Our work is a generalization of the Winner-Take-All (WTA)
hash family and naturally enjoys all the numeric stability benefits of rank
correlation measures while being optimized to achieve high precision at very
short code length. We compare with several state-of-the-art hashing algorithms
in both the supervised and unsupervised domains, showing superior performance
on a number of datasets.
Comment: 10 pages
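Since this method generalizes the Winner-Take-All hash family, a short sketch of plain WTA hashing clarifies what encoding rank orders means: each output code is the argmax position within a random window of feature dimensions, so codes depend only on relative orderings and are invariant to monotonic transformations. Names and parameters below are illustrative.

```python
import numpy as np

def wta_hash(X, n_codes, K, seed=0):
    """Winner-Take-All hashing (Yagnik et al.), the family this work
    generalizes. Each code is the argmax position among K randomly
    chosen feature dimensions, so it depends only on rank order.

    X : (n, d) features; returns (n, n_codes) integer codes in [0, K).
    """
    rng = np.random.default_rng(seed)
    codes = np.empty((X.shape[0], n_codes), dtype=np.int64)
    for i in range(n_codes):
        window = rng.permutation(X.shape[1])[:K]  # random K-dim window
        codes[:, i] = np.argmax(X[:, window], axis=1)
    return codes
```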
Graph based manifold regularized deep neural networks for automatic speech recognition
Deep neural networks (DNNs) have been successfully applied to a wide variety
of acoustic modeling tasks in recent years. These include the applications of
DNNs either in a discriminative feature extraction or in a hybrid acoustic
modeling scenario. Despite the rapid progress in this area, a number of
challenges remain in training DNNs. This paper presents an effective way of
training DNNs using a manifold learning based regularization framework. In this
framework, the parameters of the network are optimized to preserve underlying
manifold based relationships between speech feature vectors while minimizing a
measure of loss between network outputs and targets. This is achieved by
incorporating manifold based locality constraints in the objective criterion of
DNNs. Empirical evidence is provided to demonstrate that training a network
with manifold constraints preserves structural compactness in the hidden layers
of the network. Manifold regularization is applied to train bottleneck DNNs for
feature extraction in hidden Markov model (HMM) based speech recognition. The
experiments in this work are conducted on the Aurora-2 spoken digits and the
Aurora-4 read news large vocabulary continuous speech recognition tasks. The
performance is measured in terms of word error rate (WER) on these tasks. It is
shown that the manifold regularized DNNs result in up to 37% reduction in WER
relative to standard DNNs.
Comment: 12 pages including citations, 2 figures
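The objective the paper describes, a task loss plus a manifold-based locality penalty on hidden representations, can be written compactly. The numpy sketch below uses the standard graph-Laplacian form of the penalty; the names and weighting scheme are illustrative assumptions rather than the paper's exact criterion.

```python
import numpy as np

def manifold_penalty(Z, W):
    """Locality penalty 0.5 * sum_ij W_ij * ||z_i - z_j||^2 over a batch,
    keeping hidden representations of neighboring feature vectors close.

    Z : (n, h) hidden-layer outputs; W : (n, n) affinity weights
        (0 for non-neighbors).
    """
    diff = Z[:, None, :] - Z[None, :, :]
    return 0.5 * (W * (diff ** 2).sum(-1)).sum()

def manifold_regularized_loss(task_loss, Z, W, gamma):
    """Objective of the form the paper describes: a task loss (e.g.
    cross-entropy against targets) plus a weighted manifold term."""
    return task_loss + gamma * manifold_penalty(Z, W)
```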
A Taxonomy of Peer-to-Peer Based Complex Queries: a Grid perspective
Grid superscheduling requires support for efficient and scalable discovery of
resources. Resource discovery activities involve searching for the appropriate
resource types that match the user's job requirements. To accomplish this goal,
a resource discovery system that supports the desired look-up operation is
mandatory. Various kinds of solutions to this problem have been suggested,
including the centralised and hierarchical information server approach.
However, both of these approaches have serious limitations with regard to
scalability, fault tolerance, and network congestion. To overcome these
limitations, organising resource information using a Peer-to-Peer (P2P)
network model has been proposed. Existing approaches advocate an extension to
structured P2P protocols, to support the Grid resource information system
(GRIS). In this paper, we identify issues related to the design of such an
efficient, scalable, fault-tolerant, consistent and practical GRIS system using
a P2P network model. We compile these issues into various taxonomies in
Sections III and IV. Further, we look into existing works that apply
P2P-based network protocols to GRIS. We believe that this taxonomy and its
mapping to relevant systems will be useful for academic and industry-based
researchers engaged in the design of scalable Grid systems.
ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections
Deep neural networks have become ubiquitous for applications related to
visual recognition and language understanding tasks. However, it is often
prohibitive to use typical neural networks on devices like mobile phones or
smart watches, since the models are huge and cannot fit in the limited
memory available on such devices. While these devices could make use of machine
learning models running on high-performance data centers with CPUs or GPUs,
this is not feasible for many applications because data can be privacy
sensitive and inference needs to be performed directly "on" device.
We introduce a new architecture for training compact neural networks using a
joint optimization framework. At its core lies a novel objective that jointly
trains using two different types of networks--a full trainer neural network
(using existing architectures like Feed-forward NNs or LSTM RNNs) combined with
a simpler "projection" network that leverages random projections to transform
inputs or intermediate representations into bits. The simpler network encodes
lightweight and efficient-to-compute operations in bit space with a low memory
footprint. The two networks are trained jointly using backpropagation, where
the projection network learns from the full network in a manner similar to
apprenticeship learning. Once trained, the smaller network can be used
directly for inference
at low memory and computation cost. We demonstrate the effectiveness of the new
approach at significantly shrinking the memory requirements of different types
of neural networks while preserving good accuracy on visual recognition and
text classification tasks. We also study the question "how many neural bits are
required to solve a given task?" using the new framework and show empirical
results contrasting model predictive capacity (in bits) versus accuracy on
several datasets.
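The projection side of this setup can be sketched simply: LSH-style random sign projections turn an input into bits, and only a small trainable layer on top of those bits needs to be stored on device. The function below is an illustrative reconstruction under that reading of the abstract; names and parameters are assumptions.

```python
import numpy as np

def projection_bits(x, T, d_bits, seed=0):
    """T independent random-projection functions, each mapping the input
    to d_bits sign bits; the concatenated bits feed a small trainable
    layer (omitted here). The fixed projections can be regenerated from
    the seed at inference time, so the matrices need not be stored.

    x : (n, d) inputs; returns (n, T * d_bits) bit features.
    """
    rng = np.random.default_rng(seed)
    outs = []
    for _ in range(T):
        P = rng.standard_normal((x.shape[1], d_bits))  # random hyperplanes
        outs.append((x @ P >= 0).astype(np.float32))   # LSH-style sign bits
    return np.concatenate(outs, axis=1)
```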