The Power of Asymmetry in Binary Hashing
When approximating binary similarity using the Hamming distance between short
binary hashes, we show that even if the similarity is symmetric, we can have
shorter and more accurate hashes by using two distinct code maps, i.e. by
approximating the similarity between x and x' as the Hamming distance
between f(x) and g(x'), for two distinct binary codes f, g, rather than as
the Hamming distance between f(x) and f(x').
Comment: Accepted to NIPS 2013, 9 pages, 5 figures
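The asymmetric idea above can be sketched in a few lines. This is a minimal illustration, not the paper's learned codes: the two code maps f and g are random linear projections standing in for the distinct maps the method would optimize.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data points; in the paper the code maps are learned, here they are
# random linear projections used purely as stand-ins.
X = rng.standard_normal((100, 8))
Wf = rng.standard_normal((8, 16))
Wg = rng.standard_normal((8, 16))

def f(x):
    # First binary code map: sign of a linear projection.
    return (x @ Wf > 0).astype(np.uint8)

def g(x):
    # Second, distinct binary code map.
    return (x @ Wg > 0).astype(np.uint8)

def hamming(a, b):
    # Number of differing bits between two binary codes.
    return int(np.sum(a != b))

x, y = X[0], X[1]
d_sym = hamming(f(x), f(y))   # symmetric: one code map for both points
d_asym = hamming(f(x), g(y))  # asymmetric: distinct maps f and g
```

With learned (rather than random) maps, the paper's point is that the asymmetric estimate d_asym can match a given similarity with shorter codes than the symmetric one.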
Asymmetric Deep Supervised Hashing
Hashing has been widely used for large-scale approximate nearest neighbor
search because of its storage and search efficiency. Recent work has found that
deep supervised hashing can significantly outperform non-deep supervised
hashing in many applications. However, most existing deep supervised hashing
methods adopt a symmetric strategy to learn one deep hash function for both
query points and database (retrieval) points. The training of these symmetric
deep supervised hashing methods is typically time-consuming, which makes it
hard for them to effectively exploit the supervised information on
large-scale databases. In this paper, we propose a novel deep supervised hashing
method, called asymmetric deep supervised hashing (ADSH), for large-scale
nearest neighbor search. ADSH treats the query points and database points in an
asymmetric way. More specifically, ADSH learns a deep hash function only for
query points, while the hash codes for database points are directly learned.
The training of ADSH is much more efficient than that of traditional symmetric
deep supervised hashing methods. Experiments show that ADSH can achieve
state-of-the-art performance in real applications.
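The asymmetric treatment of queries and database points can be illustrated schematically. In this sketch the database codes B are random placeholders for codes that ADSH would learn directly from supervision, and a plain linear map stands in for the deep hash function applied only to queries.

```python
import numpy as np

rng = np.random.default_rng(0)
n_db, dim, bits = 1000, 32, 48

# Database codes are learned directly in ADSH; random stand-ins here.
B = rng.integers(0, 2, size=(n_db, bits)).astype(np.uint8)

# A linear map standing in for the deep hash function, applied to
# query points only (the asymmetry in the method).
W = rng.standard_normal((dim, bits))

def query_code(q):
    # Binarize the query's projection to get its hash code.
    return (q @ W > 0).astype(np.uint8)

def retrieve(q, k=5):
    # Rank database points by Hamming distance to the query code.
    c = query_code(q)
    d = np.count_nonzero(B != c, axis=1)
    return np.argsort(d)[:k]

idx = retrieve(rng.standard_normal(dim))
```

Because no deep network is trained over the full database, only the query-side function, training cost scales with the (small) sampled query set rather than the whole database.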
Fast Exact Search in Hamming Space with Multi-Index Hashing
There is growing interest in representing image data and feature descriptors
using compact binary codes for fast near neighbor search. Although binary codes
are motivated by their use as direct indices (addresses) into a hash table,
codes longer than 32 bits are not being used as such, as it was thought to be
ineffective. We introduce a rigorous way to build multiple hash tables on
binary code substrings that enables exact k-nearest neighbor search in Hamming
space. The approach is storage efficient and straightforward to implement.
Theoretical analysis shows that the algorithm exhibits sub-linear run-time
behavior for uniformly distributed codes. Empirical results show dramatic
speedups over a linear scan baseline for datasets of up to one billion codes of
64, 128, or 256 bits.
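The substring-indexing idea rests on a pigeonhole argument: split each b-bit code into m disjoint substrings; if two codes differ in at most r < m bits, they must agree exactly on at least one substring. A minimal sketch (exact-match probing only, which suffices when r < m; the paper generalizes to larger radii):

```python
from collections import defaultdict
import random

random.seed(0)
BITS, M, R = 64, 4, 3      # 64-bit codes, 4 substrings; R < M, so any
SUB = BITS // M            # R-neighbor matches the query exactly on
                           # at least one 16-bit substring (pigeonhole).

def substrings(code):
    # Split a BITS-bit integer into M disjoint SUB-bit chunks.
    return [(code >> (i * SUB)) & ((1 << SUB) - 1) for i in range(M)]

db = [random.getrandbits(BITS) for _ in range(10000)]

# One hash table per substring position.
tables = [defaultdict(list) for _ in range(M)]
for idx, code in enumerate(db):
    for i, s in enumerate(substrings(code)):
        tables[i][s].append(idx)

def search(query, r=R):
    # Gather candidates matching the query exactly on some substring,
    # then verify candidates with the full Hamming distance.
    cands = set()
    for i, s in enumerate(substrings(query)):
        cands.update(tables[i].get(s, ()))
    return [idx for idx in cands if bin(db[idx] ^ query).count("1") <= r]

# Plant a neighbor two bit-flips away from a query and find it.
q = random.getrandbits(BITS)
db.append(q ^ 0b101)
for i, s in enumerate(substrings(db[-1])):
    tables[i][s].append(len(db) - 1)
hits = search(q)
```

Each table is probed with one substring, so the candidate set is a small fraction of the database for uniformly distributed codes, which is the source of the sub-linear behavior the analysis describes.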
Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval
The goal of this work is the computation of very compact binary hashes for image instance retrieval. Our approach has two novel contributions. The first one is Nested Invariance Pooling (NIP), a method inspired by i-theory, a mathematical theory for computing group-invariant transformations with feed-forward neural networks. NIP is able to produce compact and well-performing descriptors from visual representations extracted from convolutional neural networks. We specifically incorporate scale, translation and rotation invariances, but the scheme can be extended to arbitrary sets of transformations. We also show that using moments of increasing order throughout nesting is important. The NIP descriptors are then hashed to the target code size (32-256 bits) with a Restricted Boltzmann Machine using a novel batch-level regularization scheme specifically designed for the purpose of hashing (RBMH). A thorough empirical evaluation against the state-of-the-art shows that the results obtained both with the NIP descriptors and the NIP+RBMH hashes are consistently outstanding across a wide range of datasets.
Link and code: Fast indexing with graphs and compact regression codes
Similarity search approaches based on graph walks have recently attained
outstanding speed-accuracy trade-offs, setting aside memory requirements. In
this paper, we revisit these approaches by considering, additionally, the
memory constraint required to index billions of images on a single server. This
leads us to propose a method based both on graph traversal and compact
representations. We encode the indexed vectors using quantization and exploit
the graph structure to refine the similarity estimation.
In essence, our method takes the best of these two worlds: the search
strategy is based on nested graphs, thereby providing high precision with a
relatively small set of comparisons. At the same time it offers a significant
memory compression. As a result, our approach outperforms the state of the art
on operating points considering 64-128 bytes per vector, as demonstrated by our
results on two billion-scale public benchmarks.
Asymmetric Hamming Embedding
This paper proposes an asymmetric Hamming Embedding scheme for large-scale image search based on local descriptors. The comparison of two descriptors relies on a vector-to-binary-code comparison, which limits the quantization error associated with the query compared with the original Hamming Embedding method. The approach is used in combination with an inverted file structure that offers high efficiency, comparable to that of a regular bag-of-features retrieval system. The comparison is performed on two popular datasets. Our method consistently improves the search quality over the symmetric version. The trade-off between memory usage and precision is evaluated, showing that the method is especially useful for short binary signatures.
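The vector-to-binary comparison can be sketched as follows. This is a simplified illustration, not the paper's per-cell embedding: a random projection stands in for the learned embedding, and the asymmetric distance weights each bit mismatch by how far the query's real-valued projection lies from the binarization threshold, so query-side quantization error is not discarded.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, bits = 32, 16
P = rng.standard_normal((dim, bits))  # random projection, a stand-in
                                      # for the learned Hamming embedding

db = rng.standard_normal((500, dim))
B = db @ P > 0                        # database side stores binary codes only

def sym_dist(q, codes):
    # Symmetric baseline: binarize the query too, plain Hamming distance.
    qc = q @ P > 0
    return np.count_nonzero(codes != qc, axis=1)

def asym_dist(q, codes):
    # Asymmetric: keep the query's real-valued projection; a bit mismatch
    # costs more when the projection is far from the 0 threshold.
    proj = q @ P
    mismatch = codes != (proj > 0)
    return (mismatch * np.abs(proj)).sum(axis=1)

q = rng.standard_normal(dim)
order = np.argsort(asym_dist(q, B))
```

Only the query side changes, so the database memory footprint is identical to the symmetric scheme, which is why the gain is most visible for short signatures.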