536 research outputs found
Efficient end-to-end learning for quantizable representations
Embedding representation learning via neural networks is at the core
foundation of modern similarity based search. While much effort has been put in
developing algorithms for learning binary hamming code representations for
search efficiency, this still requires a linear scan of the entire dataset per
each query and trades off the search accuracy through binarization. To this
end, we consider the problem of directly learning a quantizable embedding
representation and the sparse binary hash code end-to-end which can be used to
construct an efficient hash table not only providing significant search
reduction in the number of data but also achieving the state of the art search
accuracy outperforming previous state of the art deep metric learning methods.
We also show that finding the optimal sparse binary hash code in a mini-batch
can be computed exactly in polynomial time by solving a minimum cost flow
problem. Our results on Cifar-100 and on ImageNet datasets show the state of
the art search accuracy in precision@k and NMI metrics while providing up to
98X and 478X search speedup respectively over exhaustive linear search. The
source code is available at
https://github.com/maestrojeong/Deep-Hash-Table-ICML18Comment: Accepted and to appear at ICML 2018. Camera ready versio
Deep Supervised Hashing using Symmetric Relative Entropy
By virtue of their simplicity and efficiency, hashing algorithms have achieved significant success on large-scale approximate nearest neighbor search. Recently, many deep neural network based hashing methods have been proposed to improve the search accuracy by simultaneously learning both the feature representation and the binary hash functions. Most deep hashing methods depend on supervised semantic label information for preserving the distance or similarity between local structures, which unfortunately ignores the global distribution of the learned hash codes. We propose a novel deep supervised hashing method that aims to minimize the information loss generated during the embedding process. Specifically, the information loss is measured by the Jensen-Shannon divergence to ensure that compact hash codes have a similar distribution with those from the original images. Experimental results show that our method outperforms current state-of-the-art approaches on two benchmark datasets
- …