Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding the data
items in a large database whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently
considerable effort has been devoted to approximate search. In this paper, we
present a survey on one of the main solutions, hashing, which has been widely
studied since the pioneering work on locality sensitive hashing. We divide the
hashing algorithms into two main categories: locality sensitive hashing, which
designs hash functions without exploring the data distribution, and learning to
hash, which learns hash functions according to the data distribution. We review
them from various aspects, including hash function design, distance measure, and
search scheme in the hash coding space.
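
As an illustration of the first category, the following is a minimal sketch of one well-known data-independent scheme, random-hyperplane hashing for cosine similarity, followed by a linear scan ranked by Hamming distance in the code space. It is not taken from the survey; the function names (`make_hyperplanes`, `hash_code`, `hamming`, `search`) and the parameters `num_bits` and `seed` are hypothetical choices made for this example.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes, independent of the data."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: which side of the hyperplane the vector falls on."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def search(query, database, hyperplanes, k=1):
    """Return indices of the k items whose codes are closest to the query code."""
    q = hash_code(query, hyperplanes)
    ranked = sorted(
        range(len(database)),
        key=lambda i: hamming(q, hash_code(database[i], hyperplanes)),
    )
    return ranked[:k]


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=16, dim=3)
    items = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
    print(search([1.0, 0.05, 0.0], items, planes, k=2))
```

Because the hyperplanes are drawn at random rather than fitted, this sketch belongs to the locality sensitive hashing family. A learning-to-hash method would instead choose the projections using the data distribution.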
Revisiting Kernelized Locality-Sensitive Hashing for Improved Large-Scale Image Retrieval
We present a simple but powerful reinterpretation of kernelized
locality-sensitive hashing (KLSH), a general and popular method developed in
the vision community for performing approximate nearest-neighbor searches in an
arbitrary reproducing kernel Hilbert space (RKHS). Our new perspective is based
on viewing the steps of the KLSH algorithm in an appropriately projected space,
and has several key theoretical and practical benefits. First, it eliminates
the problematic conceptual difficulties that are present in the existing
motivation of KLSH. Second, it yields the first formal retrieval performance
bounds for KLSH. Third, our analysis reveals two techniques for boosting the
empirical performance of KLSH. We evaluate these extensions on several
large-scale benchmark image retrieval data sets, and show that our analysis
leads to improved recall performance of at least 12%, and sometimes much
higher, over the standard KLSH method.
Comment: 15 pages