Bilinear Random Projections for Locality-Sensitive Binary Codes
Locality-sensitive hashing (LSH) is a popular data-independent indexing
method for approximate similarity search, where random projections followed by
quantization hash the points from the database so as to ensure that the
probability of collision is much higher for objects that are close to each
other than for those that are far apart. Most high-dimensional visual
descriptors for images exhibit a natural matrix structure. When visual
descriptors are represented by high-dimensional feature vectors and long binary
codes are assigned, a random projection matrix incurs high costs
in both space and time. In this paper we analyze a bilinear random projection
method where feature matrices are transformed to binary codes by two smaller
random projection matrices. We base our theoretical analysis on extending
Raginsky and Lazebnik's result where random Fourier features are composed with
random binary quantizers to form locality sensitive binary codes. To this end,
we answer the following two questions: (1) whether a bilinear random projection
also yields similarity-preserving binary codes; (2) whether a bilinear random
projection yields performance gain or loss, compared to a large linear
projection. Regarding the first question, we present upper and lower bounds on
the expected Hamming distance between binary codes produced by bilinear random
projections. With regard to the second question, we analyze the upper and lower
bounds on covariance between two bits of binary codes, showing that the
correlation between two bits is small. Numerical experiments on MNIST and
Flickr45K datasets confirm the validity of our method. Comment: 11 pages, 23 figures, CVPR-201
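The bilinear idea in the abstract above can be illustrated with a minimal sketch. It assumes plain Gaussian projection matrices and sign thresholding, which is a simplification: the abstract describes composing random Fourier features with random binary quantizers, and that step is not reproduced here. All names (bilinear_binary_code, R1, R2, the dimensions) are hypothetical and chosen only for illustration.

```python
import numpy as np

def bilinear_binary_code(X, R1, R2):
    """Map a d1 x d2 feature matrix to k1*k2 bits using two small projections.

    Assumption: bits come from simple sign thresholding, not from the
    random Fourier feature quantizer analyzed in the paper.
    """
    projected = R1.T @ X @ R2                      # shape (k1, k2)
    return (projected.ravel() >= 0).astype(np.uint8)

rng = np.random.default_rng(0)
d1, d2, k1, k2 = 32, 16, 8, 4                      # illustrative sizes only
R1 = rng.standard_normal((d1, k1))
R2 = rng.standard_normal((d2, k2))
X = rng.standard_normal((d1, d2))

code = bilinear_binary_code(X, R1, R2)

# Storage for the projections: two small matrices versus one large matrix
# that maps the flattened (d1*d2)-dimensional vector to k1*k2 bits.
bilinear_params = d1 * k1 + d2 * k2
linear_params = (d1 * d2) * (k1 * k2)

# The bilinear map equals a large linear map with Kronecker structure,
# applied to the row-major flattening of X.
same_as_kron = np.allclose((R1.T @ X @ R2).ravel(),
                           np.kron(R1, R2).T @ X.ravel())
```

The Kronecker identity shows what is given up: the large projection is no longer fully random but structured, which is why the correlation between bits has to be analyzed separately.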
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a
large database, the data items whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently a lot
of effort has been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs hash
functions without exploring the data distribution, and learning to hash, which
learns hash functions according to the data distribution, and review them from
various aspects, including hash function design, distance measure, and search
scheme in the hash coding space.
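As a rough illustration of the data-independent category, the sketch below uses random hyperplane (sign random projection) hashing, one common locality sensitive scheme; it is not taken from the survey itself. The function names, the 64-dimensional toy data, and the 256-bit code length are assumptions made for the example.

```python
import numpy as np

def hyperplane_hash(x, planes):
    """One bit per random hyperplane: which side of the plane x falls on."""
    return (planes @ x >= 0).astype(np.uint8)

def hamming(a, b):
    """Number of bit positions where two binary codes differ."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
dim, n_bits = 64, 256                             # illustrative sizes only
planes = rng.standard_normal((n_bits, dim))       # drawn without looking at data

query = rng.standard_normal(dim)
near = query + 0.05 * rng.standard_normal(dim)    # small perturbation of query
far = rng.standard_normal(dim)                    # unrelated random point

d_near = hamming(hyperplane_hash(query, planes), hyperplane_hash(near, planes))
d_far = hamming(hyperplane_hash(query, planes), hyperplane_hash(far, planes))
print(d_near, d_far)
```

The hyperplanes are sampled independently of the data, which is what separates this family from learning to hash, where the projections would be fitted to the data distribution.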
Projection Bank: From High-Dimensional Data to Medium-Length Binary Codes
Recently, very high-dimensional feature representations, e.g., Fisher Vector, have achieved excellent performance for visual recognition and retrieval. However, these lengthy representations incur extremely heavy computational and storage costs and can even become infeasible in some large-scale applications. A few existing techniques can convert very high-dimensional data into binary codes, but they still require the reduced code length to be relatively long to maintain acceptable accuracy. To strike a better balance between computational efficiency and accuracy, in this paper we propose a novel embedding method called Binary Projection Bank (BPB), which can effectively reduce very high-dimensional representations to medium-dimensional binary codes without sacrificing accuracy. Instead of using conventional single linear or bilinear projections, the proposed method learns a bank of small projections via the max-margin constraint to optimally preserve the intrinsic data similarity. We have systematically evaluated the proposed method on three datasets: Flickr 1M, ILSVR2010 and UCF101, showing competitive retrieval and recognition accuracies compared with state-of-the-art approaches, but with a significantly smaller memory footprint and lower coding complexity.
- …