11 research outputs found

    Hybrid Hashing Method for Similar Vehicle Image Search

    This paper proposes a novel hybrid method for computing image hashes that can be applied to searching for similar vehicle images. The main novelty of the method is its combination of two hashing types: a visual hash and a semantic hash of the image. The method is based on the SIFT and DCT algorithms. We use frontal vehicle images to test the method's accuracy. The experimental results indicate that the proposed algorithm has a practical application in image search for vehicle identification systems based on license plate recognition. We show that the method is novel in this area. The proposed method is also applicable to other problem domains.

    Fast Exact Search in Hamming Space with Multi-Index Hashing

    There is growing interest in representing image data and feature descriptors using compact binary codes for fast near neighbor search. Although binary codes are motivated by their use as direct indices (addresses) into a hash table, codes longer than 32 bits are not being used as such, as this was thought to be ineffective. We introduce a rigorous way to build multiple hash tables on binary code substrings that enables exact k-nearest neighbor search in Hamming space. The approach is storage efficient and straightforward to implement. Theoretical analysis shows that the algorithm exhibits sub-linear run-time behavior for uniformly distributed codes. Empirical results show dramatic speedups over a linear scan baseline for datasets of up to one billion codes of 64, 128, or 256 bits.
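    The idea of indexing binary code substrings can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Python integers as codes, and the restriction to fixed-radius search (rather than k-nearest neighbor search) are all assumptions made for illustration. It relies on the pigeonhole argument that two codes within Hamming distance r must agree to within floor(r / m) bits on at least one of their m disjoint substrings.

```python
from collections import defaultdict
from itertools import combinations


def split_code(code, bits, m):
    """Split a `bits`-bit integer into m substrings (assumes bits % m == 0)."""
    s = bits // m
    mask = (1 << s) - 1
    return [(code >> (i * s)) & mask for i in range(m)]


def hamming(a, b):
    """Hamming distance between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")


def build_index(codes, bits, m):
    """Build one hash table per substring position, keyed by substring value."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for table, sub in zip(tables, split_code(code, bits, m)):
            table[sub].append(idx)
    return tables


def neighbors_within(sub, s, radius):
    """Yield every s-bit value within Hamming distance `radius` of `sub`."""
    for d in range(radius + 1):
        for positions in combinations(range(s), d):
            flipped = sub
            for p in positions:
                flipped ^= 1 << p
            yield flipped


def search(query, r, codes, tables, bits, m):
    """Return indices of all codes within Hamming distance r of `query`."""
    s = bits // m
    sub_radius = r // m  # pigeonhole bound on the closest substring
    candidates = set()
    for table, sub in zip(tables, split_code(query, bits, m)):
        for probe in neighbors_within(sub, s, sub_radius):
            candidates.update(table.get(probe, ()))
    # Candidates are a superset of the true neighbors; verify on full codes.
    return sorted(i for i in candidates if hamming(codes[i], query) <= r)
```

    The final filtering step is what keeps the search exact: the substring tables only narrow down the set of codes whose full Hamming distance has to be computed.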

    Mobile product search with bag of hash bits


    Boosting multi-kernel Locality-Sensitive Hashing for scalable image retrieval

    Ministry of Education, Singapore under its Academic Research Funding Tier

    Hashing for Similarity Search: A Survey

    Similarity search (nearest neighbor search) is the problem of finding the data items in a large database whose distances to a query item are the smallest. Various methods have been developed to address this problem, and recently much effort has been devoted to approximate search. In this paper, we present a survey of one of the main solutions, hashing, which has been widely studied since the pioneering work on locality sensitive hashing. We divide the hashing algorithms into two main categories: locality sensitive hashing, which designs hash functions without exploring the data distribution, and learning to hash, which learns hash functions according to the data distribution. We review them from various aspects, including hash function design, distance measure, and search scheme in the hash coding space.
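    As a concrete instance of the first category, the sketch below shows one well-known data-independent scheme, random hyperplane hashing for angular similarity. The survey abstract does not single out this scheme; the choice, the function names, and the parameters are assumptions made for illustration. Each bit records which side of a random hyperplane a vector falls on, so the hash functions are drawn without looking at the data.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals (independent of any data)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, planes):
    """Pack the signs of the projections of x into an integer code."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, x))
        code = (code << 1) | (dot > 0)  # 1 if x lies on the positive side
    return code


def hamming(a, b):
    """Hamming distance between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")
```

    Vectors separated by a small angle tend to receive codes with a small Hamming distance, which is the property that search in the hash coding space relies on. A learning-to-hash method would instead fit the projections to the data.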

    Large-scale image retrieval using similarity preserving binary codes

    Image retrieval is a fundamental problem in computer vision, and has many applications. When the dataset size gets very large, retrieving images in Internet image collections becomes very challenging. The challenges come from storage, computation speed, and similarity representation. My thesis addresses learning compact similarity preserving binary codes, which represent each image by a short binary string, for fast retrieval in large image databases. I will first present an approach called Iterative Quantization to convert high-dimensional vectors to compact binary codes, which works by learning a rotation to minimize the quantization error of mapping data to the vertices of a binary Hamming cube. This approach achieves state-of-the-art accuracy for preserving neighbors in the original feature space, as well as state-of-the-art semantic precision. Second, I will extend this approach to two different scenarios in large-scale recognition and retrieval problems. The first extension is aimed at high-dimensional histogram data, such as bag-of-words features or text documents. Such vectors are typically sparse and nonnegative. I develop an algorithm that explores the special structure of such data by mapping feature vectors to binary vertices in the positive orthant, which gives improved performance. The second extension is for Fisher Vectors, which are dense descriptors having tens of thousands to millions of dimensions. I develop a novel method for converting such descriptors to compact similarity-preserving binary codes that exploits their natural matrix structure to reduce their dimensionality using compact bilinear projections instead of a single large projection matrix. This method achieves retrieval and classification accuracy comparable to that of the original descriptors and to the state-of-the-art Product Quantization approach while having orders of magnitude faster code generation time and smaller memory footprint. 
Finally, I present two applications of using Internet images and tags/labels to learn binary codes with label supervision, and show improved retrieval accuracy on several large Internet image datasets. First, I will present an application that performs cross-modal retrieval in the Hamming space. Then I will present an application on using supervised binary classeme representations for large-scale image retrieval. Doctor of Philosoph
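    The rotation-learning step described for Iterative Quantization can be sketched as an alternating minimization. This is a hedged illustration, not the thesis code: it assumes NumPy, a PCA projection to `n_bits` dimensions before the rotation (a step taken from the published method rather than from the abstract), and hypothetical names and defaults such as `itq` and `n_iter=50`. It requires at least `n_bits` samples and feature dimensions.

```python
import numpy as np


def itq(X, n_bits, n_iter=50, seed=0):
    """Learn a rotation R that reduces the error ||B - V R||_F, B in {-1, +1}."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                      # zero-center the data
    # PCA via SVD: keep the top n_bits principal directions (assumed step).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_bits].T                            # shape (dim, n_bits)
    V = Xc @ W                                   # projected data
    # Start from a random orthogonal matrix.
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))
    for _ in range(n_iter):
        # Fix R, update B: snap each rotated point to the nearest cube vertex.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R: orthogonal Procrustes solution from the SVD of B^T V.
        U, _, Wt = np.linalg.svd(B.T @ V)
        R = Wt.T @ U.T
    B = np.where(V @ R >= 0, 1.0, -1.0)          # codes under the final rotation
    return B, W, R


def quantization_error(X, B, W, R):
    """Frobenius-norm quantization error for the learned projection and rotation."""
    V = (X - X.mean(axis=0)) @ W
    return float(np.linalg.norm(B - V @ R))
```

    Each half-step cannot increase the quantization error, so the loop settles on a local minimum; the resulting sign pattern of each row of `V @ R` is the binary code for that item.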

    Compact hashing with joint optimization of search accuracy and time

    Similarity search, namely finding approximate nearest neighbors, is at the core of many large-scale machine learning and vision applications. Recently, many research results have demonstrated that hashing with compact codes can achieve promising performance for large-scale similarity search. However, most previous hashing methods with compact codes model and optimize only the search accuracy. Search time, which is an important factor for hashing in practice, is usually not addressed explicitly. In this paper, we develop a new scalable hashing algorithm that jointly optimizes search accuracy and search time. Our method generates compact hash codes for data of general formats with any similarity function. We evaluate our method using diverse data sets of up to 1 million samples (e.g., web images). Our comprehensive results show that the proposed method significantly outperforms several state-of-the-art hashing approaches.