
Fast Exact Search in Hamming Space with Multi-Index Hashing

Author: Fleet David J.
Norouzi Mohammad
Punjani Ali
Publication date: 24/04/2014

There is growing interest in representing image data and feature descriptors using compact binary codes for fast near neighbor search. Although binary codes are motivated by their use as direct indices (addresses) into a hash table, codes longer than 32 bits are not used in this way because doing so was thought to be ineffective. We introduce a rigorous way to build multiple hash tables on binary code substrings that enables exact k-nearest neighbor search in Hamming space. The approach is storage efficient and straightforward to implement. Theoretical analysis shows that the algorithm exhibits sub-linear run-time behavior for uniformly distributed codes. Empirical results show dramatic speedups over a linear scan baseline for datasets of up to one billion codes of 64, 128, or 256 bits.
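The substring-table idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `MultiIndexHash`, its parameters, and the simple radius rule `r // m` are assumptions made for illustration. The sketch relies on the pigeonhole argument that if two codes differ in at most r bits, then at least one of their m disjoint substrings differs in at most floor(r / m) bits, so probing each table within that smaller radius and then verifying the full distance yields an exact result.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded codes."""
    return bin(a ^ b).count("1")


class MultiIndexHash:
    """Hypothetical sketch: one hash table per disjoint code substring."""

    def __init__(self, codes, bits=64, m=4):
        if bits % m != 0:
            raise ValueError("bits must be divisible by m in this sketch")
        self.codes = list(codes)
        self.m = m
        self.sub_bits = bits // m
        self.mask = (1 << self.sub_bits) - 1
        self.tables = [defaultdict(list) for _ in range(m)]
        for idx, code in enumerate(self.codes):
            for j in range(m):
                self.tables[j][self._sub(code, j)].append(idx)

    def _sub(self, code, j):
        # Extract the j-th substring of the code.
        return (code >> (j * self.sub_bits)) & self.mask

    def _neighbors(self, key, radius):
        # Enumerate every substring value within `radius` bit flips of key.
        for d in range(radius + 1):
            for positions in combinations(range(self.sub_bits), d):
                flipped = key
                for p in positions:
                    flipped ^= 1 << p
                yield flipped

    def range_search(self, query, r):
        """Return indices of all codes within Hamming distance r of query."""
        sub_r = r // self.m  # pigeonhole bound on the closest substring
        candidates = set()
        for j in range(self.m):
            for key in self._neighbors(self._sub(query, j), sub_r):
                candidates.update(self.tables[j].get(key, ()))
        # Verify candidates against the full code to keep the search exact.
        return sorted(i for i in candidates
                      if hamming(self.codes[i], query) <= r)
```

An exact k-nearest neighbor search could be built on top of this by increasing r step by step until at least k verified results have been found; that loop is omitted here.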

arXiv.org e-Print Archive

CiteSeerX

A Framework for Fast Classification Algorithms

Author: Chandra Jain Ramesh
Ghanshyam Thakur
Publication venue: Institute of Information Theories and Applications FOI ITHEA
Publication date: 01/01/2008

Today, as globalization drives the growth of data sets, discovering knowledge from them has become necessary. Discovered knowledge typically takes the form of association rules, classification rules, clusters, frequent episodes, or detected deviations. Building fast and accurate classifiers for large databases is an important task in data mining. There is growing evidence that integrating classification with association rule mining compares favorably with classification approaches based on heuristic, greedy search, such as decision tree induction. Emerging associative classification algorithms have shown good promise in producing accurate classifiers. In this paper we focus on the performance of associative classification and present a parallel model for classifier building. Some parallel and distributed algorithms for classifier building have been proposed for decision tree induction, but so far no such work has been reported for associative classification.

Bulgarian Digital Mathematics Library at IMI-BAS

A walk through the web’s video clips

Author: Perona Pietro
Zanetti Sara
Zelnik-Manor Lihi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

Approximately 10^5 video clips are posted every day on the Web. The popularity of Web-based video databases poses a number of challenges to machine vision scientists: how do we organize, index and search such a large wealth of data? Content-based video search and classification have been proposed in the literature and applied successfully to analyzing movies, TV broadcasts and lab-made videos. We explore the performance of some of these algorithms on a large data set of approximately 3000 videos. We collected our data set directly from the Web, minimizing bias for content or quality, so as to have a faithful representation of the statistics of this medium. We find that the algorithms that we have come to trust do not work well on video clips, because their quality is lower and their subject is more varied. We will make the data publicly available to encourage further research.

CiteSeerX

Caltech Authors

Fast Color Quantization Using Weighted Sort-Means Clustering

Author: M. Emre Celebi
Publication venue: The Optical Society
Publication date: 01/01/2009

Color quantization is an important operation with numerous applications in graphics and image processing. Most quantization methods are essentially based on data clustering algorithms. However, despite its popularity as a general purpose clustering algorithm, k-means has not received much respect in the color quantization literature because of its high computational requirements and sensitivity to initialization. In this paper, a fast color quantization method based on k-means is presented. The method involves several modifications to the conventional (batch) k-means algorithm, including data reduction, sample weighting, and the use of the triangle inequality to speed up the nearest neighbor search. Experiments on a diverse set of images demonstrate that, with the proposed modifications, k-means becomes very competitive with state-of-the-art color quantization methods in terms of both effectiveness and efficiency. Comment: 30 pages, 2 figures, 4 tables
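The three modifications named in this abstract can be sketched in a few lines. The following is an illustrative simplification, not the paper's Weighted Sort-Means algorithm: the function name `weighted_kmeans`, its parameters, and the fixed iteration count are assumptions. Data reduction is modeled by collapsing identical pixels into one sample, sample weighting by using each color's frequency, and the triangle inequality by skipping a center c_j whenever d(c_best, c_j) >= 2 * d(x, c_best), since c_j then cannot be closer to x than the current best center.

```python
from collections import Counter
from math import dist


def weighted_kmeans(pixels, centers, iterations=10):
    """Batch k-means over distinct RGB colors weighted by frequency.

    pixels: iterable of (r, g, b) tuples; centers: initial (r, g, b) tuples.
    """
    # Data reduction: keep each distinct color once, weighted by its count.
    weights = Counter(pixels)
    colors = list(weights)
    centers = [tuple(map(float, c)) for c in centers]
    k = len(centers)
    for _ in range(iterations):
        # Center-to-center distances used by the triangle-inequality test.
        cc = [[dist(a, b) for b in centers] for a in centers]
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        totals = [0.0] * k
        for color in colors:
            best, best_d = 0, dist(color, centers[0])
            for j in range(1, k):
                # Skip c_j if it provably cannot beat the current best.
                if cc[best][j] >= 2.0 * best_d:
                    continue
                d = dist(color, centers[j])
                if d < best_d:
                    best, best_d = j, d
            w = weights[color]
            totals[best] += w
            for ch in range(3):
                sums[best][ch] += w * color[ch]
        # Weighted mean per cluster; empty clusters keep their old center.
        centers = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] else centers[j]
            for j in range(k)
        ]
    return centers
```

The returned centers form the palette; mapping each pixel to its nearest center then produces the quantized image. Initialization is left to the caller, since the abstract notes that k-means is sensitive to it.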

arXiv.org e-Print Archive

CiteSeerX

Crossref