
    Scalable Image Retrieval by Sparse Product Quantization

    Fast Approximate Nearest Neighbor (ANN) search for high-dimensional feature indexing and retrieval is the crux of large-scale image retrieval. A recent promising technique is Product Quantization, which indexes high-dimensional image features by decomposing the feature space into a Cartesian product of low-dimensional subspaces and quantizing each subspace separately. Despite the promising results reported, this approach follows the hard assignment of traditional quantization methods, which can result in large quantization errors and thus inferior search performance. Unlike existing approaches, in this paper we propose a novel approach called Sparse Product Quantization (SPQ) that encodes high-dimensional feature vectors as sparse representations. We optimize the sparse representations of the feature vectors by minimizing their quantization errors, so that the resulting representations stay close to the original data in practice. Experiments show that the proposed SPQ technique is not only able to compress data but is also an effective encoding technique. We obtain state-of-the-art results for ANN search on four public image datasets, and the promising content-based image retrieval results further validate the efficacy of the proposed method.
    Comment: 12 pages
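    For context, here is a minimal sketch of the baseline hard-assignment product quantization that SPQ improves on: split each vector into subspaces, quantize each subvector against its own small codebook, and answer queries through per-subspace lookup tables. The subspace and codeword counts and helper names are illustrative, not the paper's implementation.

```python
# A minimal hard-assignment product quantization sketch (the baseline that
# SPQ improves on), with illustrative sizes: 4 subspaces, 16 codewords each.
import numpy as np
from sklearn.cluster import KMeans

def pq_train(X, n_subspaces=4, n_codewords=16):
    """Train one k-means codebook per subspace; d must divide evenly."""
    return [KMeans(n_clusters=n_codewords, n_init=4).fit(sub).cluster_centers_
            for sub in np.split(X, n_subspaces, axis=1)]

def pq_encode(X, codebooks):
    """Hard assignment: each subvector is replaced by one codeword index."""
    codes = [np.argmin(((sub[:, None, :] - cb[None]) ** 2).sum(-1), axis=1)
             for sub, cb in zip(np.split(X, len(codebooks), axis=1), codebooks)]
    return np.stack(codes, axis=1)          # shape (n, n_subspaces)

def pq_search(q, codes, codebooks):
    """Asymmetric distance computation: per-subspace lookup tables give
    approximate squared distances from the query to every encoded vector."""
    tables = [((cb - sq) ** 2).sum(-1)
              for sq, cb in zip(np.split(q, len(codebooks)), codebooks)]
    return sum(t[codes[:, m]] for m, t in enumerate(tables))

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32)).astype(np.float32)
books = pq_train(X)
codes = pq_encode(X, books)
print("top-5 by PQ distance:", np.argsort(pq_search(X[0], codes, books))[:5])
```

    The abstract's proposed SPQ replaces the single argmin above with a small sparse combination of codewords chosen to minimize quantization error, which is what keeps the encoded representation close to the original data.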

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres) if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology.
    Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
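    A hedged sketch of the coarse-to-fine idea behind this kind of search: cover the dataset with balls of fixed radius, scan only the ball centers (whose count plays the role of the metric entropy), and descend into a ball only when the triangle inequality says it might contain a hit. Radii and helper names are illustrative, not the paper's tools.

```python
# A coarse-to-fine range search over a ball covering of the dataset: scan
# only ball centers, then use the triangle inequality to prune whole balls.
import numpy as np

def build_cover(X, radius):
    """Greedy covering: every point ends up within `radius` of some center."""
    centers, members, unassigned = [], [], np.ones(len(X), bool)
    while unassigned.any():
        c = np.flatnonzero(unassigned)[0]
        d = np.linalg.norm(X - X[c], axis=1)
        ball = np.flatnonzero(unassigned & (d <= radius))
        centers.append(c)
        members.append(ball)
        unassigned[ball] = False
    return np.array(centers), members

def range_search(X, centers, members, q, r, cover_radius):
    """All points within r of q. A ball can contain a hit only if its
    center lies within r + cover_radius of q (triangle inequality)."""
    hits = []
    center_dists = np.linalg.norm(X[centers] - q, axis=1)
    for dist, ball in zip(center_dists, members):
        if dist <= r + cover_radius:                  # coarse stage
            dd = np.linalg.norm(X[ball] - q, axis=1)  # fine stage
            hits.extend(ball[dd <= r].tolist())
    return sorted(hits)

rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 8))
centers, members = build_cover(X, radius=2.0)
print(len(centers), "covering balls")                 # proxy for metric entropy
print(range_search(X, centers, members, X[0], r=1.0, cover_radius=2.0))
```

    The coarse stage costs one distance per covering ball, so when the fractal dimension is low and few balls survive the pruning test, total time tracks the metric entropy, mirroring the scaling result stated above.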

    Postprocessing can speed up general quantum search algorithms

    A general quantum search algorithm aims to evolve a quantum system from a known source state $|s\rangle$ to an unknown target state $|t\rangle$. It uses a diffusion operator $D_s$, which has the source state as one of its eigenstates, and $I_t$, where $I_\psi$ denotes the selective phase inversion of the state $|\psi\rangle$. It evolves $|s\rangle$ to a particular state $|w\rangle$, call it the w-state, in $O(B/\alpha)$ time steps, where $\alpha = |\langle t|s\rangle|$ and $B$ is a characteristic of the diffusion operator. Measuring the w-state gives the target state with success probability $O(1/B^2)$, and $O(B^2)$ applications of the algorithm can boost it from $O(1/B^2)$ to $O(1)$, making the total time complexity $O(B^3/\alpha)$. In the special case of Grover's algorithm, $D_s$ is $I_s$ and $B$ is very close to $1$. A more efficient way to boost the success probability is quantum amplitude amplification, provided we can efficiently implement $I_w$; no such efficient implementation is known so far. In this paper, we present an efficient algorithm to approximate selective phase inversions of the unknown eigenstates of an operator using the phase estimation algorithm. This algorithm is used to efficiently approximate $I_w$, which reduces the time complexity of the general algorithm to $O(B/\alpha)$. Though $O(B/\alpha)$ algorithms are known to exist, our algorithm offers physical implementation advantages.
    Comment: Accepted for publication in Physical Review A. arXiv admin note: substantial text overlap with arXiv:1210.464
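    A small numerical illustration of the special case cited above (Grover's algorithm, where $D_s = I_s$ and $B$ is close to $1$): simulating the iteration on a state vector shows the success probability climbing to $O(1)$ after roughly $\pi/(4\alpha)$ steps, with $\alpha = 1/\sqrt{N}$ for a uniform source state. This is a sketch of the textbook special case, not the paper's phase-estimation construction.

```python
# State-vector simulation of Grover iterations (the special case D_s = I_s,
# B ~ 1). Illustrative only; N and the target index are arbitrary.
import numpy as np

N, target = 256, 7
s = np.full(N, 1 / np.sqrt(N))          # uniform source state |s>
alpha = abs(s[target])                  # alpha = <t|s> = 1/sqrt(N)

def grover_step(psi):
    psi = psi.copy()
    psi[target] *= -1                   # I_t: selective phase inversion of |t>
    return 2 * s * (s @ psi) - psi      # inversion about the mean, 2|s><s| - I

psi = s.copy()
steps = int(round(np.pi / (4 * alpha)))
for _ in range(steps):
    psi = grover_step(psi)
print(f"alpha = {alpha:.4f}, steps = {steps}, "
      f"success probability = {abs(psi[target])**2:.4f}")
```

    After about $\pi/(4\alpha) \approx 13$ steps for $N = 256$, the printed success probability is close to $1$, consistent with the $O(B/\alpha)$ scaling at $B \approx 1$.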

    Revisiting Kernelized Locality-Sensitive Hashing for Improved Large-Scale Image Retrieval

    We present a simple but powerful reinterpretation of kernelized locality-sensitive hashing (KLSH), a general and popular method developed in the vision community for performing approximate nearest-neighbor searches in an arbitrary reproducing kernel Hilbert space (RKHS). Our new perspective is based on viewing the steps of the KLSH algorithm in an appropriately projected space, and has several key theoretical and practical benefits. First, it eliminates the problematic conceptual difficulties that are present in the existing motivation of KLSH. Second, it yields the first formal retrieval performance bounds for KLSH. Third, our analysis reveals two techniques for boosting the empirical performance of KLSH. We evaluate these extensions on several large-scale benchmark image retrieval data sets, and show that our analysis leads to improved recall performance of at least 12%, and sometimes much higher, over the standard KLSH method.
    Comment: 15 pages
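    A minimal sketch of the original KLSH construction that the paper reinterprets: sample a set of anchor points, whiten their kernel matrix, and use random anchor subsets to approximate Gaussian hyperplanes in the RKHS, so that each hash bit is the sign of a kernel-weighted sum. The kernel choice and parameters below are illustrative, and the paper's two boosting techniques are not shown.

```python
# A sketch of the original KLSH hash construction (Kulis & Grauman): random
# anchor subsets, whitened by K^{-1/2}, approximate Gaussian hyperplanes in
# the implicit RKHS feature space.
import numpy as np

def rbf_kernel(A, B, gamma=0.05):
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def klsh_fit(anchors, n_bits=32, t=30, seed=0):
    """One weight vector per bit: w = K^{-1/2} e_S for a random size-t
    subset S of the p anchors (the sign is invariant to positive scaling)."""
    rng = np.random.default_rng(seed)
    p = len(anchors)
    K = rbf_kernel(anchors, anchors) + 1e-8 * np.eye(p)   # jitter for stability
    vals, vecs = np.linalg.eigh(K)
    K_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    W = np.zeros((n_bits, p))
    for b in range(n_bits):
        e_S = np.zeros(p)
        e_S[rng.choice(p, size=t, replace=False)] = 1.0
        W[b] = K_inv_sqrt @ e_S
    return W

def klsh_hash(X, anchors, W):
    """Each bit is the sign of a kernel-weighted sum over the anchors."""
    return rbf_kernel(X, anchors) @ W.T > 0

rng = np.random.default_rng(2)
data = rng.normal(size=(2000, 16))
anchors = data[rng.choice(len(data), size=300, replace=False)]
W = klsh_fit(anchors)
codes = klsh_hash(data, anchors, W)
hamming = (codes != codes[0]).sum(axis=1)       # query with the first point
print("top-5 by Hamming distance:", np.argsort(hamming)[:5])
```

    The paper's reinterpretation views these steps as ordinary LSH carried out in an appropriately projected space, which is what makes the formal retrieval bounds and the two performance-boosting extensions possible.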