Search CORE

346,922 research outputs found

HDIdx: High-Dimensional Indexing for Efficient Approximate Nearest Neighbor Search

Author: Hoi Steven C. H.; Li Jintao; Tang Sheng; Wan Ji; Wu Pengcheng; Zhang Yongdong
Publication venue: 'Elsevier BV'
Publication date: 07/10/2015

Fast Nearest Neighbor (NN) search is a fundamental challenge in large-scale data processing and analytics, particularly for analyzing multimedia content, which is often of high dimensionality. Instead of using exact NN search, extensive research efforts have focused on approximate NN search algorithms. In this work, we present "HDIdx", an efficient high-dimensional indexing library for fast approximate NN search, which is open-source and written in Python. It offers a family of state-of-the-art algorithms that convert input high-dimensional vectors into compact binary codes, making them very efficient and scalable for NN search with very low space complexity.
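To illustrate the general idea of converting vectors into compact binary codes, here is a minimal sketch using random hyperplane projections and Hamming distance. It is not HDIdx's API and not one of the algorithms the library ships; the function names `fit_random_projection`, `encode`, and `hamming_search` are hypothetical, and only NumPy is assumed.

```python
import numpy as np

def fit_random_projection(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes in a dim-dimensional space (illustrative only)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, n_bits))

def encode(vectors, planes):
    """Map each row of `vectors` to a packed binary code: one bit per hyperplane side."""
    bits = (vectors @ planes) > 0
    return np.packbits(bits, axis=1)  # uint8 array, 8 bits per byte

def hamming_search(query_code, db_codes, k):
    """Return indices and Hamming distances of the k codes closest to query_code."""
    xor = np.bitwise_xor(db_codes, query_code)
    dists = np.unpackbits(xor, axis=1).sum(axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    database = rng.standard_normal((1000, 64))
    planes = fit_random_projection(dim=64, n_bits=32)
    codes = encode(database, planes)            # 1000 x 4 bytes instead of 1000 x 64 floats
    query_code = encode(database[:1], planes)   # reuse a database vector as the query
    ids, dists = hamming_search(query_code, codes, k=5)
    print(ids, dists)
```

With 32 bits per vector the codes occupy 4 bytes each, which is where the low space cost comes from; the price is that Hamming distance only approximates the original similarity.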

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

Author: Andoni A.; Beyer K.; Broder A. Z.; Brown P. F.; Fried D.; Le Q.; Mikolov T.; Mu Y.; Muja M.; Petrović S.; Riezler S.; Salton G.; Wang J.; Weber R.; Yang L.; Yao X.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/10/2016

Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.

arXiv.org e-Print Archive

Crossref

Scipedia

Hybrid LSH: Faster Near Neighbors Reporting in High-dimensional Space

Author: Pham Ninh
Publication date: 01/01/2017

We study the $r$-near neighbors reporting problem ($r$-NN), i.e., reporting all points in a high-dimensional point set $S$ that lie within a radius $r$ of a given query point $q$. Our approach builds upon the locality-sensitive hashing (LSH) framework due to its appealing asymptotic sublinear query time for near neighbor search problems in high-dimensional space. A bottleneck of the traditional LSH scheme for solving $r$-NN is that its performance is sensitive to data and query-dependent parameters. On datasets whose data distributions have diverse local density patterns, LSH with inappropriate tuning parameters can sometimes be outperformed by a simple linear search. In this paper, we introduce a hybrid search strategy between LSH-based search and linear search for $r$-NN in high-dimensional space. By integrating an auxiliary data structure into LSH hash tables, we can efficiently estimate the computational cost of LSH-based search for a given query regardless of the data distribution. This means that we are able to choose the appropriate search strategy between LSH-based search and linear search to achieve better performance. Moreover, the integrated data structure is time efficient and fits well with many recent state-of-the-art LSH-based approaches. Our experiments on real-world datasets show that the hybrid search approach outperforms (or is comparable to) both LSH-based search and linear search for a wide range of search radii and data distributions in high-dimensional space. Comment: Accepted as a short paper in EDBT 201
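The hybrid strategy described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the cost estimate here is simply the summed size of the query's buckets, a crude stand-in for the auxiliary data structure the abstract mentions, and the class name `HybridRNN`, the `cost_ratio` parameter, and the grid-style random-projection hash are all hypothetical choices.

```python
import numpy as np
from collections import defaultdict

class HybridRNN:
    """Toy r-near-neighbor reporter that picks LSH or linear scan per query."""

    def __init__(self, points, n_tables=4, n_proj=4, width=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.points = np.asarray(points, dtype=float)
        dim = self.points.shape[1]
        self.width = width
        # One set of random projections and offsets per hash table (assumed scheme).
        self.proj = rng.standard_normal((n_tables, dim, n_proj))
        self.shift = rng.uniform(0.0, width, size=(n_tables, n_proj))
        self.tables = []
        for t in range(n_tables):
            table = defaultdict(list)
            for i, key in enumerate(self._keys(self.points, t)):
                table[key].append(i)
            self.tables.append(table)

    def _keys(self, x, t):
        # Quantize the projections into grid cells; each cell tuple is a bucket key.
        cells = np.floor((x @ self.proj[t] + self.shift[t]) / self.width).astype(int)
        return [tuple(row) for row in cells]

    def _filter(self, ids, q, r):
        # Keep only candidates whose true distance to q is at most r.
        ids = np.asarray(sorted(ids), dtype=int)
        if ids.size == 0:
            return []
        d = np.linalg.norm(self.points[ids] - q, axis=1)
        return ids[d <= r].tolist()

    def query(self, q, r, cost_ratio=1.0):
        q = np.asarray(q, dtype=float)
        buckets = [self.tables[t].get(self._keys(q[None, :], t)[0], [])
                   for t in range(len(self.tables))]
        est_cost = sum(len(b) for b in buckets)  # crude estimate of LSH work
        n = len(self.points)
        if est_cost >= cost_ratio * n:
            return "linear", self._filter(range(n), q, r)
        candidates = set().union(*buckets)
        return "lsh", self._filter(candidates, q, r)
```

On dense regions the buckets grow large, the estimate exceeds the linear-scan cost, and the query falls back to an exact scan; on sparse regions the LSH path is taken and may miss some neighbors, since it only inspects colliding points.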

arXiv.org e-Print Archive

Copenhagen University Research Information System

Optimized Neural Networks to Search for Higgs Boson Production at the Tevatron

Author: Bhat; Boos; D. Smirnov; E. Boos; L. Dudko
Publication venue: 'Elsevier BV'
Publication date: 12/02/2003

An optimal choice of kinematical variables is one of the main steps in using neural networks (NN) in high energy physics. Our method of variable selection is based on an analysis of the structure of the Feynman diagrams (singularities and spin correlations) contributing to the signal and background processes. An application of this method to the Higgs boson search at the Tevatron leads to an improvement in the NN efficiency by a factor of 1.5-2 in comparison to previous NN studies. Comment: 4 pages, 4 figures, partially presented in proceedings of ACAT'02 conference

arXiv.org e-Print Archive

Crossref

CERN Document Server

Analysis of the low-energy $\eta$ NN-dynamics within a three-body formalism

Author: Arenhoevel H.; Fix A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2000

The interaction of an $\eta$-meson with two nucleons is studied within a three-body approach. The major features of the $\eta NN$-system in the low-energy region are accounted for by using an s-wave separable ansatz for the two-body $\eta N$- and $NN$-amplitudes. The calculation is confined to the $(J^\pi;T)=(0^-;1)$ and $(1^-;0)$ configurations, which are assumed to be the most promising candidates for virtual or resonant $\eta NN$-states. The eigenvalue three-body equation is continued analytically into the nonphysical sheets by contour deformation. The position of the poles of the three-body scattering matrix as a function of the $\eta N$-interaction strength is investigated. The corresponding trajectory, starting on the physical sheet, moves around the $\eta NN$ three-body threshold and continues away from the physical area, giving rise to virtual $\eta NN$-states. The search for poles on the nonphysical sheets adjacent directly to the upper rim of the real energy axis gives a negative result. Thus no low-lying s-wave $\eta NN$-resonances were found. The possible influence of virtual poles on the low-energy $\eta NN$-scattering is discussed. Comment: 16 pages revtex including 10 figures

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

CERN Document Server