Evaluating tradeoff between recall and performance of GPU permutation index

Author: Barrionuevo Mercedes
Lopresti Mariela
Miranda Natalia Carolina
Piccoli María Fabiana
Reyes Nora Susana
Publication date: 04/12/2013

Query-by-content, by means of similarity search, is a fundamental operation for applications that deal with multimedia data. For this kind of query it is meaningless to look for elements exactly equal to the query object. Instead, we need to measure the dissimilarity between the query object and each database object. This search problem can be formalized with the concept of a metric space. In this scenario, search efficiency is understood as minimizing the number of distance calculations required to answer a query. Building an index can be a solution, but for very large metric databases this is not enough: it is also necessary to speed up queries using high-performance computing, such as GPUs, and in some cases it is reasonable to accept a fast answer even if it is inexact. In this work we evaluate the tradeoff between answer quality and time performance of our implementation of the Permutation Index, on a pure GPU architecture, used to solve multiple approximate similarity searches on metric databases in parallel.

WPDP - XIII Workshop on Distributed and Parallel Processing. Red de Universidades con Carreras en Informática (RedUNCI)
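The abstract above does not spell out how a permutation index works, so the following is only a minimal CPU sketch of the general idea as it is commonly described: each object is represented by the order in which it sees a small set of reference objects (permutants), and candidates are ranked by how similar their ordering is to the query's. The function names, the Euclidean distance, the Spearman footrule comparison and the `fraction` parameter are assumptions made for illustration; this is not the authors' GPU implementation.

```python
import math


def euclidean(a, b):
    """Example metric; any metric distance function could be used instead."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_vector(obj, permutants, dist):
    """ranks[i] is the position of permutant i when permutants are sorted by distance to obj."""
    order = sorted(range(len(permutants)), key=lambda i: dist(obj, permutants[i]))
    ranks = [0] * len(permutants)
    for position, i in enumerate(order):
        ranks[i] = position
    return ranks


def footrule(r1, r2):
    """Spearman footrule: sum of absolute rank differences (assumed comparison measure)."""
    return sum(abs(a - b) for a, b in zip(r1, r2))


def build_index(db, permutants, dist):
    """The index stores one rank vector per database object."""
    return [rank_vector(obj, permutants, dist) for obj in db]


def knn_approx(query, db, index, permutants, dist, k, fraction):
    """Approximate k-NN: compute real distances only for the most promising fraction of db."""
    q_ranks = rank_vector(query, permutants, dist)
    candidates = sorted(range(len(db)), key=lambda i: footrule(q_ranks, index[i]))
    n_check = max(k, int(len(db) * fraction))
    checked = candidates[:n_check]
    return sorted(checked, key=lambda i: dist(query, db[i]))[:k]


def knn_exact(query, db, dist, k):
    """Exhaustive k-NN, used only as ground truth for measuring recall."""
    return sorted(range(len(db)), key=lambda i: dist(query, db[i]))[:k]


def recall(approx, exact):
    """Fraction of the true k nearest neighbors found by the approximate search."""
    return len(set(approx) & set(exact)) / len(exact)
```

Raising `fraction` means more real distance computations per query and typically a higher recall, which is the kind of quality versus time tradeoff the abstract refers to; with `fraction=1.0` the search degenerates to an exhaustive scan.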

Servicio de Difusión de la Creación Intelectual

Efficient similarity search on multimedia databases

Author: Lopresti Mariela
Miranda Natalia Carolina
Piccoli María Fabiana
Reyes Nora Susana
Publication date: 01/10/2012

Manipulating and retrieving multimedia data has received increasing attention with the advent of cloud storage facilities. The ability to query by similarity over large data collections is essential for improving storage and user interfaces. However, these operations are expensive to solve on the CPU alone; thus, it is convenient to consider High Performance Computing (HPC) techniques in their solutions. The Graphics Processing Unit (GPU), as an alternative HPC device, has been increasingly used to speed up certain computing processes. This work introduces a pure GPU architecture to build the Permutation Index and to solve approximate similarity queries on multimedia databases. The empirical results of each implementation show different levels of speedup, which are related to the characteristics of the GPU and the particular database used.

Track: Workshop on Databases and Data Mining (WBDDM). Red de Universidades con Carreras en Informática (RedUNCI)

Servicio de Difusión de la Creación Intelectual

Approximate reverse k-nearest neighbor queries in general metric spaces

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

Crossref

Group Reverse Nearest Neighbor Search using Modified Skip Graph

Author: Upinder Kaur, Dr. Pushpa Rani Suri
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2016

Reverse nearest neighbor search is used for spatial queries. In reverse nearest neighbor search, the query object in high-dimensional space has a certain region such that all objects inside the region regard the query object as their nearest neighbor. Existing methods for reverse nearest neighbor search are limited to a single query point, which is inefficient for high-dimensional spatial databases. Therefore, in this paper we propose a group reverse nearest neighbor search that can find multiple query objects in a specific region. We propose a method for group reverse nearest neighbor queries using a modified skip graph.
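To make the notion of a reverse nearest neighbor concrete, here is a brute-force sketch for a single query point. It assumes Euclidean distance and treats ties in favor of the query; the function name is hypothetical, and the code illustrates only the definition, not the skip-graph method or the group variant proposed in the paper.

```python
import math


def reverse_nearest_neighbors(query, points):
    """Return indices of points whose nearest neighbor (among the other points and the query) is the query."""
    result = []
    for i, p in enumerate(points):
        d_query = math.dist(p, query)
        # p is a reverse nearest neighbor if no other data point is strictly closer to p than the query is.
        if all(math.dist(p, other) >= d_query
               for j, other in enumerate(points) if j != i):
            result.append(i)
    return result
```

This baseline compares every pair of points, so it costs quadratic time per query; answering a group of query points naively would repeat it once per query point.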

International Journal on Recent and Innovation Trends in Computing and Communication

New Variations of the Maximum Coverage Facility Location Problem

Author: Bhattacharya Bhaswar B
Nandy Subhas C
Publication venue: ScholarlyCommons
Publication date: 01/02/2013

Consider a competitive facility location scenario where, given a set $U$ of $n$ users and a set $F$ of $m$ facilities in the plane, the objective is to place a new facility in an appropriate place such that the number of users served by the new facility is maximized. Here users and facilities are considered as points in the plane, and each user takes service from its nearest facility, where the distance between a pair of points is measured in the $L_1$, $L_2$, or $L_\infty$ metric. This problem is also known as the maximum coverage (MaxCov) problem. In this paper, we consider the $k$-MaxCov problem, where the objective is to place $k$ ($\geq 1$) new facilities such that the total number of users served by these $k$ new facilities is maximized. We begin by proposing an $O(n \log n)$ time algorithm for the $k$-MaxCov problem when the existing facilities are all located on a single straight line and the new facilities are also restricted to lie on the same line. We then study the 2-MaxCov problem in the plane, and propose an $O(n^2)$ time and space algorithm in the $L_1$ and $L_\infty$ metrics. In the $L_2$ metric, we solve the 2-MaxCov problem in the plane in $O(n^3 \log n)$ time and $O(n^2 \log n)$ space. Finally, we consider the 2-Farthest-MaxCov problem, where a user is served by its farthest facility, and propose an algorithm that runs in $O(n \log n)$ time in all three metrics.

ScholarlyCommons@Penn

SAH: Shifting-aware Asymmetric Hashing for Reverse $k$-Maximum Inner Product Search

Author: Huang Qiang
Tung Anthony K. H.
Wang Yanhao
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 23/11/2022

This paper investigates a new yet challenging problem called Reverse $k$-Maximum Inner Product Search (R$k$MIPS). Given a query (item) vector, a set of item vectors, and a set of user vectors, the problem of R$k$MIPS aims to find a set of user vectors whose inner products with the query vector are one of the $k$ largest among the query and item vectors. We propose the first subquadratic-time algorithm, i.e., Shifting-aware Asymmetric Hashing (SAH), to tackle the R$k$MIPS problem. To speed up the Maximum Inner Product Search (MIPS) on item vectors, we design a shifting-invariant asymmetric transformation and develop a novel sublinear-time Shifting-Aware Asymmetric Locality Sensitive Hashing (SA-ALSH) scheme. Furthermore, we devise a new blocking strategy based on the Cone-Tree to effectively prune user vectors (in a batch). We prove that SAH achieves a theoretical guarantee for solving the R$k$MIPS problem. Experimental results on five real-world datasets show that SAH runs 4$\sim$…$\times$ faster than the state-of-the-art methods for R$k$MIPS while achieving F1-scores of over 90%. The code is available at https://github.com/HuangQiang/SAH.

Comment: Accepted by AAAI 202
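As a reading aid for the problem definition, the following brute-force sketch checks, for every user vector, whether the query's inner product ranks among the $k$ largest when compared against all item vectors. The function names and the tie-handling rule (ties count in favor of the query) are assumptions; this is the quadratic baseline that a subquadratic method such as SAH aims to beat, not the SAH algorithm itself.

```python
def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reverse_k_mips(query, items, users, k):
    """Return indices of users for whom the query is among the k largest inner products."""
    result = []
    for idx, u in enumerate(users):
        score = inner(u, query)
        # Count the items that beat the query for this user.
        better = sum(1 for p in items if inner(u, p) > score)
        if better < k:
            result.append(idx)
    return result
```

The cost is one inner product per (user, item) pair, which is the quadratic behavior that the abstract's "subquadratic-time" claim is measured against.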

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Ranked Reverse Nearest Neighbor Search

Author: LEE Ken C. K.
LEE Wang-Chien
ZHENG Baihua
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2008

Crossref

Institutional Knowledge at Singapore Management University

Reverse Nearest Neighbors Search in High Dimensions using Locality-Sensitive Hashing

Author: Arthur David
Oudot Steve
Publication venue: HAL CCSD
Publication date: 01/01/2010

We investigate the problem of finding reverse nearest neighbors efficiently. Although provably good solutions exist for this problem in low or fixed dimensions, to this date the methods proposed in high dimensions are mostly heuristic. We introduce a method that is both provably correct and efficient in all dimensions, based on a reduction of the problem to one instance of $\varepsilon$-nearest neighbor search plus a controlled number of instances of exhaustive $r$-PLEB, a variant of Point Location among Equal Balls where all the $r$-balls centered at the data points that contain the query point are sought, not just one. The former problem has been extensively studied and elegantly solved in high dimensions using Locality-Sensitive Hashing (LSH) techniques. By contrast, the latter problem has a complexity that is still not fully understood. We revisit the analysis of the LSH scheme for exhaustive $r$-PLEB using a somewhat refined notion of locality-sensitive family of hash functions, which brings out a meaningful output-sensitive term in the complexity of the problem. Our analysis, combined with a non-isometric lifting of the data, enables us to answer exhaustive $r$-PLEB queries (and down the road reverse nearest neighbor queries) efficiently. Along the way, we obtain a simple algorithm for answering exact nearest neighbor queries, whose complexity is parametrized by some condition number measuring the inherent difficulty of a given instance of the problem.

We study the problem of efficiently finding reverse nearest neighbors in high dimensions. Given a point cloud $P$ and a parameter $\varepsilon$, our goal is to preprocess $P$ so that we can quickly find the set of reverse nearest neighbors of an arbitrary query point $q$, plus possibly a small number of false positives that are close to being reverse nearest neighbors of $q$. While efficient and provably correct solutions exist for this problem in low or fixed dimensions, to date the methods proposed in high dimensions are essentially heuristic. We propose a method that is both efficient and provably correct in all dimensions, based on a reduction of the problem to a small number of instances of the classical problems of approximate nearest neighbor search and exhaustive search for neighbors within a fixed distance $r$. The intrinsic complexity of the latter problem remains poorly understood. We propose a new analysis of the behavior of certain locality-sensitive hashing (LSH) techniques on this problem, which brings out a bound depending on the output size and which, combined with a non-isometric lifting of the points to a higher dimension, makes it possible to solve the reverse nearest neighbor search problem efficiently via the aforementioned reduction. Along the way we also propose a method for exact nearest neighbor search, whose complexity is parametrized by a condition number measuring the intrinsic difficulty of a particular instance of the problem.

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

R-Forest for Approximate Nearest Neighbor Queries in High Dimensional Space

Author: Nolen Michael Charles
Publication venue: University of Memphis Digital Commons
Publication date: 24/07/2014

Searching high-dimensional space has been a challenge and an area of intense research for many years. The curse of dimensionality has rendered most existing index methods all but useless, causing people to research other techniques. In my dissertation I try to resurrect one of the best-known index structures, the R-Tree, which most have given up on as a viable method of answering high-dimensional queries. I point out the various advantages of the R-Tree as a method for answering approximate nearest neighbor queries, and the advantages of locality sensitive hashing and the locality sensitive B-Tree, which are the most successful methods today. I started by looking at improving the maintenance of the R-Tree through bulk loading and insertion. I proposed and implemented a new method that bulk loads the index, which is an improvement over the standard method. I then turned my attention to nearest neighbor queries, which are a much more challenging problem, especially in high-dimensional space. Initially I developed a set of heuristics, easily implemented in the R-Tree, which improved the efficiency of high-dimensional approximate nearest neighbor queries. To further refine my method I took another approach, developing a new model, known as R-Forest, which takes advantage of space partitioning while still using the R-Tree as its index structure. With this new approach I was able to implement new heuristics and can show that R-Forest, composed of a set of R-Trees, is a viable solution to high-dimensional approximate nearest neighbor queries when compared to established methods.

University of Memphis Digital Commons