158 research outputs found

Providing Diversity in K-Nearest Neighbor Query Results

Author: Haritsa Jayant R.
Jain Anoop
Sarda Parag
Publication date: 15/10/2003

Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN) queries return the K answers in the database that are closest to Q according to a given distance metric. In this scenario, it is possible that a majority of the answers are very similar to one another, especially when the data has clusters. For a variety of applications, such homogeneous result sets may not add value to the user. In this paper, we consider the problem of providing diversity in the results of KNN queries, that is, to produce the closest result set such that each answer is sufficiently different from the rest. We first propose a user-tunable definition of diversity, and then present an algorithm, called MOTLEY, for producing a diverse result set as per this definition. Through a detailed experimental evaluation on real and synthetic data, we show that MOTLEY can produce diverse result sets by reading only a small fraction of the tuples in the database. Further, it imposes no additional overhead on the evaluation of traditional KNN queries, thereby providing a seamless interface between diversity and distance.

Comment: 20 pages, 11 figures
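The idea of a diverse KNN result set can be illustrated with a minimal sketch. This is not the MOTLEY algorithm from the paper; it is a simple greedy filter, and it assumes Euclidean distance both for closeness to Q and for diversity. The names `diverse_knn` and `min_div` are hypothetical.

```python
import math

def diverse_knn(points, q, k, min_div):
    """Greedily pick up to k points near q that are pairwise at least min_div apart."""
    result = []
    # Visit candidates from nearest to farthest from the query point.
    for p in sorted(points, key=lambda p: math.dist(p, q)):
        # Accept p only if it is sufficiently different from every accepted answer.
        if all(math.dist(p, r) >= min_div for r in result):
            result.append(p)
            if len(result) == k:
                break
    return result

points = [(1.0, 0.0), (1.1, 0.0), (1.2, 0.0), (0.0, 3.0), (5.0, 5.0)]
plain = diverse_knn(points, (0.0, 0.0), 3, 0.0)    # min_div = 0 behaves like ordinary KNN
diverse = diverse_knn(points, (0.0, 0.0), 3, 1.0)  # clustered near-duplicates are skipped
```

With `min_div = 0` the three clustered points are returned; with `min_div = 1.0` only one point from the cluster survives and farther but different points fill the result.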

arXiv.org e-Print Archive

Distance Range Queries in SpatialHadoop

Author: Corral Liria Antonio Leopoldo
García García Francisco
Iribarne Martínez Luis Fernando
Vassilakopoulos Michael
Publication date: 01/01/2016

Efficient processing of Distance Range Queries (DRQs) is of great importance in spatial databases due to their wide range of applications. This type of spatial query is characterized by a distance range over one or two datasets. The most representative and best-known DRQs are the ε Distance Range Query (εDRQ) and the ε Distance Range Join Query (εDRJQ). Given the increasing volume of spatial data, it is difficult to perform a DRQ on a centralized machine efficiently. Moreover, the εDRJQ is an expensive spatial operation, since it can be considered a combination of the εDRQ and the spatial join queries. For this reason, this paper addresses the problem of computing DRQs on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently, and proposes new algorithms in SpatialHadoop to perform efficient parallel DRQs on large-scale spatial datasets. We have evaluated the performance of the proposed algorithms in several situations with big synthetic and real-world datasets. The experiments have demonstrated the efficiency and scalability of our proposal.
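As a rough illustration of the two query types named above, here is a brute-force, single-machine sketch. It is not the SpatialHadoop algorithm from the paper. It assumes that "distance range" means an interval `[eps_lo, eps_hi]` of Euclidean distances (set `eps_lo = 0` for a single threshold); the function names are hypothetical.

```python
import math

def distance_range_query(points, q, eps_lo, eps_hi):
    """One dataset: points whose distance to q falls inside [eps_lo, eps_hi]."""
    return [p for p in points if eps_lo <= math.dist(p, q) <= eps_hi]

def distance_range_join(P, Q, eps_lo, eps_hi):
    """Two datasets: all pairs (p, q) whose distance falls inside [eps_lo, eps_hi]."""
    return [(p, q) for p in P for q in Q
            if eps_lo <= math.dist(p, q) <= eps_hi]

P = [(0, 0), (2, 0)]
Q = [(0, 1), (5, 0)]
near = distance_range_query(Q, (0, 0), 0, 3)
within_3 = distance_range_join(P, Q, 0, 3)
ring_2_3 = distance_range_join(P, Q, 2, 3)
```

The join examines every pair in P × Q, which is the cost that motivates a parallel approach on large datasets.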

Location Selection Query in Google Maps using Voronoi-based Spatial Skyline (VS2) Algorithm

Author: Angraeni Leni
Annisa Annisa
Publication venue: 'Sunan Gunung Djati State Islamic University of Bandung'
Publication date: 17/06/2021

Google Maps is one of the most popular location selection systems, and one of its popular features is nearby search. For example, someone who wants to find the restaurants closest to their location can use the nearby search feature. This feature considers only one specific location when suggesting places. In a real-world situation, there may be a need to consider more than one location when selecting a place. Suppose someone would like to choose a hotel close to a conference hall, a museum, a beach, and a souvenir store. In this situation, the nearby search feature in Google Maps may not be able to suggest a list of hotels that are of interest based on the distance from each of these destinations. In this paper, we have developed a web-based Google Maps search application that uses the Voronoi-based Spatial Skyline (VS2) algorithm and lets users choose several Points of Interest (POIs) from Google Maps as the locations to consider when selecting a desired place. We used the Google Maps API to provide POI information for our web-based application. The experimental results showed that execution time increases as the number of considered locations increases.
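The hotel example can be made concrete with a naive spatial skyline sketch. This is not the VS2 algorithm, which relies on Voronoi diagrams; it is a quadratic brute-force check that assumes the usual spatial dominance definition (a candidate is dominated if another candidate is at least as close to every considered location and strictly closer to one). All names and coordinates are hypothetical.

```python
import math

def spatial_skyline(candidates, query_points):
    """Return candidates not spatially dominated with respect to query_points."""
    def dists(p):
        # Vector of distances from p to each considered location.
        return [math.dist(p, q) for q in query_points]

    def dominates(a, b):
        # a dominates b: no worse in every dimension, strictly better in one.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    vectors = [dists(p) for p in candidates]
    return [p for p, v in zip(candidates, vectors)
            if not any(dominates(w, v) for w in vectors)]

locations = [(0, 0), (4, 0)]               # e.g. conference hall and museum
hotels = [(2, 0), (1, 0), (2, 3), (10, 10)]
skyline = spatial_skyline(hotels, locations)
```

Here `(2, 0)` and `(1, 0)` remain because neither is closer to both locations than the other, while the two remaining hotels are dominated by `(2, 0)`.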

Jurnal Online Informatika

Enhancing SpatialHadoop with Closest Pair Queries

Author: Corral Liria Antonio Leopoldo
García García Francisco
Iribarne Martínez Luis Fernando
Manolopoulos Yannis
Vassilakopoulos Michael
Publication date: 01/01/2016

Given two datasets P and Q, the K Closest Pair Query (KCPQ) finds the K closest pairs of objects from P × Q. It is an operation widely adopted by many spatial and GIS applications. As a combination of the K Nearest Neighbor (KNN) and the spatial join queries, KCPQ is an expensive operation. Given the increasing volume of spatial data, it is difficult to perform a KCPQ on a centralized machine efficiently. For this reason, this paper addresses the problem of computing the KCPQ on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently, and proposes a novel algorithm in SpatialHadoop to perform efficient parallel KCPQ on large-scale spatial datasets. We have evaluated the performance of the algorithm in several situations with big synthetic and real-world datasets. The experiments have demonstrated the efficiency and scalability of our proposal.
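The KCPQ definition above can be sketched as a brute-force, single-machine baseline. This is not the parallel SpatialHadoop algorithm from the paper; it assumes Euclidean distance and simply keeps the K smallest distances over all of P × Q. The function name is hypothetical.

```python
import heapq
import itertools
import math

def k_closest_pairs(P, Q, k):
    """Return the k pairs (p, q) from P x Q with the smallest distance."""
    pairs = itertools.product(P, Q)  # every candidate pair, generated lazily
    return heapq.nsmallest(k, pairs, key=lambda pq: math.dist(pq[0], pq[1]))

P = [(0, 0), (10, 0)]
Q = [(1, 0), (10, 2), (4, 0)]
closest = k_closest_pairs(P, Q, 2)
```

Because every pair is examined, the cost grows with |P| · |Q|, which is why the abstract describes the query as expensive.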

Discovering Attractive Products based on Influence Sets

Author: Arvanitis Anastasios
Deligiannakis Antonios
Publication date: 25/07/2011

Skyline queries have been widely used as a practical tool for multi-criteria decision analysis and for applications involving preference queries. For example, in a typical online retail application, skyline queries can help customers select the most interesting products from a pool of available ones. Recently, reverse skyline queries have been proposed, highlighting the manufacturer's perspective, i.e., how to determine the expected buyers of a given product. In this work we develop novel algorithms for two important classes of queries involving customer preferences. We first propose a novel algorithm, termed RSA, for answering reverse skyline queries. We then introduce a new type of query, namely the k-Most Attractive Candidates (k-MAC) query. Given a set of existing product specifications P, a set of customer preferences C, and a set of new candidate products Q, the k-MAC query returns the set of k candidate products from Q that jointly maximizes the total number of expected buyers, measured as the cardinality of the union of individual reverse skyline sets (i.e., influence sets). Applying existing approaches to solve this problem would require calculating the reverse skyline set for each candidate, which is prohibitively expensive for large data sets. We thus propose a batched algorithm for this problem and compare its performance against a branch-and-bound variant that we devise. Both of these algorithms use variants of our RSA algorithm at their core. Our experimental study using both synthetic and real data sets demonstrates that our proposed algorithms outperform existing or naive solutions to our studied classes of queries.
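The k-MAC objective can be illustrated with a naive sketch of the kind the abstract calls prohibitively expensive. It is not RSA, the batched algorithm, or the branch-and-bound variant. It assumes the common dynamic-dominance definition (attribute-wise absolute differences from the customer's preference) and assumes each candidate's influence set is computed against the existing products P only. All names are hypothetical.

```python
from itertools import combinations

def dyn_dominates(p, q, c):
    """True if product p matches customer c at least as well as q on every
    attribute and strictly better on at least one."""
    dp = [abs(a - b) for a, b in zip(p, c)]
    dq = [abs(a - b) for a, b in zip(q, c)]
    return (all(x <= y for x, y in zip(dp, dq))
            and any(x < y for x, y in zip(dp, dq)))

def influence_set(q, products, customers):
    """Indices of customers for whom no existing product dominates q."""
    return {i for i, c in enumerate(customers)
            if not any(dyn_dominates(p, q, c) for p in products)}

def k_mac(candidates, products, customers, k):
    """Exhaustively pick the k candidates whose influence sets have the largest union."""
    sets = [influence_set(q, products, customers) for q in candidates]
    best = max(combinations(range(len(candidates)), k),
               key=lambda combo: len(set().union(*(sets[i] for i in combo))))
    return [candidates[i] for i in best]

products = [(5, 5)]
customers = [(1, 1), (9, 9), (5, 6)]
candidates = [(2, 2), (8, 8), (5, 5.5)]
chosen = k_mac(candidates, products, customers, 2)
```

In this toy input the chosen pair covers all three customers, whereas any other pair covers only two.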

arXiv.org e-Print Archive

CiteSeerX

Complex preference queries supporting spatial applications for user groups

Author: Kießling W.
Levandoski J. J.
Roocks P.
Sharifzadeh M.
Zheng B.
Publication venue: 'VLDB Endowment'

Querying Spatial Data by Dominators in Neighborhood

Author: Lu Hua
Xie Xike
Yiu Man Lung
Publication venue: 'Elsevier BV'
Publication date: 01/09/2018

VBN

Efficient Large-scale Distance-Based Join Queries in SpatialHadoop

Author: Corral Liria Antonio Leopoldo
García García Francisco
Iribarne Martínez Luis Fernando
Manolopoulos Yannis
Vassilakopoulos Michael
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Efficient processing of Distance-Based Join Queries (DBJQs) in spatial databases is of paramount importance in many application domains. The most representative and best-known DBJQs are the K Closest Pairs Query (KCPQ) and the ε Distance Join Query (εDJQ). These types of join queries are characterized by a number of desired pairs (K) or a distance threshold (ε) between the components of the pairs in the final result, over two spatial datasets. Both are expensive operations, since two spatial datasets are combined with additional constraints. Given the increasing volume of spatial data originating from multiple sources and stored in distributed servers, it is not always efficient to perform DBJQs on a centralized server. For this reason, this paper addresses the problem of computing DBJQs on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports efficient processing of spatial queries in a cloud-based setting. We propose novel algorithms, based on plane-sweep, to perform efficient parallel DBJQs on large-scale spatial datasets in SpatialHadoop. We evaluate the performance of the proposed algorithms in several situations with large real-world as well as synthetic datasets. The experiments demonstrate the efficiency and scalability of our proposed methodologies.
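The plane-sweep idea mentioned above can be sketched for the εDJQ on a single machine. This is a generic textbook-style sweep, not the paper's SpatialHadoop algorithms: both datasets are sorted by x-coordinate, and each point of P is compared only with points of Q whose x-coordinate lies within ε of it. Euclidean distance and the function name are assumptions.

```python
import math

def eps_distance_join(P, Q, eps):
    """Return all pairs (p, q) with dist(p, q) <= eps using a sweep along x."""
    P = sorted(P)  # tuples sort by x first, then y
    Q = sorted(Q)
    result = []
    start = 0  # first index in Q that may still be within eps in x
    for p in P:
        # Points of Q too far to the left of p cannot match p or any later point.
        while start < len(Q) and Q[start][0] < p[0] - eps:
            start += 1
        j = start
        # Scan only the strip of width 2 * eps around p's x-coordinate.
        while j < len(Q) and Q[j][0] <= p[0] + eps:
            if math.dist(p, Q[j]) <= eps:
                result.append((p, Q[j]))
            j += 1
    return result

P = [(0, 0), (3, 0), (6, 1)]
Q = [(1, 0), (3, 2), (7, 1), (20, 0)]
joined = eps_distance_join(P, Q, 2)
```

The sweep prunes pairs by x-distance before computing the full distance, so far-away points such as `(20, 0)` are never compared in full.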

Finding causality and responsibility for probabilistic reverse skyline query non-answers

Author: CHENG Gang
GAO Yunjun
LIU Qing
ZHENG Baihua
ZHOU Linlin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2016

NSF