Providing Diversity in K-Nearest Neighbor Query Results
Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN)
queries return the K answers in the database closest to Q according to a given
distance metric. In this scenario, a majority of the answers may be very
similar to one another, especially when the data has
clusters. For a variety of applications, such homogeneous result sets may not
add value to the user. In this paper, we consider the problem of providing
diversity in the results of KNN queries, that is, to produce the closest result
set such that each answer is sufficiently different from the rest. We first
propose a user-tunable definition of diversity, and then present an algorithm,
called MOTLEY, for producing a diverse result set as per this definition.
Through a detailed experimental evaluation on real and synthetic data, we show
that MOTLEY can produce diverse result sets by reading only a small fraction of
the tuples in the database. Further, it imposes no additional overhead on the
evaluation of traditional KNN queries, thereby providing a seamless interface
between diversity and distance.
Comment: 20 pages, 11 figures
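The idea of a diverse KNN result can be illustrated with a minimal greedy sketch. This is not the MOTLEY algorithm itself, whose details the abstract does not give; it assumes Euclidean distance and a hypothetical user-tunable threshold `min_div`, and it scans the whole dataset rather than reading only a fraction of it. The function name `diverse_knn` is likewise an assumption made for illustration.

```python
import math

def diverse_knn(points, query, k, min_div):
    """Greedy sketch (an assumption, not MOTLEY): visit points in order of
    increasing distance to the query and keep a point only if it lies at
    least min_div away from every point already kept."""
    ranked = sorted(points, key=lambda p: math.dist(p, query))
    result = []
    for p in ranked:
        # Diversity check against all answers selected so far.
        if all(math.dist(p, r) >= min_div for r in result):
            result.append(p)
            if len(result) == k:
                break
    return result

# Clustered toy data: three points near the query, two near x = 1, one far away.
points = [(0, 0), (0.1, 0), (0.2, 0), (1, 0), (1.1, 0), (3, 0)]
plain = diverse_knn(points, (0, 0), k=3, min_div=0.0)    # ordinary KNN
diverse = diverse_knn(points, (0, 0), k=3, min_div=0.5)  # one answer per cluster
```

With `min_div` set to 0 the sketch degenerates to a traditional KNN query, which mirrors the abstract's point that diversity is a tunable addition rather than a separate query type.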
Geo-Social Group Queries with Minimum Acquaintance Constraint
The prosperity of location-based social networking services enables
geo-social group queries for group-based activity planning and marketing. This
paper proposes a new family of geo-social group queries with minimum
acquaintance constraint (GSGQs), which are more appealing than existing
geo-social group queries in terms of producing a cohesive group that guarantees
the worst-case acquaintance level. GSGQs, also specified with various spatial
constraints, are more complex than conventional spatial queries; particularly,
those with a strict NN spatial constraint are proved to be NP-hard. For
efficient processing of general GSGQs on large location-based social
networks, we devise two social-aware index structures, namely SaR-tree and
SaR*-tree. The latter features a novel clustering technique that considers both
spatial and social factors. Based on SaR-tree and SaR*-tree, efficient
algorithms are developed to process various GSGQs. Extensive experiments on
real-world Gowalla and Dianping datasets show that our proposed methods
substantially outperform the baseline algorithms based on R-tree.
Comment: This is the preprint version that is accepted by the Very Large Data Bases Journal
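A minimum acquaintance constraint can be illustrated with a small sketch. It assumes the constraint means that every group member has at least `min_acq` friends inside the group, and it uses a simple range constraint (a fixed radius around the query point) rather than the NN constraint the abstract reports as NP-hard. It does not use the SaR-tree or SaR*-tree indexes; the function name, parameters, and toy data are all hypothetical.

```python
import math

def range_group_min_acquaintance(locations, friends, query, radius, min_acq):
    """Sketch under stated assumptions: return the largest set of users within
    `radius` of `query` in which every user has at least `min_acq` friends
    who are also in the set. Users that violate the constraint are peeled
    off repeatedly until the set is stable."""
    group = {u for u, loc in locations.items()
             if math.dist(loc, query) <= radius}
    changed = True
    while changed:
        changed = False
        for u in list(group):
            # Count only acquaintances that are still in the candidate group.
            if len(friends.get(u, set()) & group) < min_acq:
                group.remove(u)
                changed = True
    return group

# Toy social network: a, b, c form a triangle; d knows c and e; e is far away.
locations = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1), "e": (5, 5)}
friends = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
group = range_group_min_acquaintance(locations, friends, (0, 0), 2.0, 2)
```

In this toy input, user `d` is inside the range but has only one acquaintance among the candidates, so it is dropped, which leaves the triangle of `a`, `b`, and `c` as the cohesive group.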