204 research outputs found
Location- and keyword-based querying of geo-textual data: a survey
With the broad adoption of mobile devices, notably smartphones, keyword-based search for content has seen increasing use by mobile users, who are often interested in content related to their geographical location. We have also witnessed a proliferation of geo-textual content that encompasses both textual and geographical information. Examples include geo-tagged microblog posts, yellow pages, and web pages related to entities with physical locations. Over the past decade, substantial research has been conducted on integrating location into keyword-based querying of geo-textual content in settings where the underlying data is assumed to be either relatively static or is assumed to stream into a system that maintains a set of continuous queries. This paper offers a survey of both the research problems studied and the solutions proposed in these two settings. As such, it aims to offer the reader a first understanding of key concepts and techniques, and it serves as an “index” for researchers who are interested in exploring the concepts and techniques underlying proposed solutions to the querying of geo-textual data.Agency for Science, Technology and Research (A*STAR)Ministry of Education (MOE)Nanyang Technological UniversityThis research was supported in part by MOE Tier-2 Grant MOE2019-T2-2-181, MOE Tier-1 Grant RG114/19, an NTU ACE Grant, and the Singtel Cognitive and Artificial Intelligence Lab for Enterprises (SCALE@NTU), which is a collaboration between Singapore Telecommunications Limited (Singtel) and Nanyang Technological University (NTU) that is funded by the Singapore Government through the Industry Alignment Fund Industry Collaboration Projects Grant, and by the Innovation Fund Denmark centre, DIREC
Geo-Social Group Queries with Minimum Acquaintance Constraint
The prosperity of location-based social networking services enables
geo-social group queries for group-based activity planning and marketing. This
paper proposes a new family of geo-social group queries with minimum
acquaintance constraint (GSGQs), which are more appealing than existing
geo-social group queries in terms of producing a cohesive group that guarantees
the worst-case acquaintance level. GSGQs, also specified with various spatial
constraints, are more complex than conventional spatial queries; particularly,
those with a strict NN spatial constraint are proved to be NP-hard. For
efficient processing of general GSGQ queries on large location-based social
networks, we devise two social-aware index structures, namely SaR-tree and
SaR*-tree. The latter features a novel clustering technique that considers both
spatial and social factors. Based on SaR-tree and SaR*-tree, efficient
algorithms are developed to process various GSGQs. Extensive experiments on
real-world Gowalla and Dianping datasets show that our proposed methods
substantially outperform the baseline algorithms based on R-tree.Comment: This is the preprint version that is accepted by the Very Large Data
Bases Journa
Multi-Source Spatial Entity Linkage
Besides the traditional cartographic data sources, spatial information can
also be derived from location-based sources. However, even though different
location-based sources refer to the same physical world, each one has only
partial coverage of the spatial entities, describe them with different
attributes, and sometimes provide contradicting information. Hence, we
introduce the spatial entity linkage problem, which finds which pairs of
spatial entities belong to the same physical spatial entity. Our proposed
solution (QuadSky) starts with a time-efficient spatial blocking technique
(QuadFlex), compares pairwise the spatial entities in the same block, ranks the
pairs using Pareto optimality with the SkyRank algorithm, and finally,
classifies the pairs with our novel SkyEx-* family of algorithms that yield
0.85 precision and 0.85 recall for a manually labeled dataset of 1,500 pairs
and 0.87 precision and 0.6 recall for a semi-manually labeled dataset of
777,452 pairs. Moreover, we provide a theoretical guarantee and formalize the
SkyEx-FES algorithm that explores only 27% of the skylines without any loss in
F-measure. Furthermore, our fully unsupervised algorithm SkyEx-D approximates
the optimal result with an F-measure loss of just 0.01. Finally, QuadSky
provides the best trade-off between precision and recall, and the best
F-measure compared to the existing baselines and clustering techniques, and
approximates the results of supervised learning solutions
- …