Adaptive Processing of Spatial-Keyword Data Over a Distributed Streaming Cluster
The widespread use of GPS-enabled smartphones along with the popularity of
micro-blogging and social networking applications, e.g., Twitter and Facebook,
has resulted in the generation of huge streams of geo-tagged textual data. Many
applications require real-time processing of these streams. For example,
location-based e-coupon and ad-targeting systems enable advertisers to register
millions of ads to millions of users. The number of users is typically very
high and they are continuously moving, and the ads change frequently as well.
Hence sending the right ad to the matching users is very challenging. Existing
streaming systems are either centralized or are not spatial-keyword aware, and
cannot efficiently support the processing of rapidly arriving spatial-keyword
data streams. This paper presents Tornado, a distributed spatial-keyword stream
processing system. Tornado features routing units to fairly distribute the
workload, and furthermore, co-locate the data objects and the corresponding
queries at the same processing units. The routing units use the Augmented-Grid,
a novel structure that is equipped with an efficient search algorithm for
distributing the data objects and queries. Tornado uses evaluators to process
the data objects against the queries. The routing units minimize the redundant
communication by not sending data updates for processing when these updates do
not match any query. By applying dynamically evaluated cost formulae that
continuously represent the processing overhead at each evaluator, Tornado is
adaptive to changes in the workload. Extensive experimental evaluation using
spatio-textual range queries over real Twitter data indicates that Tornado
outperforms the non-spatio-textually aware approaches by up to two orders of
magnitude in terms of the overall system throughput.
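The routing idea in this abstract (co-locate queries and matching objects, and drop objects that match no query) can be sketched roughly as follows. This is a minimal illustration that assumes a plain uniform grid over the unit square and a static cell-to-evaluator assignment; it is not the Augmented-Grid, whose details the abstract does not give, and it omits the adaptive cost formulae. The names `GridRouter`, `register_query`, and `route` are hypothetical.

```python
from collections import defaultdict


class GridRouter:
    """Toy router over the unit square (hypothetical; not Tornado's Augmented-Grid)."""

    def __init__(self, cells_per_side, num_evaluators):
        self.n = cells_per_side
        self.num_evaluators = num_evaluators
        # cell -> union of keywords of all queries overlapping that cell
        self.cell_keywords = defaultdict(set)

    def _cell(self, x, y):
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        cx = min(int(x * self.n), self.n - 1)
        cy = min(int(y * self.n), self.n - 1)
        return cx, cy

    def _evaluator(self, cell):
        # Static assignment (assumption); an adaptive system would rebalance.
        return (cell[0] * self.n + cell[1]) % self.num_evaluators

    def register_query(self, xmin, ymin, xmax, ymax, keywords):
        """Record a range query in each overlapped cell; return the evaluators that need it."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        targets = set()
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cell_keywords[(cx, cy)].update(keywords)
                targets.add(self._evaluator((cx, cy)))
        return targets

    def route(self, x, y, keywords):
        """Return the evaluator id for a data object, or None if no query could match it."""
        cell = self._cell(x, y)
        if self.cell_keywords.get(cell, set()).isdisjoint(keywords):
            return None  # filtered at the router: nothing is sent to an evaluator
        return self._evaluator(cell)


router = GridRouter(cells_per_side=4, num_evaluators=3)
query_targets = router.register_query(0.0, 0.0, 0.3, 0.3, {"pizza"})
matched = router.route(0.1, 0.1, {"pizza", "cheap"})
wrong_keyword = router.route(0.1, 0.1, {"sushi"})
wrong_place = router.route(0.9, 0.9, {"pizza"})
```

Because a query and any object that could match it are mapped through the same cell, they meet at the same evaluator, and objects in cells with no overlapping keyword never leave the router.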
A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters
Keyword-based web queries with local intent retrieve web content that is
relevant to supplied keywords and that represent points of interest that are
near the query location. Two broad categories of such queries exist. The first
encompasses queries that retrieve single spatial web objects that each satisfy
the query arguments. Most proposals belong to this category. The second
category, to which this paper's proposal belongs, encompasses queries that
support exploratory user behavior and retrieve sets of objects that represent
regions of space that may be of interest to the user. Specifically, the paper
proposes a new type of query, namely the top-k spatial textual clusters (k-STC)
query that returns the top-k clusters that (i) are located the closest to a
given query location, (ii) contain the most relevant objects with regard to
given query keywords, and (iii) have an object density that exceeds a given
threshold. To compute this query, we propose a basic algorithm that relies on
on-line density-based clustering and exploits an early stop condition. To
improve the response time, we design an advanced approach that includes three
techniques: (i) an object skipping rule, (ii) spatially gridded posting lists,
and (iii) a fast range query algorithm. An empirical study on real data
demonstrates that the paper's proposals offer scalability and are capable of
excellent performance.
Efficient Spatial Keyword Search in Trajectory Databases
An increasing amount of trajectory data is being annotated with text
descriptions to better capture the semantics associated with locations. The
fusion of spatial locations and text descriptions in trajectories engenders a
new type of top-k queries that take into account both aspects. Each
trajectory in consideration consists of a sequence of geo-spatial locations
associated with text descriptions. Given a user location and a
keyword set, a top-k query returns k trajectories whose text
descriptions cover the keywords and that have the shortest match
distance. To the best of our knowledge, previous research on querying
trajectory databases has focused on trajectory data without any text
description, and no existing work has studied this kind of top-k query on
trajectories. This paper proposes a novel method for efficiently computing
top-k trajectories. The method is developed based on a new hybrid index,
cell-keyword conscious B-tree, denoted by \cellbtree, which enables us to
exploit both text relevance and location proximity to facilitate efficient and
effective query processing. The results of our extensive empirical studies with
an implementation of the proposed algorithms on BerkeleyDB demonstrate that our
proposed methods are capable of achieving excellent performance and good
scalability.
Comment: 12 pages
Geo-Social Group Queries with Minimum Acquaintance Constraint
The prosperity of location-based social networking services enables
geo-social group queries for group-based activity planning and marketing. This
paper proposes a new family of geo-social group queries with minimum
acquaintance constraint (GSGQs), which are more appealing than existing
geo-social group queries in terms of producing a cohesive group that guarantees
the worst-case acquaintance level. GSGQs, also specified with various spatial
constraints, are more complex than conventional spatial queries; particularly,
those with a strict kNN spatial constraint are proved to be NP-hard. For
efficient processing of general GSGQ queries on large location-based social
networks, we devise two social-aware index structures, namely SaR-tree and
SaR*-tree. The latter features a novel clustering technique that considers both
spatial and social factors. Based on SaR-tree and SaR*-tree, efficient
algorithms are developed to process various GSGQs. Extensive experiments on
real-world Gowalla and Dianping datasets show that our proposed methods
substantially outperform the baseline algorithms based on R-tree.
Comment: This is the preprint version that is accepted by the Very Large Data
Bases Journal.
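To make the minimum acquaintance constraint concrete, the sketch below assumes one reading of it: every member of the returned group must know at least `c` other members. Under that assumption, the largest qualifying subset of a candidate set can be found by repeatedly removing users with too few acquaintances. This illustrates the constraint only; it is not the SaR-tree or SaR*-tree processing described in the abstract, and the function name and the `friends` adjacency dictionary are hypothetical.

```python
def max_group_with_min_acquaintance(friends, candidates, c):
    """Largest subset of `candidates` in which each user knows at least `c` others.

    `friends` maps a user to the set of users they know (assumed symmetric).
    """
    group = set(candidates)
    changed = True
    while changed:
        changed = False
        for user in list(group):
            known_in_group = (friends.get(user, set()) & group) - {user}
            if len(known_in_group) < c:
                # Removing a user can push others below the threshold,
                # so keep looping until the group is stable.
                group.discard(user)
                changed = True
    return group


friends = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
group_c2 = max_group_with_min_acquaintance(friends, friends.keys(), 2)
group_c3 = max_group_with_min_acquaintance(friends, friends.keys(), 3)
```

In a full query, the candidate set would first be restricted by the spatial constraint (for example, users inside a range), and the acquaintance check would then be applied to those candidates.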
Authentication of Moving Top-k Spatial Keyword Queries
Location- and keyword-based querying of geo-textual data: a survey
With the broad adoption of mobile devices, notably smartphones, keyword-based search for content has seen increasing use by mobile users, who are often interested in content related to their geographical location. We have also witnessed a proliferation of geo-textual content that encompasses both textual and geographical information. Examples include geo-tagged microblog posts, yellow pages, and web pages related to entities with physical locations. Over the past decade, substantial research has been conducted on integrating location into keyword-based querying of geo-textual content in settings where the underlying data is assumed to be either relatively static or is assumed to stream into a system that maintains a set of continuous queries. This paper offers a survey of both the research problems studied and the solutions proposed in these two settings. As such, it aims to offer the reader a first understanding of key concepts and techniques, and it serves as an “index” for researchers who are interested in exploring the concepts and techniques underlying proposed solutions to the querying of geo-textual data.
Funding: Agency for Science, Technology and Research (A*STAR); Ministry of Education (MOE); Nanyang Technological University.
This research was supported in part by MOE Tier-2 Grant MOE2019-T2-2-181, MOE Tier-1 Grant RG114/19, an NTU ACE Grant, and the Singtel Cognitive and Artificial Intelligence Lab for Enterprises (SCALE@NTU), which is a collaboration between Singapore Telecommunications Limited (Singtel) and Nanyang Technological University (NTU) that is funded by the Singapore Government through the Industry Alignment Fund Industry Collaboration Projects Grant, and by the Innovation Fund Denmark centre, DIREC.
Continuous Spatial Query Processing: A Survey of Safe Region Based Techniques
In the past decade, positioning system-enabled devices such as smartphones have become most prevalent. This functionality brings the increasing popularity of location-based services in business as well as daily applications such as navigation, targeted advertising, and location-based social networking. Continuous spatial queries serve as a building block for location-based services. As an example, an Uber driver may want to be kept aware of the nearest customers or service stations. Continuous spatial queries require updates to the query result as the query or data objects are moving. This poses challenges to the query efficiency, which is crucial to the user experience of a service. A large number of approaches address this efficiency issue using the concept of safe region. A safe region is a region within which arbitrary movement of an object leaves the query result unchanged. Such a region helps reduce the frequency of query result update and hence improves query efficiency. As a result, safe region-based approaches have been popular for processing various types of continuous spatial queries. Safe regions have interesting theoretical properties and are worth in-depth analysis. We provide a comparative study of safe region-based approaches. We describe how safe regions are computed for different types of continuous spatial queries, showing how they improve query efficiency. We compare the different safe region-based approaches and discuss possible further improvements.
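The safe region definition above can be illustrated with the simplest case: a static circular range query and one moving object. The sketch assumes Euclidean distance in the plane and uses a disk around the object's last reported position as its safe region. It is a simplified example, not a technique taken from the survey, and the function names are hypothetical.

```python
import math


def safe_radius(obj, center, radius):
    """Distance the object can move from `obj` before it could cross the query circle."""
    return abs(radius - math.dist(obj, center))


def needs_update(last_reported, current, safe_r):
    """True once the object has reached or left the disk of radius `safe_r`
    around its last reported position (conservative at the boundary)."""
    return math.dist(last_reported, current) >= safe_r


# Object at the origin; range query centred at (3, 4) with radius 10.
r_safe = safe_radius((0.0, 0.0), (3.0, 4.0), 10.0)
small_move = needs_update((0.0, 0.0), (3.0, 0.0), r_safe)
large_move = needs_update((0.0, 0.0), (6.0, 0.0), r_safe)
```

While the object stays strictly inside this disk, its membership in the range query result cannot change, so it does not need to report its position. Leaving the disk only means that the server must recheck the result and compute a new safe region, not that the result has necessarily changed.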