1,461 research outputs found
Reverse spatial visual top-k query
With the wide application of mobile Internet techniques and location-based services (LBS), massive multimedia data with geo-tags has been generated and collected. In this paper, we investigate a novel type of spatial query problem, named reverse spatial visual top-k query (RSVQk), that aims to retrieve a set of geo-images that have the query as one of the most relevant geo-images in both geographical proximity and visual similarity. Existing approaches for reverse top-k queries are not suitable to address this problem because they cannot effectively process unstructured data, such as images. To this end, we first propose the definition of the RSVQk problem and introduce the similarity measurement. A novel hybrid index, named VR²-Tree, is designed, which is a combination of the visual representation of geo-images and the R-Tree. Besides, an extension of VR²-Tree, called CVR²-Tree, is introduced; we then discuss the calculation of lower/upper bounds and propose an optimization technique via CVR²-Tree for further pruning. In addition, a search algorithm named the RSVQk algorithm is developed to support efficient RSVQk queries. Comprehensive experiments are conducted on four geo-image datasets, and the results illustrate that our approach can address the RSVQk problem effectively and efficiently.
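To make the query semantics in this abstract concrete, here is a minimal brute-force sketch of a reverse top-k query over geo-images. It is not the VR²-Tree or CVR²-Tree method: there is no index and no bound-based pruning. The weighted-sum distance, the `alpha` parameter, and the `loc`/`vec`/`id` field names are assumptions made for illustration only.

```python
import math

def combined_distance(a, b, alpha=0.5):
    """Weighted sum of geographic and visual distance (smaller = more relevant).
    The weighting scheme is an assumption, not the paper's similarity measure."""
    geo = math.dist(a["loc"], b["loc"])
    vis = math.dist(a["vec"], b["vec"])
    return alpha * geo + (1 - alpha) * vis

def reverse_top_k(query, images, k, alpha=0.5):
    """Return ids of images that would rank the query among their k closest items."""
    result = []
    for img in images:
        d_query = combined_distance(img, query, alpha)
        # Count the other images that are strictly closer to img than the query is.
        closer = sum(
            1 for other in images
            if other is not img and combined_distance(img, other, alpha) < d_query
        )
        if closer < k:
            result.append(img["id"])
    return result

# Hypothetical toy data: three geo-images on a line with identical visual vectors.
images = [
    {"id": "a", "loc": (0.0, 0.0), "vec": (0.0, 0.0)},
    {"id": "b", "loc": (1.0, 0.0), "vec": (0.0, 0.0)},
    {"id": "c", "loc": (10.0, 0.0), "vec": (0.0, 0.0)},
]
query = {"id": "q", "loc": (0.5, 0.0), "vec": (0.0, 0.0)}
```

This scan compares every pair of images, so its cost is quadratic in the dataset size. An index with lower/upper bounds, as the abstract describes, exists to avoid that.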
Semantic Flooding: Semantic Search across Distributed Lightweight Ontologies
Lightweight ontologies are trees where links between nodes codify the fact that a node lower in the hierarchy describes a topic (and contains documents about this topic) which is more specific than the topic of the node one level above. In turn, multiple lightweight ontologies can be connected by semantic links which represent mappings among them and which can be computed, e.g., by ontology matching. In this paper we describe how these two types of links can be used to define a semantic overlay network which can cover any number of peers and which can be flooded to perform a semantic search on documents, i.e., to perform semantic flooding. We have evaluated our approach by simulating a network of 10,000 peers containing classifications which are fragments of the DMoz web directory. The results are promising and show that, in our approach, only a relatively small number of peers needs to be queried in order to achieve high accuracy.
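The flooding idea can be sketched as a hop-limited breadth-first traversal over an overlay whose edges are both tree links (to more specific topics) and semantic links (mappings to other peers). This is only an illustrative sketch: the `(peer, topic)` node encoding, the `ttl` hop limit, and the toy overlay are assumptions, not the system evaluated in the paper.

```python
from collections import deque

def semantic_flood(overlay, docs, start, ttl):
    """Flood from `start`, following links for at most `ttl` hops.
    Returns the documents collected and the number of nodes contacted."""
    visited = {start}
    queue = deque([(start, 0)])
    hits = []
    while queue:
        node, hops = queue.popleft()
        hits.extend(docs.get(node, []))
        if hops == ttl:
            continue  # hop budget exhausted: do not forward the query further
        for nxt in overlay.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, hops + 1))
    return hits, len(visited)

# Hypothetical overlay: nodes are (peer, topic) pairs.
overlay = {
    ("p1", "Science"): [("p1", "Physics"), ("p2", "Sciences")],  # tree link, semantic link
    ("p2", "Sciences"): [("p2", "Chemistry")],                   # tree link
    ("p2", "Chemistry"): [("p3", "Chem")],                       # semantic link
}
docs = {
    ("p1", "Physics"): ["d1"],
    ("p2", "Chemistry"): ["d2"],
    ("p3", "Chem"): ["d3"],
}
```

Raising `ttl` reaches more peers and more documents, so the hop limit trades the number of peers queried against coverage.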
Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
Many problems in areas as diverse as recommendation systems, social network
analysis, semantic search, and distributed root cause analysis can be modeled
as pattern search on labeled graphs (also called "heterogeneous information
networks" or HINs). Given a large graph and a query pattern with node and edge
label constraints, a fundamental challenge is to find the top-k matches
according to a ranking function over edge and node weights. For users, it is
difficult to select the value k. We therefore propose the novel notion of an any-k
ranking algorithm: for a given time budget, return as many of the top-ranked
results as possible. Then, given additional time, produce the next lower-ranked
results quickly as well. It can be stopped anytime, but may have to continue
until all results are returned. This paper focuses on acyclic patterns over
arbitrary labeled graphs. We are interested in practical algorithms that
effectively exploit (1) properties of heterogeneous networks, in particular
selective constraints on labels, and (2) that the users often explore only a
fraction of the top-ranked results. Our solution, KARPET, carefully integrates
aggressive pruning that leverages the acyclic nature of the query, and
incremental guided search. It enables us to prove strong non-trivial time and
space guarantees, which is generally considered very hard for this type of
graph search problem. Through experimental studies we show that KARPET achieves
running times in the order of milliseconds for tree patterns on large networks
with millions of nodes and edges. Comment: To appear in WWW 201
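The any-k notion can be illustrated with a small generator that yields matches of a path pattern in order of increasing weight, so the caller may stop at any point. This is not KARPET itself. It is a hedged sketch of the same two ingredients (pruning nodes that cannot complete the pattern, then guided incremental search) for the special case of a simple label path. The graph encoding and the function names `any_k_paths` and `take_within` are assumptions, as is the convention that a lower weight ranks higher.

```python
import heapq
import time

def any_k_paths(labels, edges, pattern):
    """Yield (weight, path) matches of a label path pattern, lightest first."""
    last = len(pattern) - 1
    inf = float("inf")
    # best[i][v]: lightest way to complete the pattern from node v at position i.
    # Nodes that cannot complete the pattern are absent, which prunes them.
    best = [dict() for _ in pattern]
    for v, lab in labels.items():
        if lab == pattern[last]:
            best[last][v] = 0.0
    for i in range(last - 1, -1, -1):
        for v, lab in labels.items():
            if lab != pattern[i]:
                continue
            cost = min((w + best[i + 1][u] for u, w in edges.get(v, [])
                        if u in best[i + 1]), default=inf)
            if cost < inf:
                best[i][v] = cost
    # Best-first search: priority = weight so far + exact best completion.
    heap = [(c, 0.0, (v,)) for v, c in best[0].items()]
    heapq.heapify(heap)
    while heap:
        _, so_far, path = heapq.heappop(heap)
        i = len(path) - 1
        if i == last:
            yield so_far, path
            continue
        for u, w in edges.get(path[-1], []):
            if u in best[i + 1]:
                total = so_far + w
                heapq.heappush(heap, (total + best[i + 1][u], total, path + (u,)))

def take_within(results, budget_seconds):
    """Collect ranked results until the time budget runs out."""
    deadline = time.monotonic() + budget_seconds
    out = []
    for item in results:
        out.append(item)
        if time.monotonic() >= deadline:
            break
    return out

# Hypothetical toy graph with node labels A, B, C and weighted edges.
labels = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
edges = {
    "a1": [("b1", 1.0), ("b2", 5.0)],
    "a2": [("b1", 2.0)],
    "b1": [("c1", 1.0)],
    "b2": [("c1", 1.0)],
}
ranked = list(any_k_paths(labels, edges, ["A", "B", "C"]))
```

Because the generator is lazy, `next(any_k_paths(...))` returns only the top match, while `take_within` keeps consuming results until the budget is spent or the matches run out.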
Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers
We aim to provide table answers to keyword queries against knowledge bases.
For queries referring to multiple entities, like "Washington cities population"
and "Mel Gibson movies", it is better to represent each relevant answer as a
table which aggregates a set of entities or entity-joins within the same table
scheme or pattern. In this paper, we study how to find highly relevant patterns
in a knowledge base for user-given keyword queries to compose table answers. A
knowledge base can be modeled as a directed graph called a knowledge graph, where
nodes represent entities in the knowledge base and edges represent the
relationships among them. Each node/edge is labeled with type and text. A
pattern is an aggregation of subtrees which contain all keywords in the texts
and have the same structure and types on nodes/edges. We propose efficient
algorithms to find patterns that are relevant to the query for a class of
scoring functions. We show the hardness of the problem in theory, and propose
path-based indexes that are affordable in memory. Two query-processing
algorithms are proposed: one is fast in practice for small queries (with small
patterns as answers) by utilizing the indexes; and the other one is better in
theory, with running time linear in the sizes of indexes and answers, which can
handle large queries better. We also conduct an extensive experimental study to
compare our approaches with a naive adaptation of known techniques. Comment: VLDB 201
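The idea of composing table answers can be sketched as grouping matched subtrees by their type signature, so that each distinct pattern becomes one table and each subtree one row. This is a simplified illustration only: it flattens every subtree to an ordered list of `(type, text)` pairs and ignores scoring, indexes, and edge types. The function name `compose_tables` and the toy subtrees are assumptions.

```python
from collections import defaultdict

def compose_tables(subtrees):
    """Group subtrees sharing the same type signature into one table each."""
    tables = defaultdict(list)
    for tree in subtrees:
        pattern = tuple(node_type for node_type, _ in tree)  # column headers
        row = tuple(text for _, text in tree)                # one table row
        tables[pattern].append(row)
    return dict(tables)

# Hypothetical matched subtrees, flattened to (type, text) pairs.
subtrees = [
    [("Person", "Mel Gibson"), ("Movie", "Braveheart")],
    [("Person", "Mel Gibson"), ("Movie", "Mad Max")],
    [("State", "Washington"), ("City", "Seattle")],
]
tables = compose_tables(subtrees)
```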
No-But-Semantic-Match: Computing Semantically Matched XML Keyword Search Results
Users are rarely familiar with the content of a data source they are
querying, and therefore cannot avoid using keywords that do not exist in the
data source. Traditional systems may respond with an empty result, causing
dissatisfaction, while the data source in effect holds semantically related
content. In this paper we study this no-but-semantic-match problem on XML
keyword search and propose a solution which enables us to present the top-k
semantically related results to the user. Our solution involves two steps: (a)
extracting semantically related candidate queries from the original query and
(b) processing candidate queries and retrieving the top-k semantically related
results. Candidate queries are generated by replacement of non-mapped keywords
with candidate keywords obtained from an ontological knowledge base. Candidate
results are scored using their cohesiveness and their similarity to the
original query. Since the number of queries to process can be large, with each
result having to be analyzed, we propose pruning techniques to retrieve the
top-k results efficiently. We develop two query processing algorithms based
on our pruning techniques. Further, we exploit a property of the candidate
queries to propose a technique for processing multiple queries in batch, which
improves the performance substantially. Extensive experiments on two real
datasets verify the effectiveness and efficiency of the proposed approaches. Comment: 24 pages, 21 figures, 6 tables, submitted to The VLDB Journal for
possible publication
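Step (a) of the abstract, generating candidate queries by replacing non-mapped keywords, can be sketched as a Cartesian product over per-keyword replacement options. The sketch below assumes a flat vocabulary set standing in for the XML data source and a plain dictionary standing in for the ontological knowledge base. It leaves out the scoring and pruning steps, and the function name `candidate_queries` is hypothetical.

```python
from itertools import product

def candidate_queries(query, data_vocabulary, ontology):
    """Replace keywords missing from the data with related ontology terms."""
    options = []
    for keyword in query:
        if keyword in data_vocabulary:
            options.append([keyword])  # mapped keyword: keep it unchanged
        else:
            # Keep only related terms that actually occur in the data source.
            related = [t for t in ontology.get(keyword, []) if t in data_vocabulary]
            if not related:
                return []              # no usable replacement for this keyword
            options.append(related)
    return [list(combo) for combo in product(*options)]

# Hypothetical vocabulary and ontology entries.
vocabulary = {"article", "author", "title", "paper"}
ontology = {"publication": ["article", "paper", "book"], "writer": ["author"]}
candidates = candidate_queries(["publication", "writer", "title"], vocabulary, ontology)
```

The number of candidates is the product of the option counts per keyword, which is why the abstract stresses pruning and batch processing.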