7,730 research outputs found
Adding Logical Operators to Tree Pattern Queries on Graph-Structured Data
As data are increasingly modeled as graphs for expressing complex
relationships, the tree pattern query on graph-structured data becomes an
important type of queries in real-world applications. Most practical query
languages, such as XQuery and SPARQL, support logical expressions using
logical-AND/OR/NOT operators to define structural constraints of tree patterns.
In this paper, (1) we propose generalized tree pattern queries (GTPQs) over
graph-structured data, which fully support propositional logic of structural
constraints. (2) We make a thorough study of fundamental problems including
satisfiability, containment and minimization, and analyze the computational
complexity and the decision procedures of these problems. (3) We propose a
compact graph representation of intermediate results and a pruning approach to
reduce the size of intermediate results and the number of join operations --
two factors that often impair the efficiency of traditional algorithms for
evaluating tree pattern queries. (4) We present an efficient algorithm for
evaluating GTPQs using 3-hop as the underlying reachability index. (5)
Experiments on both real-life and synthetic data sets demonstrate the
effectiveness and efficiency of our algorithm, from several times to orders of
magnitude faster than state-of-the-art algorithms in terms of evaluation time,
even for traditional tree pattern queries with only conjunctive operations.Comment: 16 page
Regular Queries on Graph Databases
Graph databases are currently one of the most popular paradigms for storing data. One of the key conceptual differences between graph and relational databases is the focus on navigational queries that ask whether some nodes are connected by paths satisfying certain restrictions. This focus has driven the definition of several different query languages and the subsequent study of their fundamental properties.
We define the graph query language of Regular Queries, which is a natural extension of unions of conjunctive 2-way regular path queries (UC2RPQs) and unions of conjunctive nested 2-way regular path queries (UCN2RPQs). Regular queries allow expressing complex regular patterns between nodes. We formalize regular queries as nonrecursive Datalog programs with transitive closure rules. This language has been previously considered, but its algorithmic properties are not well understood.
Our main contribution is to show elementary tight bounds for the containment problem for regular queries. Specifically, we show that this problem is 2EXPSPACE-complete. For all extensions of regular queries known to date, the containment problem turns out to be non-elementary. Together with the fact that evaluating regular queries is not harder than evaluating UCN2RPQs, our results show that regular queries achieve a good balance between expressiveness and complexity, and constitute a well-behaved class that deserves further investigation
Parallel-Correctness and Containment for Conjunctive Queries with Union and Negation
Single-round multiway join algorithms first reshuffle data over many servers
and then evaluate the query at hand in a parallel and communication-free way. A
key question is whether a given distribution policy for the reshuffle is
adequate for computing a given query, also referred to as parallel-correctness.
This paper extends the study of the complexity of parallel-correctness and its
constituents, parallel-soundness and parallel-completeness, to unions of
conjunctive queries with and without negation. As a by-product it is shown that
the containment problem for conjunctive queries with negation is
coNEXPTIME-complete
Efficient Query Processing for SPARQL Federations with Replicated Fragments
Low reliability and availability of public SPARQL endpoints prevent
real-world applications from exploiting all the potential of these querying
infras-tructures. Fragmenting data on servers can improve data availability but
degrades performance. Replicating fragments can offer new tradeoff between
performance and availability. We propose FEDRA, a framework for querying Linked
Data that takes advantage of client-side data replication, and performs a
source selection algorithm that aims to reduce the number of selected public
SPARQL endpoints, execution time, and intermediate results. FEDRA has been
implemented on the state-of-the-art query engines ANAPSID and FedX, and
empirically evaluated on a variety of real-world datasets
Adding regular expressions to graph reachability and pattern queries
Abstract—It is increasingly common to find graphs in which edges bear different types, indicating a variety of relationships. For such graphs we propose a class of reachability queries and a class of graph patterns, in which an edge is specified with a regular expression of a certain form, expressing the connectivity in a data graph via edges of various types. In addition, we define graph pattern matching based on a revised notion of graph simulation. On graphs in emerging applications such as social networks, we show that these queries are capable of finding more sensible information than their traditional counterparts. Better still, their increased expressive power does not come with extra complexity. Indeed, (1) we investigate their containment and minimization problems, and show that these fundamental problems are in quadratic time for reachability queries and are in cubic time for pattern queries. (2) We develop an algorithm for answering reachability queries, in quadratic time as for their traditional counterpart. (3) We provide two cubic-time algorithms for evaluating graph pattern queries based on extended graph simulation, as opposed to the NP-completeness of graph pattern matching via subgraph isomorphism. (4) The effectiveness, efficiency and scalability of these algorithms are experimentally verified using real-life data and synthetic data. I
- …