1,149 research outputs found

    K-Reach: Who is in Your Small World

    Full text link
    We study the problem of answering k-hop reachability queries in a directed graph, i.e., whether there exists a directed path of length k, from a source query vertex to a target query vertex in the input graph. The problem of k-hop reachability is a general problem of the classic reachability (where k=infinity). Existing indexes for processing classic reachability queries, as well as for processing shortest path queries, are not applicable or not efficient for processing k-hop reachability queries. We propose an index for processing k-hop reachability queries, which is simple in design and efficient to construct. Our experimental results on a wide range of real datasets show that our index is more efficient than the state-of-the-art indexes even for processing classic reachability queries, for which these indexes are primarily designed. We also show that our index is efficient in answering k-hop reachability queries.Comment: VLDB201

    Adding Logical Operators to Tree Pattern Queries on Graph-Structured Data

    Full text link
    As data are increasingly modeled as graphs for expressing complex relationships, the tree pattern query on graph-structured data becomes an important type of queries in real-world applications. Most practical query languages, such as XQuery and SPARQL, support logical expressions using logical-AND/OR/NOT operators to define structural constraints of tree patterns. In this paper, (1) we propose generalized tree pattern queries (GTPQs) over graph-structured data, which fully support propositional logic of structural constraints. (2) We make a thorough study of fundamental problems including satisfiability, containment and minimization, and analyze the computational complexity and the decision procedures of these problems. (3) We propose a compact graph representation of intermediate results and a pruning approach to reduce the size of intermediate results and the number of join operations -- two factors that often impair the efficiency of traditional algorithms for evaluating tree pattern queries. (4) We present an efficient algorithm for evaluating GTPQs using 3-hop as the underlying reachability index. (5) Experiments on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our algorithm, from several times to orders of magnitude faster than state-of-the-art algorithms in terms of evaluation time, even for traditional tree pattern queries with only conjunctive operations.Comment: 16 page

    High-Performance Reachability Query Processing under Index Size Restrictions

    Full text link
    In this paper, we propose a scalable and highly efficient index structure for the reachability problem over graphs. We build on the well-known node interval labeling scheme where the set of vertices reachable from a particular node is compactly encoded as a collection of node identifier ranges. We impose an explicit bound on the size of the index and flexibly assign approximate reachability ranges to nodes of the graph such that the number of index probes to answer a query is minimized. The resulting tunable index structure generates a better range labeling if the space budget is increased, thus providing a direct control over the trade off between index size and the query processing performance. By using a fast recursive querying method in conjunction with our index structure, we show that in practice, reachability queries can be answered in the order of microseconds on an off-the-shelf computer - even for the case of massive-scale real world graphs. Our claims are supported by an extensive set of experimental results using a multitude of benchmark and real-world web-scale graph datasets.Comment: 30 page

    TopCom: Index for Shortest Distance Query in Directed Graph

    Get PDF
    Finding shortest distance between two vertices in a graph is an important problem due to its numerous applications in diverse domains, including geo-spatial databases, social network analysis, and information retrieval. Classical algorithms (such as, Dijkstra) solve this problem in polynomial time, but these algorithms cannot provide real-time response for a large number of bursty queries on a large graph. So, indexing based solutions that pre-process the graph for efficiently answering (exactly or approximately) a large number of distance queries in real-time is becoming increasingly popular. Existing solutions have varying performance in terms of index size, index building time, query time, and accuracy. In this work, we propose T OP C OM , a novel indexing-based solution for exactly answering distance queries. Our experiments with two of the existing state-of-the-art methods (IS-Label and TreeMap) show the superiority of T OP C OM over these two methods considering scalability and query time. Besides, indexing of T OP C OM exploits the DAG (directed acyclic graph) structure in the graph, which makes it significantly faster than the existing methods if the SCCs (strongly connected component) of the input graph are relatively small

    PReaCH: A Fast Lightweight Reachability Index using Pruning and Contraction Hierarchies

    Full text link
    We develop the data structure PReaCH (for Pruned Reachability Contraction Hierarchies) which supports reachability queries in a directed graph, i.e., it supports queries that ask whether two nodes in the graph are connected by a directed path. PReaCH adapts the contraction hierarchy speedup techniques for shortest path queries to the reachability setting. The resulting approach is surprisingly simple and guarantees linear space and near linear preprocessing time. Orthogonally to that, we improve existing pruning techniques for the search by gathering more information from a single DFS-traversal of the graph. PReaCH-indices significantly outperform previous data structures with comparable preprocessing cost. Methods with faster queries need significantly more preprocessing time in particular for the most difficult instances
    • …
    corecore