2 research outputs found

    BitPath -- Label Order Constrained Reachability Queries over Large Graphs

    In this paper we focus on the following constrained reachability problem over edge-labeled graphs like RDF: given source node x, destination node y, and a sequence of edge labels (a, b, c, d), is there a path between the two nodes such that the edge labels on the path satisfy the regular expression "*a.*b.*c.*d.*"? A "*" before "a" allows any other edge label to appear on the path before edge "a". "a.*" forces at least one edge with label "a". ".*" after "a" allows zero or more edge labels after "a" and before "b". Our query processing algorithm uses simple divide-and-conquer and greedy pruning procedures to limit the search space. However, our graph indexing technique, based on compressed bit-vectors, allows indexing large graphs which otherwise would have been infeasible. We have evaluated our approach on graphs with more than 22 million edges and 6 million nodes, much larger than the datasets used in contemporary work on path queries.

    BitPath – Label Order Constrained Reachability Queries over Large Graphs

    Abstract. In this paper we focus on the following constrained reachability problem over edge-labeled graphs like RDF – given source node x, destination node y, and a sequence of edge labels (a, b, c, d), is there a path between the two nodes such that the edge labels on the path satisfy a regular expression “*a.*b.*c.*d.*”. A “*” before “a” allows any other edge label to appear on the path before edge “a”. “a.*” forces at least one edge with label “a”. “.*” after “a” allows zero or more edge labels after “a” and before “b”. Our query processing algorithm uses simple divide-and-conquer and greedy pruning procedures to limit the search space. However, our graph indexing technique – based on compressed bit-vectors – allows indexing large graphs which otherwise would have been infeasible. We have evaluated our approach on graphs with more than 22 million edges and 6 million nodes – much larger compared to the datasets used in the contemporary work on path queries.
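
The query semantics described in the abstract can be illustrated with a small sketch. This is not the BitPath algorithm: it uses neither the compressed bit-vector index nor the divide-and-conquer and greedy pruning procedures the abstract mentions. It is a naive breadth-first search over (node, labels matched so far) states, assuming the graph is given as a list of (source, label, target) triples. The function name `label_order_reachable` and the toy graph are hypothetical and serve only as illustration.

```python
from collections import deque


def label_order_reachable(edges, source, target, labels):
    """Return True if some path from source to target carries the given
    labels in order, with any other labels allowed in between.

    Naive baseline only; assumes edges is an iterable of (u, label, v).
    """
    # Build an adjacency list: node -> [(label, neighbour), ...]
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))

    goal = len(labels)
    start = (source, 0)  # (current node, number of labels matched)
    seen = {start}
    queue = deque([start])

    while queue:
        node, matched = queue.popleft()
        if node == target and matched == goal:
            return True
        for label, nxt in adj.get(node, ()):
            # Advance the match whenever the next required label appears;
            # a state with more labels matched is never worse than one with fewer.
            if matched < goal and label == labels[matched]:
                step = matched + 1
            else:
                step = matched
            state = (nxt, step)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


# Hypothetical toy graph with two routes from "x" to "y".
edges = [
    ("x", "p", "n1"), ("n1", "a", "n2"), ("n2", "q", "n3"), ("n3", "b", "y"),
    ("x", "b", "n4"), ("n4", "a", "y"),
]
print(label_order_reachable(edges, "x", "y", ["a", "b"]))
print(label_order_reachable(edges, "x", "y", ["a", "a"]))
```

The search visits at most (number of nodes) × (number of labels + 1) states, which is workable for small graphs but does not address the scale reported in the abstract; that is the gap the paper's indexing technique is meant to fill.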