BitPath -- Label Order Constrained Reachability Queries over Large Graphs
In this paper we focus on the following constrained reachability problem over
edge-labeled graphs like RDF -- given source node x, destination node y, and a
sequence of edge labels (a, b, c, d), is there a path between the two nodes
such that the edge labels on the path satisfy the regular expression
"*a.*b.*c.*d.*"? A "*" before "a" allows any other edge label to appear on the
path before edge "a". "a.*" forces at least one edge with label "a". ".*" after
"a" allows zero or more edge labels after "a" and before "b". Our query
processing algorithm uses simple divide-and-conquer and greedy pruning
procedures to limit the search space. However, our graph indexing technique --
based on "compressed bit-vectors" -- allows indexing large graphs which
otherwise would have been infeasible. We have evaluated our approach on graphs
with more than 22 million edges and 6 million nodes -- much larger than the
datasets used in contemporary work on path queries.
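
To make the query semantics above concrete, here is a minimal sketch of a naive baseline, not the BitPath algorithm or its bit-vector index. It runs a breadth-first search over (node, labels matched so far) states. The function name, the (source, label, target) edge-triple format, and the choice to treat edges as directed and to let paths revisit nodes are all assumptions made for illustration.

```python
from collections import deque


def label_order_reachable(edges, source, target, labels):
    """Return True if some directed path from source to target carries
    the given labels in order, with any other labels allowed in between.

    Hypothetical helper for illustration only; edges is an iterable of
    (from_node, edge_label, to_node) triples.
    """
    # Adjacency list: node -> list of (edge_label, neighbor).
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))

    # A search state is (node, number of required labels matched so far).
    start = (source, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        node, matched = queue.popleft()
        if node == target and matched == len(labels):
            return True
        for label, nxt in adj.get(node, []):
            # Advance greedily whenever the edge carries the next required
            # label; matching as early as possible never loses a solution
            # for an ordered-subsequence constraint.
            if matched < len(labels) and label == labels[matched]:
                state = (nxt, matched + 1)
            else:
                state = (nxt, matched)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


# Small made-up graph: x -p-> m -a-> n -q-> o -b-> y and x -b-> z -a-> y.
edges = [
    ("x", "p", "m"), ("m", "a", "n"), ("n", "q", "o"), ("o", "b", "y"),
    ("x", "b", "z"), ("z", "a", "y"),
]
print(label_order_reachable(edges, "x", "y", ["a", "b"]))  # True
print(label_order_reachable(edges, "x", "y", ["a", "a"]))  # False
```

This baseline visits at most (number of nodes) x (number of labels + 1) states per query, with no index and no pruning. An index such as the one the abstract describes would aim to avoid exploring the graph this way at query time.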