DDSL: Efficient Subgraph Listing on Distributed and Dynamic Graphs
Subgraph listing is a fundamental problem in graph theory and has wide
applications in areas like sociology, chemistry, and social networks. Modern
graphs are often large-scale as well as highly dynamic, which challenges
the efficiency of existing subgraph listing algorithms. Recent works have shown
the benefits of partitioning and processing big graphs in a distributed system;
however, few works target subgraph listing on dynamic graphs in a
distributed environment. In this paper, we propose an efficient approach,
called Distributed and Dynamic Subgraph Listing (DDSL), which can incrementally
update the results instead of running from scratch. DDSL follows a general
distributed join framework. In this framework, we use a Neighbor-Preserved
storage for data graphs, which takes bounded extra space and supports dynamic
updating. After that, we propose a comprehensive cost model to estimate the I/O
cost of listing subgraphs. Then based on this cost model, we develop an
algorithm to find the optimal join tree for a given pattern. To handle dynamic
graphs, we propose an efficient left-deep join algorithm to incrementally
update the join results. Extensive experiments are conducted on real-world
datasets. The results show that DDSL outperforms existing methods in dealing
with both static and dynamic graphs in terms of response time.
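
The idea of updating results incrementally instead of recomputing them can be illustrated with a minimal single-machine sketch. It is not the DDSL algorithm: it assumes the pattern is a triangle, keeps the graph in an in-memory adjacency map, and uses a hypothetical helper name, add_edge_incremental. When an edge is inserted, only the triangles that contain the new edge are computed and added.

```python
def add_edge_incremental(adj, triangles, u, v):
    """Insert edge (u, v) and add only the triangles that the new edge creates.

    adj is a dict mapping each vertex to its set of neighbors; triangles is
    the set of matches found so far. Both are updated in place.
    """
    if v in adj.get(u, set()):
        return set()  # edge already present, so nothing changes
    # Every new triangle must contain the new edge, so its third vertex
    # is a common neighbor of u and v in the graph before the insertion.
    common = adj.get(u, set()) & adj.get(v, set())
    new = {tuple(sorted((u, v, w))) for w in common}
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
    triangles |= new
    return new


adj, triangles = {}, set()
for edge in [(1, 2), (2, 3), (1, 3), (3, 4)]:
    add_edge_incremental(adj, triangles, *edge)
delta = add_edge_incremental(adj, triangles, 2, 4)
```

In this sketch the work per update depends on the neighborhoods of the two endpoints rather than on the size of the whole graph, which is the motivation for incremental maintenance.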
Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins
We study the problem of optimizing subgraph queries using the new worst-case
optimal join plans. Worst-case optimal plans evaluate queries by matching one
query vertex at a time using multiway intersections. The core problem in
optimizing worst-case optimal plans is to pick an ordering of the query
vertices to match. We design a cost-based optimizer that (i) picks efficient
query vertex orderings for worst-case optimal plans; and (ii) generates hybrid
plans that mix traditional binary joins with worst-case optimal style multiway
intersections. Our cost metric combines the cost of binary joins with a new
cost metric called intersection-cost. The plan space of our optimizer contains
plans that are not in the plan spaces based on tree decompositions from prior
work. In addition to our optimizer, we describe an adaptive technique that
changes the orderings of the worst-case optimal sub-plans during query
execution. We demonstrate the effectiveness of the plans our optimizer picks
and of our adaptive technique through extensive experiments. Our optimizer is
integrated into the Graphflow DBMS.
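
Matching one query vertex at a time with multiway intersections can be sketched for a triangle query as follows. This is only an illustration under stated assumptions, not the Graphflow implementation: the graph is undirected and held in memory, the query vertex ordering a, b, c is fixed by hand rather than chosen by a cost-based optimizer, and the function names are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map: vertex -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def match_triangles(adj):
    """List triangles (a, b, c) with a < b < c, one query vertex at a time."""
    results = []
    for a in sorted(adj):                              # match query vertex a
        for b in sorted(x for x in adj[a] if x > a):   # extend to b via a's neighbors
            # Intersection step: c must be adjacent to both a and b.
            for c in sorted(x for x in adj[a] & adj[b] if x > b):
                results.append((a, b, c))
    return results


edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
triangles = match_triangles(build_adjacency(edges))
```

A binary join plan would instead materialize all two-edge paths first and then filter them by the closing edge; the intersection prunes candidates for c before any partial match is extended.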