421 research outputs found
Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins
We study the problem of optimizing subgraph queries using the new worst-case
optimal join plans. Worst-case optimal plans evaluate queries by matching one
query vertex at a time using multiway intersections. The core problem in
optimizing worst-case optimal plans is to pick an ordering of the query
vertices to match. We design a cost-based optimizer that (i) picks efficient
query vertex orderings for worst-case optimal plans; and (ii) generates hybrid
plans that mix traditional binary joins with worst-case optimal style multiway
intersections. Our cost metric combines the cost of binary joins with a new
cost metric called intersection-cost. The plan space of our optimizer contains
plans that are not in the plan spaces based on tree decompositions from prior
work. In addition to our optimizer, we describe an adaptive technique that
changes the orderings of the worst-case optimal sub-plans during query
execution. We demonstrate the effectiveness of the plans our optimizer picks
and of our adaptive technique through extensive experiments. Our optimizer is
integrated into the Graphflow DBMS.
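The idea of matching one query vertex at a time with intersections can be illustrated on the directed triangle query a->b, b->c, a->c. The following is a minimal sketch, not the Graphflow optimizer: the function names, the fixed vertex ordering (a, b, c), and the in-memory adjacency sets are all assumptions made for illustration.

```python
def build_forward_adjacency(edges):
    """Map each vertex to the set of its out-neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
    return adj


def match_triangles(edges):
    """Enumerate matches of a->b, b->c, a->c using the ordering a, b, c.

    Hypothetical helper: instead of joining two edge tables and then
    filtering, the last query vertex c is matched by intersecting the
    out-neighbours of a with the out-neighbours of b.
    """
    adj = build_forward_adjacency(edges)
    empty = set()
    matches = []
    for a in sorted(adj):                 # match query vertex a
        for b in sorted(adj[a]):          # match b by extending along a->b
            # match c: it must be an out-neighbour of both a and b
            candidates = adj[a] & adj.get(b, empty)
            for c in sorted(candidates):
                matches.append((a, b, c))
    return matches


if __name__ == "__main__":
    print(match_triangles([(1, 2), (2, 3), (1, 3), (3, 4)]))  # [(1, 2, 3)]
```

A different vertex ordering would produce the same matches but could touch very different numbers of adjacency sets, which is the choice a cost-based optimizer has to make.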
Enumerating Subgraph Instances Using Map-Reduce
The theme of this paper is how to find all instances of a given "sample"
graph in a larger "data graph," using a single round of map-reduce. For the
simplest sample graph, the triangle, we improve upon the best known such
algorithm. We then examine the general case, considering both the communication
cost between mappers and reducers and the total computation cost at the
reducers. To minimize communication cost, we exploit the techniques of (Afrati
and Ullman, TKDE 2011) for computing multiway joins (evaluating conjunctive
queries) in a single map-reduce round. Several methods are shown for
translating sample graphs into a union of conjunctive queries with as few
queries as possible. We also address the matter of optimizing computation cost.
Many serial algorithms are shown to be "convertible," in the sense that it is
possible to partition the data graph, explore each partition in a separate
reducer, and have the total computation cost at the reducers be of the same
order as the computation cost of the serial algorithm.
Comment: 37 page
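A single-round partitioning scheme for triangles can be simulated in a few lines. This is a simplified illustration under stated assumptions, not the algorithm from the paper: nodes are integers hashed with `node % b`, the graph is undirected without self-loops, and the map, shuffle, and reduce phases run in one process. All function names are hypothetical.

```python
from collections import defaultdict


def map_edge(u, v, b):
    """Return the reducer keys that must receive edge (u, v).

    A reducer is identified by a sorted triple of bucket ids. The edge goes
    to every triple that contains the buckets of both endpoints.
    """
    hu, hv = u % b, v % b
    return {tuple(sorted((hu, hv, t))) for t in range(b)}


def reduce_triangles(key, edges, b):
    """Find triangles among the edges of one reducer.

    A triangle is emitted only by the reducer whose key equals the sorted
    buckets of its three nodes, so each triangle has exactly one owner.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    found = set()
    for u, v in edges:
        for w in adj[u] & adj[v]:
            tri = tuple(sorted((u, v, w)))
            if tuple(sorted(x % b for x in tri)) == key:
                found.add(tri)
    return found


def triangles_one_round(edges, b=3):
    """Simulate one map-reduce round with b node buckets."""
    shuffle = defaultdict(list)
    for u, v in edges:                      # map phase
        for key in map_edge(u, v, b):
            shuffle[key].append((u, v))
    result = set()
    for key, part in shuffle.items():       # reduce phase
        result |= reduce_triangles(key, part, b)
    return result
```

Each edge is replicated to at most b reducers, so the bucket count b trades communication cost against the amount of work per reducer.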
Instance and Output Optimal Parallel Algorithms for Acyclic Joins
Massively parallel join algorithms have received much attention in recent
years, but most prior work has focused on worst-case optimal algorithms. However,
the worst-case optimality of these join algorithms relies on hard instances
having very large output sizes, which rarely appear in practice. A stronger
notion of optimality is output-optimality, which requires an algorithm to be
optimal within the class of all instances sharing the same input and output
size. An even stronger notion is instance-optimality, i.e., the
algorithm is optimal on every single instance, but this may not always be
achievable.
In the traditional RAM model of computation, the classical Yannakakis
algorithm is instance-optimal on any acyclic join. But in the massively
parallel computation (MPC) model, the situation becomes much more complicated.
We first show that for the class of r-hierarchical joins, instance-optimality
can still be achieved in the MPC model. Then, we give a new MPC algorithm for
an arbitrary acyclic join with load
$O\left(\frac{\mathrm{IN}}{p} + \frac{\sqrt{\mathrm{IN} \cdot \mathrm{OUT}}}{p}\right)$,
where $\mathrm{IN}$ and $\mathrm{OUT}$ are the input and output sizes of the join, and
$p$ is the number of servers in the MPC model. This improves the MPC version of
the Yannakakis algorithm by an $O\left(\sqrt{\mathrm{OUT}/\mathrm{IN}}\right)$ factor.
Furthermore, we show that this is output-optimal when $\mathrm{OUT} = O(p \cdot \mathrm{IN})$,
for every acyclic but non-r-hierarchical join. Finally, we give the first
output-sensitive lower bound for the triangle join in the MPC model, showing
that it is inherently more difficult than acyclic joins.
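The stated improvement factor can be checked by comparing the two load bounds directly. This sketch assumes the MPC version of the Yannakakis algorithm has load $O(\mathrm{IN}/p + \mathrm{OUT}/p)$, which the abstract does not state explicitly, and considers the regime $\mathrm{OUT} \ge \mathrm{IN}$, where the output-dependent term dominates both bounds.

```latex
% Ratio of the output-dependent terms (assumed old bound over new bound):
\[
\frac{\mathrm{OUT}/p}{\sqrt{\mathrm{IN}\cdot\mathrm{OUT}}/p}
  = \frac{\mathrm{OUT}}{\sqrt{\mathrm{IN}\cdot\mathrm{OUT}}}
  = \sqrt{\frac{\mathrm{OUT}}{\mathrm{IN}}}
\]
% At the boundary of the output-optimal regime, OUT = p * IN,
% the new output-dependent term becomes:
\[
\frac{\sqrt{\mathrm{IN}\cdot p\,\mathrm{IN}}}{p} = \frac{\mathrm{IN}}{\sqrt{p}}
\]
```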
GraphMatch: Subgraph Query Processing on FPGAs
Efficiently finding subgraph embeddings in large graphs is crucial for many
application areas like biology and social network analysis. Set intersections
are the predominant and most challenging aspect of current join-based subgraph
query processing systems for CPUs. Previous work has shown the viability of
utilizing FPGAs for acceleration of graph and join processing.
In this work, we propose GraphMatch, the first general-purpose stand-alone
subgraph query processing accelerator based on worst-case optimal joins (WCOJ)
that is fully designed for modern, field programmable gate array (FPGA)
hardware. For efficient processing of various graph data sets and query graph
patterns, it leverages a novel set intersection approach, called AllCompare,
tailor-made for FPGAs. We show that this set intersection approach efficiently
solves multi-set intersections in subgraph query processing, superior to
CPU-based approaches. Overall, GraphMatch achieves a speedup of over 2.68x and
5.16x, compared to the state-of-the-art systems GraphFlow and RapidMatch,
respectively.
Using materialized views for answering graph pattern queries
Discovering patterns in graphs by evaluating graph pattern queries involving direct (edge-to-edge mapping) and reachability (edge-to-path mapping) relationships under homomorphisms on data graphs has been extensively studied. Previous studies have aimed to reduce the evaluation time of graph pattern queries due to the potentially numerous matches on large data graphs.
In this work, the concept of the summary graph is developed to improve the evaluation of tree pattern queries and graph pattern queries. The summary graph first filters out candidate matches which violate certain reachability constraints, and then finds local matches of query edges. This reduces redundancy in the representation of the query results and allows for computation sharing during the generation of these results. Methods using materialized graph pattern views are developed to improve the efficiency of graph pattern query evaluation. A view is materialized as a summary graph, which compactly records all the homomorphisms of the view to the data graph. View usability is characterized in terms of query edge coverage to provide necessary and sufficient conditions for answering queries using views, and algorithms are developed for determining view usability and for summary graph construction.
Experimental evaluation shows that the methods using summary graphs and their related concepts outperform previous state-of-the-art approaches. It also demonstrates that the view materialization method outperforms, by several orders of magnitude, a state-of-the-art approach which does not use materialized views, and substantially improves upon its scalability.
A Near-Optimal Parallel Algorithm for Joining Binary Relations
We present a constant-round algorithm in the massively parallel computation
(MPC) model for evaluating a natural join where every input relation has two
attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/\rho})$, where $m$
is the total size of the input relations, $p$ is the number of machines, $\rho$
is the join's fractional edge covering number, and $\tilde{O}(\cdot)$ hides
a polylogarithmic factor. The load matches a known lower bound up to a
polylogarithmic factor. At the core of the proposed algorithm is a new theorem
(which we name the isolated Cartesian product theorem) that provides
fresh insight into the problem's mathematical structure. Our result implies
that the subgraph enumeration problem, where the goal is to report all
the occurrences of a constant-sized subgraph pattern, can be settled optimally
(up to a polylogarithmic factor) in the MPC model.
Comment: Short versions of this article appeared in PODS'17 and ICDT'20. The
article is under submission to a journal. The red sentences are highlighted
for the journal's reviewer
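A worked instance may make the role of the fractional edge covering number concrete. The sketch below assumes the load bound has the form $\tilde{O}(m/p^{1/\rho})$ and uses the triangle join $R(A,B) \bowtie S(B,C) \bowtie T(A,C)$ as a hypothetical example; the values of $m$ and $p$ are illustrative only.

```latex
% Fractional edge cover of the triangle: weight 1/2 on each of R, S, T.
% Every attribute lies in two relations, so it is covered with weight 1.
\[
\rho = \tfrac12 + \tfrac12 + \tfrac12 = \tfrac32
\qquad\Longrightarrow\qquad
\text{load} = \tilde{O}\!\left(\frac{m}{p^{1/\rho}}\right)
            = \tilde{O}\!\left(\frac{m}{p^{2/3}}\right)
\]
% Illustrative numbers: m = 10^9 tuples, p = 10^3 machines.
\[
\frac{m}{p^{2/3}} = \frac{10^{9}}{(10^{3})^{2/3}} = \frac{10^{9}}{10^{2}} = 10^{7}
\]
```

By comparison, perfectly even partitioning of the input alone would give $m/p = 10^{6}$ tuples per machine, so the exponent $1/\rho$ captures the extra replication the join requires.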
GraphflowDB: Scalable Query Processing on Graph-Structured Relations
Finding patterns over graph-structured datasets is ubiquitous and integral to a wide range of analytical applications, e.g., recommendation and fraud detection. When expressed in the high-level query languages of database management systems (DBMSs), these patterns correspond to many-to-many join computations, which generate very large intermediate relations during query processing and degrade the performance of existing systems.
This thesis argues that modern query processors need to adopt two novel techniques to be efficient on growing many-to-many joins: (i) worst-case optimal join algorithms; and (ii) factorized representations. Traditional query processors generate join plans that use binary joins, which in each iteration take two relations, base or intermediate, and join them to produce a new relation. The theory of worst-case optimal joins has shown that this style of join processing can be provably suboptimal and hence generate unnecessarily large intermediate results. This can be avoided on cyclic join queries if the join is performed in a multi-way fashion, one join attribute at a time. As its first contribution, this thesis proposes the design and implementation of a query processor and optimizer that can generate plans that mix worst-case optimal joins (i.e., attribute-at-a-time joins) and binary joins (i.e., table-at-a-time joins). In contrast to prior approaches with novel join optimizers that require solving hard computational problems, such as computing low-width hypertree decompositions of queries, our join optimizer is cost-based and uses a traditional dynamic programming approach with a new cost metric.
On acyclic queries, or acyclic parts of queries, sometimes the generation of large intermediate results cannot be avoided. Yet, the theory of factorization has shown that often such intermediate results can be highly compressible if they contain multi-valued dependencies between join attributes. Factorization proposes two relation representation schemes, called f- and d-representations, to represent the large intermediate results generated under many-to-many joins in a compressed format. Existing proposals to adopt factorized representations require designing processing on fully materialized general tries and novel operators that operate on entire tries, which are not easy to adopt in existing systems. As a second contribution, we describe the implementation of a novel query processing approach we call factorized vector execution that adopts f-representations. Factorized vector execution extends the traditional vectorized query processors to use multiple blocks of vectors instead of a single block allowing us to factorize intermediate results and delay or even avoid Cartesian products. Importantly, our design ensures that every core operator in the system still performs computations on vectors. As a third contribution, we further describe how to extend our factorized vector execution model with novel operators to adopt d-representations, which extend f-representations with cached and reused sub-relations. Our design here is based on using nested hash tables that can point to sub-relations instead of copying them and on directed acyclic graph-based query plans.
All of our techniques are implemented in the GraphflowDB system, which was developed throughout the years to facilitate the research in this thesis. We demonstrate that GraphflowDB’s query processor can outperform existing approaches and systems by orders of magnitude on both micro-benchmarks and end-to-end benchmarks. The designs proposed in this thesis adopt common-wisdom query processing techniques of pipelining, vector-based execution, and morsel-driven parallelism to ensure easy adoption in existing systems. We believe the design can serve as a blueprint for how to adopt these techniques in existing DBMSs to make them more efficient on workloads with many-to-many joins.
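The compression that factorization offers on many-to-many joins can be illustrated with two-hop paths a->b->c. This is a minimal sketch of the general idea behind f-representations, not GraphflowDB's factorized vector execution: the function names and the dictionary layout are assumptions made for illustration.

```python
from itertools import product


def factorized_two_hop(edges):
    """Represent all paths a->b->c without expanding the Cartesian product.

    For each middle vertex b, the predecessors and successors are stored as
    two separate lists. This is valid because, once b is fixed, the choice
    of a is independent of the choice of c.
    """
    preds, succs = {}, {}
    for u, v in edges:
        succs.setdefault(u, []).append(v)
        preds.setdefault(v, []).append(u)
    return {b: (preds[b], succs[b]) for b in preds if b in succs}


def flatten(frep):
    """Expand the factorized form into flat (a, b, c) tuples on demand."""
    for b, (ps, ss) in frep.items():
        for a, c in product(ps, ss):
            yield (a, b, c)


def path_count(frep):
    """Count paths directly on the factorized form, with no expansion."""
    return sum(len(ps) * len(ss) for ps, ss in frep.values())


def stored_values(frep):
    """Number of vertex ids held in the factorized form (excluding keys)."""
    return sum(len(ps) + len(ss) for ps, ss in frep.values())
```

For a middle vertex with 2 predecessors and 3 successors, the flat result has 6 tuples while the factorized form stores 5 values; the gap grows multiplicatively with vertex degree.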