6 research outputs found
Efficient Generation and Execution of DAG-Structured Query Graphs
Traditional database management systems use tree-structured query evaluation plans. While easy to implement, a tree-structured query evaluation plan is not expressive enough for some optimizations like factoring common algebraic subexpressions or magic sets. These require directed acyclic graphs (DAGs), i.e., shared subplans. This work covers the different aspects of DAG-structured query graphs. First, it introduces a novel framework to reason about sharing of subplans and thus DAG-structured query evaluation plans. Second, it describes the first plan generator capable of generating optimal DAG-structured query evaluation plans. Third, an efficient framework for reasoning about orderings and groupings used by the plan generator is presented. And fourth, a runtime system capable of executing DAG-structured query evaluation plans with minimal overhead is discussed. The experimental results show that with no or only a modest increase of plan generation time, a major reduction of query execution time can be achieved for common queries. This shows that DAG-structured query evaluation plans are serviceable and should be preferred over tree-structured query plans.
Pay One, Get Hundreds for Free: Reducing Cloud Costs through Shared Query Execution
Cloud-based data analysis is nowadays common practice because of the lower
system management overhead as well as the pay-as-you-go pricing model. The
pricing model, however, is not always suitable for query processing as heavy
use results in high costs. For example, in query-as-a-service systems, where
users are charged per processed byte, collections of queries accessing the same
data frequently can become expensive. The problem is compounded by the limited
options for the user to optimize query execution when using declarative
interfaces such as SQL. In this paper, we show how, without modifying existing
systems and without the involvement of the cloud provider, it is possible to
significantly reduce the overhead, and hence the cost, of query-as-a-service
systems. Our approach is based on query rewriting so that multiple concurrent
queries are combined into a single query. Our experiments show that the aggregated
amount of work done by the shared execution is smaller than in a
query-at-a-time approach. Since queries are charged per byte processed, the
cost of executing a group of queries is often the same as executing a single
one of them. As an example, we demonstrate how the shared execution of the
TPC-H benchmark is up to 100x and 16x cheaper in Amazon Athena and Google
BigQuery, respectively, than a query-at-a-time approach, while achieving higher
throughput.
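
The billing argument in this abstract can be illustrated with a small back-of-the-envelope model. The sketch below is not the paper's rewriting technique; it only assumes that a query is charged for the bytes of the columns it reads, and that a combined query reads each needed column once. The column names, byte sizes, and helper functions are hypothetical.

```python
from typing import Dict, List, Set


def bytes_scanned(columns: Set[str], column_bytes: Dict[str, int]) -> int:
    """Bytes billed for one query that reads the given columns (assumed model)."""
    return sum(column_bytes[c] for c in columns)


def query_at_a_time_bytes(queries: List[Set[str]], column_bytes: Dict[str, int]) -> int:
    """Total bytes billed when every query is submitted on its own."""
    return sum(bytes_scanned(q, column_bytes) for q in queries)


def shared_bytes(queries: List[Set[str]], column_bytes: Dict[str, int]) -> int:
    """Bytes billed when all queries are merged into one query that
    reads the union of their columns exactly once."""
    merged: Set[str] = set().union(*queries)
    return bytes_scanned(merged, column_bytes)


# Hypothetical column sizes (in bytes) of a single table.
column_bytes = {"l_quantity": 100, "l_extendedprice": 200, "l_shipdate": 50}

# Three hypothetical queries, each described by the columns it reads.
queries = [
    {"l_quantity", "l_shipdate"},
    {"l_extendedprice", "l_shipdate"},
    {"l_quantity", "l_shipdate"},
]

separate = query_at_a_time_bytes(queries, column_bytes)
shared = shared_bytes(queries, column_bytes)
```

Under these assumptions, the more the queries overlap in the data they read, the closer the cost of the combined query gets to the cost of a single query.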
Generating optimal DAG-structured query evaluation plans
In many database queries, relations are accessed multiple times during query
processing. In these cases query processing can be accelerated
by sharing scan operators and possibly other operators
based upon the common relations.
The standard approach to achieve sharing works as follows.
In a first phase, a non-shared tree-shaped plan is generated via a traditional
plan generator.
In a second phase, common instances of a scan are detected and shared.
After that, other possible operators are shared.
The result is an operator DAG (directed acyclic graph).
The limitation of this approach is obvious.
As sharing influences plan costs, separating the optimization into
two phases carries the risk of missing the optimal plan,
since the first optimization phase does not know about sharing.
We remedy this situation by (1) introducing a general framework for reasoning
about sharing and (2) sketching how this framework can be integrated into a
plan generator,
which then constructs optimal DAG-structured query evaluation plans.
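
The second phase of the two-phase approach described above (detecting common operators in an already generated tree and sharing them) can be sketched as follows. This is a minimal illustration of structural deduplication, not the framework proposed in the abstract: it ignores costs entirely, which is exactly the limitation the abstract points out. The `Op` class and both helper functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Op:
    """A plan operator; frozen so that structurally equal subplans compare and hash equal."""
    name: str
    inputs: Tuple["Op", ...] = ()


def share_common_subplans(root: Op) -> Op:
    """Rewrite a tree-shaped plan into a DAG by reusing one object
    for every group of structurally identical subplans."""
    canonical: Dict[Op, Op] = {}

    def visit(op: Op) -> Op:
        # Share the inputs first, then look up the rebuilt operator.
        shared_inputs = tuple(visit(i) for i in op.inputs)
        candidate = Op(op.name, shared_inputs)
        return canonical.setdefault(candidate, candidate)

    return visit(root)


def count_nodes(root: Op) -> int:
    """Count distinct operator objects reachable from the root."""
    seen: Set[int] = set()

    def walk(op: Op) -> None:
        if id(op) in seen:
            return
        seen.add(id(op))
        for i in op.inputs:
            walk(i)

    walk(root)
    return len(seen)


# A tree-shaped plan that scans relation R twice.
tree = Op("join", (Op("select", (Op("scan R"),)), Op("scan R")))
dag = share_common_subplans(tree)
```

In this example the tree contains four operator objects, while the resulting DAG contains three, because both scans of R are represented by the same node.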
Research Report, University of Mannheim, 2008 / 2009
Since its founding, the University of Mannheim has had a distinct research profile
that is clearly reflected in its development and current structure. It is shaped by
nationally and internationally highly regarded economic and social sciences and by
their links with strong humanities, law, mathematics, and computer science.
In the future, the University of Mannheim will continue, on the one hand, to promote
its research focus areas in the economic and social sciences and, on the other hand,
to strive for an interdisciplinary culture in the interplay of all of the university's
subjects.
Single Phase Construction of Optimal DAG-structured
Traditionally, database management systems use tree-structured query evaluation plans. They are easy to implement but not expressive enough for some optimizations like eliminating common algebraic subexpressions or magic sets. These require directed acyclic graphs (DAGs), i.e. shared subplans. Existing approaches consider DAGs merely for special cases and not in full generality. We introduce a novel framework to reason about sharing of subplans and, thus, DAG-structured query evaluation plans. Then, we present the first plan generator capable of generating optimal DAG-structured query evaluation plans. The experimental results show that with no or only a modest increase of plan generation time, a major reduction of quer