270 research outputs found

Constructing Optimal Bushy Trees Possibly Containing Cross Products for Order Preserving Joins is in P

Author: Moerkotte Guido
Publication date: 01/01/2003

One of the main features of XQuery, compared to traditional query languages like SQL, is that it preserves the input order unless specified otherwise. As a consequence, order-preserving algebraic operators are needed to capture the semantics of XQuery correctly. One important algebraic operator is the order-preserving join. The order-preserving join is associative but, in contrast to the traditional join operator, not commutative. Since join ordering (i.e. finding the optimal execution plan for a given set of join operators) has been an important topic of query optimization for SQL, it is expected that it will also play a major role in optimizing XQuery. The search space for ordering traditional joins is exponential in size. Although the lack of commutativity reduces the search space for ordering order-preserving joins, we show that it is still exponential. This raises the question of whether the join ordering problem is also NP-hard, as in the traditional setting. We answer this question by introducing the first polynomial algorithm that produces optimal bushy trees possibly containing cross products.
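
Because the order-preserving join is associative but not commutative, every plan is a parenthesization of the fixed input sequence, which suggests an interval dynamic program in the style of matrix-chain multiplication. The Python sketch below illustrates that idea only; it is not the algorithm from the paper. The function name, the cost function (sum of intermediate result sizes) and the independence assumption for selectivities are all assumptions made here.

```python
from functools import lru_cache
from math import prod


def best_order_preserving_plan(cards, sel):
    """Cheapest bushy parenthesization of relations 0..n-1 in fixed order.

    cards[i]    -- cardinality of the i-th relation (hypothetical input)
    sel[(a, b)] -- selectivity of a predicate between relations a < b;
                   pairs without an entry form a cross product
    Cost model (an assumption): sum of all intermediate result sizes.
    """
    n = len(cards)

    def size(i, j):
        # Result size of joining the contiguous interval i..j.
        s = prod(cards[i:j + 1])
        for (a, b), f in sel.items():
            if i <= a and b <= j:
                s *= f
        return s

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == j:
            return 0.0, str(i)  # a base relation costs nothing
        candidates = []
        for k in range(i, j):  # split point keeps the input order intact
            left_cost, left_plan = best(i, k)
            right_cost, right_plan = best(k + 1, j)
            cost = left_cost + right_cost + size(i, j)
            candidates.append((cost, f"({left_plan} {right_plan})"))
        return min(candidates)

    return best(0, n - 1)


cost, plan = best_order_preserving_plan([10, 1024, 8], {(1, 2): 1 / 1024})
print(cost, plan)
```

There are O(n^2) intervals and O(n) split points per interval, so the number of subproblem combinations is polynomial even though the number of parenthesizations is exponential.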

MAnnheim DOCument Server

Small materialized aggregates: a light weight index structure for data warehousing

Author: Moerkotte Guido
Publication date: 01/01/1998

Small Materialized Aggregates (SMAs for short) are considered a highly flexible and versatile alternative to materialized data cubes. The basic idea is to compute many aggregate values for small to medium-sized buckets of tuples. These aggregates are then used to speed up query processing. We present the general idea and an application of SMAs to the TPC-D benchmark. We show that applying SMAs to TPC-D Query 1 results in a speedup of two orders of magnitude. Then, we elaborate on the problem of query processing in the presence of SMAs. Last, we briefly discuss some further tuning possibilities for SMAs.
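
The bucket idea can be sketched in a few lines of Python. This is a minimal illustration, assuming that each bucket stores the minimum and maximum of a filter key plus a sum of the value column; the function names, the dictionary layout and the three-way bucket classification are assumptions made for this sketch, not details taken from the abstract.

```python
def build_smas(rows, bucket_size):
    """Precompute per-bucket aggregates over a list of (key, value) rows."""
    smas = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        keys = [k for k, _ in bucket]
        smas.append({
            "start": start,
            "end": start + len(bucket),
            "min": min(keys),
            "max": max(keys),
            "sum": sum(v for _, v in bucket),
        })
    return smas


def sum_where_key_at_most(rows, smas, bound):
    """Compute sum(value) over rows with key <= bound, using the SMAs."""
    total = 0
    scanned = 0  # number of base tuples that had to be inspected
    for s in smas:
        if s["min"] > bound:
            continue                # no tuple in the bucket qualifies
        if s["max"] <= bound:
            total += s["sum"]       # every tuple qualifies: use the aggregate
        else:
            for k, v in rows[s["start"]:s["end"]]:
                scanned += 1        # mixed bucket: fall back to the tuples
                if k <= bound:
                    total += v
    return total, scanned


rows = [(d, 10 * d) for d in range(1, 13)]
smas = build_smas(rows, bucket_size=4)
print(sum_where_key_at_most(rows, smas, bound=6))  # → (210, 4)
```

In this toy run only one of the three buckets is scanned; the other two are answered from, or skipped by, their precomputed aggregates.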

CiteSeerX

MAnnheim DOCument Server

Compiling Away Set Containment and Intersection Joins

Authors: Helmer Sven; Moerkotte Guido
Publication date: 01/01/2002

We investigate the effect of query rewriting on joins involving set-valued attributes in object-relational database management systems. We show that by unnesting set-valued attributes (that are stored in an internal nested representation) prior to the actual set containment or intersection join, we can improve the performance of query evaluation by an order of magnitude. By giving example query evaluation plans, we show the increased possibilities for the query optimizer.
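
To illustrate what such a rewrite can look like, the Python sketch below compares a nested-loop set containment join with an unnested variant that joins on individual elements and then counts matches per pair. The function names and the in-memory dictionaries are assumptions for this example; the abstract does not specify the actual evaluation plans.

```python
from collections import Counter, defaultdict


def containment_join_nested(R, S):
    """Baseline: compare every pair of sets directly (r.set is a subset of s.set)."""
    return {(r, s) for r, rs in R.items() for s, ss in S.items() if rs <= ss}


def containment_join_unnested(R, S):
    """Unnest both sides into (id, element) pairs, join on element, count."""
    # Unnest S: element -> ids of the S tuples whose set contains it.
    s_by_elem = defaultdict(list)
    for s, ss in S.items():
        for e in ss:
            s_by_elem[e].append(s)

    # Equi-join the unnested R on element and count matches per (r, s) pair.
    matches = Counter()
    for r, rs in R.items():
        for e in rs:
            for s in s_by_elem.get(e, ()):
                matches[(r, s)] += 1

    # r.set is contained in s.set iff every element of r.set found a partner.
    result = {(r, s) for (r, s), c in matches.items() if c == len(R[r])}
    # Empty sets produce no unnested rows but are subsets of every set.
    result |= {(r, s) for r, rs in R.items() if not rs for s in S}
    return result


R = {"r1": {1, 2}, "r2": {3}, "r3": set()}
S = {"s1": {1, 2, 3}, "s2": {2, 3}}
print(sorted(containment_join_unnested(R, S)))
```

The unnested form turns the set predicate into an equi-join followed by grouping, which are operators an optimizer already knows how to implement and reorder. The empty-set case needs separate handling because unnesting drops it.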

MAnnheim DOCument Server

YAXQL: A powerful and web-aware query language supporting query reuse and data integration

Authors: Fiebig Thorsten; Moerkotte Guido
Publication date: 01/01/2000

Since XML seems to be the next great wave on the web, several query languages for XML have been proposed. Unfortunately, none of these proposals comes even close to meeting the requirements for such a query language. We review the requirements for a query language for XML and propose a new query language, YAXQL, which meets them.

CiteSeerX

MAnnheim DOCument Server

Constructing Optimal Bushy Processing Trees for Join Queries is NP-hard

Authors: Moerkotte Guido; Scheufele Wolfgang
Publication date: 01/01/1996

We show that constructing optimal bushy processing trees for join queries is NP-hard. More specifically, we show that even the construction of optimal bushy trees for computing the cross product for a set of relations is NP-hard.

MAnnheim DOCument Server

Dynamic Programming: The Next Step

Authors: Eich Marius; Moerkotte Guido
Publication date: 01/01/2014

Since 2013, dynamic programming (DP)-based plan generators have been capable of correctly reordering not only inner joins, but also outer joins. Now, we consider the next big step: reordering not only joins, but joins together with grouping. Since only reorderings of grouping with inner joins are known, we first develop equivalences which allow reordering of grouping with outer joins. Then, we show how to extend a state-of-the-art DP-based plan generator to fully explore these new plan alternatives.

MAnnheim DOCument Server

A Study of Four Index Structures for Set-Valued Attributes of Low Cardinality

Authors: Helmer Sven; Moerkotte Guido
Publication date: 01/01/1999

We review and study the performance of four different index structures for indexing set-valued attributes designed to speed up set equality, subset and superset queries. All index structures are based on traditional techniques, namely signatures and inverted files. More specifically, we consider sequential signature files, signature trees, extendible signature hashing, and a B-tree based implementation of inverted lists. The latter is refined by a compression scheme in order to keep space requirements within acceptable bounds. The performance study is based on real implementations subjected to a benchmark accounting for different set sizes, domain sizes, and data distributions (uniform and skewed).
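
As a rough illustration of the simplest of these structures, the Python sketch below implements a sequential signature file using superimposed coding. The toy hash function, the 16-bit signature width and all function names are assumptions chosen for this example rather than details from the study.

```python
def signature(elements, bits=16):
    """Superimpose one bit per element into a fixed-width bit mask."""
    sig = 0
    for e in elements:
        sig |= 1 << ((e * 7 + 3) % bits)  # toy hash, an assumption
    return sig


def build_signature_file(sets):
    """Sequential signature file: one (signature, id) entry per stored set."""
    return [(signature(s), sid) for sid, s in sets.items()]


def superset_query(sets, sig_file, query):
    """Return ids of stored sets that contain every element of `query`."""
    q_sig = signature(query)
    hits, false_drops = [], 0
    for sig, sid in sig_file:
        if (sig & q_sig) != q_sig:
            continue                 # a query bit is missing: cannot match
        if query <= sets[sid]:       # verify against the real set
            hits.append(sid)
        else:
            false_drops += 1         # signature matched, set did not
    return hits, false_drops


sets = {1: {1, 2, 3}, 2: {2, 3}, 3: {17, 5}, 4: {4}}
sig_file = build_signature_file(sets)
print(superset_query(sets, sig_file, {1}))  # → ([1], 1)
```

The signature test is only a necessary condition. In this run, elements 1 and 17 hash to the same bit, so set 3 passes the filter and is rejected during verification as a false drop.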

MAnnheim DOCument Server

An Efficient Framework for Order Optimization

Authors: Moerkotte Guido; Neumann Thomas
Publication date: 01/01/2003

Since the introduction of cost-based query optimization, the performance-critical role of interesting orders has been recognized. Some algebraic operators change interesting orders (e.g. sort and select), while others exploit interesting orders (e.g. merge join). The two operations performed by any query optimizer during plan generation are 1) computing the resulting order given an input order and an algebraic operator and 2) determining the compatibility between a given input order and the required order a given algebraic operator can beneficially exploit. Since these two operations are called millions of times during plan generation, they are highly performance-critical. The third crucial parameter is the space requirement for annotating every plan node with its output order. Lately, a powerful framework for reasoning about orders has been developed, which is based on functional dependencies. Within this framework, the current state-of-the-art algorithms for implementing the above operations both have a lower bound time requirement of Omega(n), where n is the number of functional dependencies involved. Further, the lower bound for the space requirement for every plan node is Omega(n). We improve these bounds by new algorithms with upper time bounds O(1). That is, our algorithms for both operations work in constant time during plan generation, after a one-time preparation step. Further, the upper bound for the space requirement for plan nodes is O(1) for our approach. Besides, our algorithm reduces the search space by detecting and ignoring irrelevant orderings. Experimental results with a full-fledged query optimizer show that our approach significantly reduces the total time needed for plan generation. As a corollary of our experiments, it follows that the time spent for order processing is a non-negligible part of plan generation.

MAnnheim DOCument Server

Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Predicates

Authors: Helmer Sven; Moerkotte Guido
Publication date: 01/01/1996

Current data models like the NF2 model and object-oriented models support set-valued attributes. Hence, it becomes possible to have join predicates based on set comparison. This paper introduces and evaluates several main memory algorithms to efficiently evaluate this kind of join. More specifically, we concentrate on the set equality and the subset predicates.

MAnnheim DOCument Server

Dynamic programming strikes back

Authors: Guido Moerkotte; Thomas Neumann
Publication date: 01/01/2008

Two highly efficient algorithms are known for optimally ordering joins while avoiding cross products: DPccp, which is based on dynamic programming, and Top-Down Partition Search, based on memoization. Both have two severe limitations: They handle only (1) simple (binary) join predicates and (2) inner joins. However, real queries may contain complex join predicates, involving more than two relations, and outer joins as well as other non-inner joins. Taking the most efficient known join-ordering algorithm, DPccp, as a starting point, we first develop a new algorithm, DPhyp, which is capable of handling complex join predicates efficiently. We do so by modeling the query graph as a (variant of a) hypergraph and then reasoning about its connected subgraphs. Then, we present a technique to exploit this capability to efficiently handle the widest class of non-inner joins dealt with so far. Our experimental results show that this reformulation of non-inner joins as complex predicates can improve optimization time by orders of magnitude, compared to known algorithms dealing with complex join predicates and non-inner joins. Once again, this gives dynamic programming a distinct advantage over current memoization techniques.

CiteSeerX

Crossref

MAnnheim DOCument Server

MPG.PuRe