270 research outputs found

    Constructing Optimal Bushy Trees Possibly Containing Cross Products for Order Preserving Joins is in P

    One of the main features of XQuery, compared to traditional query languages like SQL, is that it preserves the input order unless specified otherwise. As a consequence, order-preserving algebraic operators are needed to capture the semantics of XQuery correctly. One important algebraic operator is the order-preserving join. The order-preserving join is associative but, in contrast to the traditional join operator, not commutative. Since join ordering (i.e. finding the optimal execution plan for a given set of join operators) has been an important topic of query optimization for SQL, it is expected that it will also play a major role in optimizing XQuery. The search space for ordering traditional joins is exponential in size. Although the lack of commutativity reduces the search space for ordering order-preserving joins, we show that it is still exponential. This raises the question of whether the join ordering problem is also NP-hard, as in the traditional setting. We answer this question by introducing the first polynomial algorithm that produces optimal bushy trees possibly containing cross products.
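
Because an order-preserving join is associative but not commutative, a plan is a parenthesization of a fixed input sequence, much like matrix-chain multiplication. The following is a minimal sketch of an interval dynamic program over such a sequence. It is not the algorithm from the paper: the cost function (sum of intermediate result sizes), the independence assumption for selectivities, and all names (`optimal_order_preserving_plan`, `cards`, `sel`) are assumptions made for illustration. A missing predicate is treated as selectivity 1.0, i.e. a cross product.

```python
from functools import lru_cache


def optimal_order_preserving_plan(cards, sel):
    """Return (cost, tree) for the cheapest bushy tree over inputs 0..n-1.

    cards[i] is the (assumed) cardinality of input i; sel[(i, j)] with i < j
    is the (assumed) selectivity of the predicate between inputs i and j.
    The input order is never changed, only the parenthesization.
    """
    n = len(cards)

    def size(i, j):
        # Estimated cardinality of joining inputs i..j (inclusive).
        s = 1.0
        for k in range(i, j + 1):
            s *= cards[k]
        for a in range(i, j + 1):
            for b in range(a + 1, j + 1):
                s *= sel.get((a, b), 1.0)  # 1.0 means cross product
        return s

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == j:
            return 0.0, i  # a base input costs nothing in this model
        out = size(i, j)
        candidates = []
        for k in range(i, j):  # split point keeps left inputs before right
            left_cost, left_tree = best(i, k)
            right_cost, right_tree = best(k + 1, j)
            candidates.append((left_cost + right_cost + out,
                               (left_tree, right_tree)))
        return min(candidates, key=lambda c: c[0])

    return best(0, n - 1)


if __name__ == "__main__":
    print(optimal_order_preserving_plan([8, 16, 4],
                                        {(0, 1): 0.125, (1, 2): 0.5}))
```

The sketch only considers the O(n^2) contiguous intervals, which is what keeps it polynomial under these simplified assumptions.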

    Small materialized aggregates : a light weight index structure for data warehousing

    Small Materialized Aggregates (SMAs for short) are considered a highly flexible and versatile alternative to materialized data cubes. The basic idea is to compute many aggregate values for small to medium-sized buckets of tuples. These aggregates are then used to speed up query processing. We present the general idea and an application of SMAs to the TPC-D benchmark. We show that applying SMAs to TPC-D Query 1 results in a speedup of two orders of magnitude. Then, we elaborate on the problem of query processing in the presence of SMAs. Last, we briefly discuss some further tuning possibilities for SMAs.
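
The bucket idea can be illustrated with a small sketch. It assumes a table of `(key, value)` rows, keeps min, max, count, and sum per bucket, and answers a bounded sum by skipping or pre-answering whole buckets. The names (`BucketSMA`, `build_smas`, `sum_where_key_at_most`) and the choice of aggregates are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BucketSMA:
    lo: int        # minimum key in the bucket
    hi: int        # maximum key in the bucket
    count: int     # number of tuples in the bucket
    total: float   # sum of the value column in the bucket


def build_smas(rows, bucket_size):
    """Compute one small aggregate record per bucket of consecutive rows."""
    smas = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        keys = [k for k, _ in bucket]
        smas.append(BucketSMA(min(keys), max(keys), len(bucket),
                              sum(v for _, v in bucket)))
    return smas


def sum_where_key_at_most(rows, smas, bucket_size, bound):
    """SUM(value) WHERE key <= bound; returns (sum, buckets actually scanned)."""
    total, scanned = 0.0, 0
    for idx, sma in enumerate(smas):
        if sma.lo > bound:
            continue                 # no tuple qualifies: skip the bucket
        if sma.hi <= bound:
            total += sma.total       # every tuple qualifies: use the aggregate
            continue
        scanned += 1                 # mixed bucket: fall back to the tuples
        start = idx * bucket_size
        for k, v in rows[start:start + bucket_size]:
            if k <= bound:
                total += v
    return total, scanned


if __name__ == "__main__":
    rows = [(k, 1.0) for k in range(1, 13)]
    smas = build_smas(rows, 4)
    print(sum_where_key_at_most(rows, smas, 4, 6))
```

When the data is roughly clustered on the filter column, most buckets fall into the first two cases and only a few have to be scanned.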

    Compiling Away Set Containment and Intersection Joins

    We investigate the effect of query rewriting on joins involving set-valued attributes in object-relational database management systems. We show that by unnesting set-valued attributes (that are stored in an internal nested representation) prior to the actual set containment or intersection join we can improve the performance of query evaluation by an order of magnitude. By giving example query evaluation plans, we show the increased possibilities for the query optimizer.
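
One way to picture the unnesting rewrite for a containment join (R.a ⊆ S.b) is: flatten both sides into (id, element) pairs, join on element equality, and keep the pairs whose match count equals the size of the left set. The sketch below follows that idea in memory. It is an illustration under assumed names (`containment_join_unnested`, dictionaries mapping ids to sets), not the evaluation plans from the paper.

```python
from collections import defaultdict


def containment_join_unnested(r, s):
    """Return sorted (r_id, s_id) pairs with r[r_id] a subset of s[s_id].

    r and s map tuple ids to Python sets (the set-valued attribute).
    """
    # Unnest s: element -> ids of the s-tuples containing that element.
    s_by_elem = defaultdict(list)
    for sid, elems in s.items():
        for e in elems:
            s_by_elem[e].append(sid)

    # Equi-join the unnested r with the unnested s and count matches per pair.
    matches = defaultdict(int)
    for rid, elems in r.items():
        for e in elems:
            for sid in s_by_elem.get(e, ()):
                matches[(rid, sid)] += 1

    # A pair qualifies when every element of the r-set found a partner.
    result = [pair for pair, c in matches.items() if c == len(r[pair[0]])]

    # Empty sets produce no unnested rows but are contained in every set.
    for rid, elems in r.items():
        if not elems:
            result.extend((rid, sid) for sid in s)
    return sorted(result)


if __name__ == "__main__":
    r = {1: {"a", "b"}, 2: {"c"}, 3: set()}
    s = {10: {"a", "b", "c"}, 20: {"a"}}
    print(containment_join_unnested(r, s))
```

The empty-set case shows why such a rewrite needs care: unnesting loses tuples whose set is empty, so they have to be handled separately.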

    YAXQL : A powerful and web-aware query language supporting query reuse and data integration

    Since XML seems to be the next great wave on the web, several query languages for XML have been proposed. Unfortunately, none of these proposals comes close to meeting the requirements for such a query language. We review the requirements for a query language for XML and propose a new query language, YAXQL, which meets them.

    Constructing Optimal Bushy Processing Trees for Join Queries is NP-hard

    We show that constructing optimal bushy processing trees for join queries is NP-hard. More specifically, we show that even the construction of optimal bushy trees for computing the cross product for a set of relations is NP-hard.

    Dynamic Programming: The Next Step

    Since 2013, dynamic programming (DP)-based plan generators have been capable of correctly reordering not only inner joins, but also outer joins. Now, we consider the next big step: reordering not only joins with each other, but also joins with grouping. Since only reorderings of grouping with inner joins are known, we first develop equivalences which allow reordering of grouping with outer joins. Then, we show how to extend a state-of-the-art DP-based plan generator to fully explore these new plan alternatives.

    A Study of Four Index Structures for Set-Valued Attributes of Low Cardinality

    We review and study the performance of four different index structures for indexing set-valued attributes designed to speed up set equality, subset and superset queries. All index structures are based on traditional techniques, namely signatures and inverted files. More specifically, we consider sequential signature files, signature trees, extendible signature hashing, and a B-tree based implementation of inverted lists. The latter is refined by a compression scheme in order to keep space requirements within acceptable bounds. The performance study is based on real implementations subjected to a benchmark accounting for different set sizes, domain sizes, and data distributions (uniform and skewed).
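
The signature technique can be sketched with superimposed coding: each element sets a few bits in a fixed-width bit vector, a set's signature is the OR of its elements' bits, and a stored set can only contain the query set if its signature covers the query signature. The sketch below does a sequential scan with that filter followed by an exact check. The width, the number of bits per element, the SHA-256-based hashing, and all function names are assumptions for illustration, not the configurations studied in the paper.

```python
import hashlib


def element_bits(elem, width=64, k=3):
    """Bit mask with (up to) k bits set for one element."""
    mask = 0
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{elem}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:8], "big") % width)
    return mask


def signature(elems, width=64, k=3):
    """Superimposed signature: OR of the element masks."""
    sig = 0
    for e in elems:
        sig |= element_bits(e, width, k)
    return sig


def build_signature_file(sets):
    """Map each id to (signature, set); stands in for a sequential file."""
    return {sid: (signature(elems), elems) for sid, elems in sets.items()}


def find_supersets_of(sig_file, query):
    """Ids of stored sets that contain every element of the query set."""
    qsig = signature(query)
    hits = []
    for sid, (sig, elems) in sig_file.items():
        # Cheap filter first; it can yield false positives, never false
        # negatives, so candidates are verified against the actual set.
        if (sig & qsig) == qsig and query <= elems:
            hits.append(sid)
    return hits


if __name__ == "__main__":
    sig_file = build_signature_file(
        {1: {"a", "b", "c"}, 2: {"a"}, 3: {"b", "c", "d"}})
    print(find_supersets_of(sig_file, {"b", "c"}))
```

The exact check after the bit test is what makes the result correct regardless of how many false positives the chosen signature width produces.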

    An Efficient Framework for Order Optimization

    Since the introduction of cost-based query optimization, the performance-critical role of interesting orders has been recognized. Some algebraic operators change interesting orders (e.g. sort and select), while others exploit interesting orders (e.g. merge join). The two operations performed by any query optimizer during plan generation are 1) computing the resulting order given an input order and an algebraic operator and 2) determining the compatibility between a given input order and the required order a given algebraic operator can beneficially exploit. Since these two operations are called millions of times during plan generation, they are highly performance-critical. The third crucial parameter is the space requirement for annotating every plan node with its output order. Lately, a powerful framework for reasoning about orders has been developed, which is based on functional dependencies. Within this framework, the current state-of-the-art algorithms for implementing the above operations both have a lower bound time requirement of Omega(n), where n is the number of functional dependencies involved. Further, the lower bound for the space requirement for every plan node is Omega(n). We improve these bounds by new algorithms with upper time bounds O(1). That is, our algorithms for both operations work in constant time during plan generation, after a one-time preparation step. Further, the upper bound for the space requirement for plan nodes is O(1) for our approach. Besides, our algorithm reduces the search space by detecting and ignoring irrelevant orderings. Experimental results with a full-fledged query optimizer show that our approach significantly reduces the total time needed for plan generation. As a corollary of our experiments, it follows that the time spent for order processing is a non-negligible part of plan generation.

    Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Predicates

    Current data models like the NF2 model and object-oriented models support set-valued attributes. Hence, it becomes possible to have join predicates based on set comparison. This paper introduces and evaluates several main memory algorithms to evaluate this kind of join efficiently. More specifically, we concentrate on the set equality and the subset predicates.

    Dynamic programming strikes back

    Two highly efficient algorithms are known for optimally ordering joins while avoiding cross products: DPccp, which is based on dynamic programming, and Top-Down Partition Search, based on memoization. Both have two severe limitations: They handle only (1) simple (binary) join predicates and (2) inner joins. However, real queries may contain complex join predicates, involving more than two relations, and outer joins as well as other non-inner joins. Taking the most efficient known join-ordering algorithm, DPccp, as a starting point, we first develop a new algorithm, DPhyp, which is capable of handling complex join predicates efficiently. We do so by modeling the query graph as a (variant of a) hypergraph and then reasoning about its connected subgraphs. Then, we present a technique to exploit this capability to efficiently handle the widest class of non-inner joins dealt with so far. Our experimental results show that this reformulation of non-inner joins as complex predicates can improve optimization time by orders of magnitude, compared to known algorithms dealing with complex join predicates and non-inner joins. Once again, this gives dynamic programming a distinct advantage over current memoization techniques.
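
To make the underlying problem concrete, the following sketch orders inner joins with binary predicates by dynamic programming over connected subsets of the query graph, so that no cross products are introduced. It is a naive subset enumeration, not DPccp or DPhyp: it tests every subset for connectivity, which is exactly the wasted work those algorithms are designed to avoid, and it does not handle hyperedges or non-inner joins. The cost model (sum of intermediate result sizes) and all names are assumptions for illustration.

```python
def best_join_order(cards, edges):
    """Cheapest bushy plan without cross products, as (cost, tree), or None.

    cards[i] is the (assumed) cardinality of relation i; edges maps a pair
    (i, j) to the (assumed) selectivity of the join predicate between them.
    Relation sets are encoded as bit masks.
    """
    n = len(cards)
    adj = [0] * n
    for i, j in edges:
        adj[i] |= 1 << j
        adj[j] |= 1 << i

    def connected(mask):
        # Flood fill inside the induced subgraph, starting at the lowest bit.
        start = mask & -mask
        seen, frontier = start, start
        while frontier:
            v = frontier.bit_length() - 1
            frontier &= ~(1 << v)
            new = adj[v] & mask & ~seen
            seen |= new
            frontier |= new
        return seen == mask

    def size(mask):
        s = 1.0
        for i in range(n):
            if (mask >> i) & 1:
                s *= cards[i]
        for (i, j), f in edges.items():
            if (mask >> i) & 1 and (mask >> j) & 1:
                s *= f
        return s

    best = {1 << i: (0.0, i) for i in range(n)}
    for mask in range(1, 1 << n):
        if mask in best or not connected(mask):
            continue
        out = size(mask)
        sub = (mask - 1) & mask
        while sub:                      # all non-empty proper submasks
            other = mask ^ sub
            # Only connected parts have entries; since mask is connected,
            # an edge must link the two parts, so this is a real join.
            if sub in best and other in best:
                cost = best[sub][0] + best[other][0] + out
                if mask not in best or cost < best[mask][0]:
                    best[mask] = (cost, (best[sub][1], best[other][1]))
            sub = (sub - 1) & mask
    return best.get((1 << n) - 1)


if __name__ == "__main__":
    print(best_join_order([8, 16, 4], {(0, 1): 0.125, (1, 2): 0.5}))
```

For a disconnected query graph the function returns `None`, because no plan exists without a cross product.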