8 research outputs found

    Constructing Optimal Bushy Processing Trees for Join Queries is NP-hard

    We show that constructing optimal bushy processing trees for join queries is NP-hard. More specifically, we show that even the construction of optimal bushy trees for computing the cross product for a set of relations is NP-hard.

    Location-Dependent Query Processing Under Soft Real-Time Constraints

    Mobile Join Operators for Restricted Sources

    Domain Ordering and Box Cover Problems for Beyond Worst-Case Join Processing

    Join queries are a fundamental computational task in relational database management systems. For decades, complex joins were most often computed by decomposing the query into a query plan made of a sequence of binary joins. However, for cyclic queries, this type of query plan is sub-optimal. The worst-case run time of any such query plan exceeds the number of output tuples for any query instance. Recent theoretical developments in join query processing have led to join algorithms which are worst-case optimal, meaning that they run in time proportional to the worst-case output size for any query with the same shape and the same number of input tuples. Building on these results is a class of algorithms providing bounds that go beyond this worst-case output size by exploiting the structure of the input instance rather than just the query shape. One such algorithm, Tetris, is worst-case optimal and also provides an upper bound on its run time which depends on the minimum size of a geometric box certificate for the input query. A box certificate is a subset of a box cover whose union covers every tuple which is not present in the query output. A box cover is a set of n-dimensional boxes which cover all of the tuples not contained in the input relations. Many query instances admit different box certificates and box covers when the values in the attributes' domains are ordered differently. If we permute the input query according to a domain ordering which admits a smaller box certificate, use the permuted query as input to Tetris, then transform the result back with the inverse domain ordering, we can compute the query faster than would be possible if the domain ordering were fixed. If we can efficiently compute an optimal domain ordering for a query, then we can state a beyond worst-case bound that is stronger than what is provided by Tetris.
    This paper defines several optimization problems over the space of domain orderings where the objective is to minimize the size of either the minimum box certificate or the minimum box cover for the given input query. We show that most of these problems are NP-hard. We also provide approximation algorithms for several of these problems. The most general version of the box cover minimization problem we study, BoxMinPDomF, is shown to be NP-hard, but we can compute an approximation only a poly-logarithmic factor larger than K^(a*r), where K is the minimum box cover size under any domain ordering and r is the maximum number of attributes in a relation. This result allows us to compute join queries in time N+K^(a*r*(w+1))+Z, times a poly-logarithmic factor in N, where N is the number of input tuples, w is the treewidth of the query, and Z is the number of output tuples. This is a new beyond worst-case bound. There are queries for which this bound is exponentially smaller than any bound provided by Tetris. The most general version of the box certificate minimization problem we study, CertMinPDomF, is also shown to be NP-hard. It can be computed exactly if the minimum box certificate size is at most 3, but no approximation algorithm for an arbitrary minimum size is known. Finding such an approximation algorithm is an important direction for future research.
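
The abstract's remark that plans built from binary joins are sub-optimal for cyclic queries can be illustrated with a small sketch. The instance below is a skewed triangle query chosen purely for illustration; it does not come from the paper, and the helper names `hash_join` and `skewed_relation` are hypothetical. Every pairwise join produces roughly n^2 intermediate tuples, while the triangle query itself has an empty output.

```python
from collections import defaultdict


def hash_join(left, right, lkey, rkey):
    """Join two lists of tuples on left[lkey] == right[rkey].

    Each result tuple is the left tuple concatenated with the right tuple.
    """
    index = defaultdict(list)
    for t in right:
        index[t[rkey]].append(t)
    return [l + r for l in left for r in index.get(l[lkey], [])]


def skewed_relation(n):
    """Illustrative instance: {0} x [1..n] together with [1..n] x {0}."""
    return [(0, j) for j in range(1, n + 1)] + [(i, 0) for i in range(1, n + 1)]


n = 10
# Triangle query over R(a, b), S(b, c), T(a, c); all three share the same tuples.
R = S = T = skewed_relation(n)

rs = hash_join(R, S, 1, 0)  # R.b = S.b, tuples are (a, b, b, c)
rt = hash_join(R, T, 0, 0)  # R.a = T.a, tuples are (a, b, a, c)
st = hash_join(S, T, 1, 1)  # S.c = T.c, tuples are (b, c, a, c)

# Finish the plan that starts with R join S by filtering against T.
t_set = set(T)
triangles = [(a, b, c) for (a, b, _, c) in rs if (a, c) in t_set]

print(len(R), len(rs), len(rt), len(st), len(triangles))
```

Whichever pair a binary plan joins first, the intermediate result grows quadratically in n even though the input has only 2n tuples per relation. Worst-case optimal algorithms are designed to avoid this gap.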

    Algebraic Query Optimization in Database Systems (Algebraische Anfrageoptimierung in Datenbanksystemen)

    The thesis investigates different problem classes in algebraic query optimization. For the problem of computing optimal left-deep processing trees with cross products for chain queries and ASI cost functions, we present two efficient algorithms. Although both algorithms yield identical results in practice, we have not been able to prove this. For the case of acyclic query graphs, left-deep processing trees, expensive selection and join predicates, and ASI cost functions, we describe a polynomial-time algorithm which is based on a job sequencing algorithm. The algorithm assumes that the set of expensive selections that can be applied directly to the base relations can be guessed. The cheapest plans can be found within the search space of bushy processing trees with cross products. We prove that the problem is NP-hard in this case. The rest of the thesis deals with the general problem of computing optimal bushy processing trees for arbitrary query graphs and expensive selection and join predicates. For this problem, we present three efficient dynamic programming algorithms. Our algorithms can handle different join algorithms, split conjunctive predicates, and exploit structural information from the join graph to speed up computation. The time and space complexities of the algorithms are analyzed carefully, and efficient implementations based on bitvector arithmetic are presented.
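
To give a rough idea of what dynamic programming over bushy trees with bitvector arithmetic can look like, here is a minimal sketch of a generic subset-enumeration scheme. It is not one of the thesis's three algorithms. The cost function (sum of intermediate result sizes), the independence assumption for selectivities, and the name `optimal_bushy_plan` are all assumptions made for illustration, and expensive predicates are ignored.

```python
def optimal_bushy_plan(cards, sel):
    """Exhaustive DP over relation subsets encoded as bitmasks.

    cards: base relation cardinalities, indexed by relation number.
    sel:   {(i, j): selectivity} with i < j; missing pairs are cross
           products (selectivity 1.0).
    Returns (cost, plan), where plan is a nested tuple of relation indices.
    """
    n = len(cards)
    full = (1 << n) - 1
    size = [0.0] * (full + 1)           # estimated result size per subset
    cost = [float("inf")] * (full + 1)  # cheapest plan cost per subset
    plan = [None] * (full + 1)

    for i in range(n):
        m = 1 << i
        size[m], cost[m], plan[m] = float(cards[i]), 0.0, i

    # Every proper submask is numerically smaller than its mask, so
    # ascending order guarantees sub-results are ready when needed.
    for mask in range(1, full + 1):
        if mask & (mask - 1) == 0:
            continue  # single relation, already initialised
        low = mask & -mask              # lowest set bit
        rest = mask ^ low
        i = low.bit_length() - 1
        s = size[rest] * cards[i]
        for j in range(n):
            if (rest >> j) & 1:
                s *= sel.get((min(i, j), max(i, j)), 1.0)
        size[mask] = s

        # Enumerate proper submasks; keeping the lowest bit on the left
        # visits each unordered split exactly once.
        left = (mask - 1) & mask
        while left:
            if left & low:
                right = mask ^ left
                c = cost[left] + cost[right] + s
                if c < cost[mask]:
                    cost[mask] = c
                    plan[mask] = (plan[left], plan[right])
            left = (left - 1) & mask

    return cost[full], plan[full]


best_cost, best_plan = optimal_bushy_plan(
    [10, 20, 40], {(0, 1): 0.25, (1, 2): 0.5}
)
print(best_cost, best_plan)
```

The submask loop `left = (left - 1) & mask` is the standard bit trick for enumerating all subsets of a set. Because cross products are allowed, every split is considered, which makes the enumeration exponential in the number of relations.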

    Multi-join optimization for symmetric multiprocessors

    This paper looks at the problem of multi-join query optimization for symmetric multiprocessors. Optimization algorithms based on dynamic programming and greedy heuristics are described that, unlike traditional methods, include memory resources and pipelining in their cost model. An analytical model is presented and used to compare the quality of plans produced by each optimization algorithm. Experimental results show that, while dynamic programming produces the best plans, simple heuristics often do nearly as well. The same results are also used to highlight the advantages of bushy execution trees over more restricted tree shapes.