18 research outputs found

    An Automated, yet Interactive and Portable DB designer

    Tuning tools attempt to configure a database to achieve optimal performance for a given workload. Selecting an optimal set of physical structures is computationally hard since it involves searching a vast space of possible configurations. Commercial DBMSs offer tools that address this problem. The usefulness of such tools, however, is limited by their dependence on greedy heuristics, the need for a priori (offline) knowledge of the workload, and the lack of an optimal materialization schedule to get the best out of the suggested design features. Moreover, open source DBMSs do not provide any automated tuning tools. This demonstration introduces a comprehensive physical designer for the PostgreSQL open source DBMS. The tool suggests design features for both offline and online workloads. It provides close-to-optimal index suggestions for a given workload by modeling index selection as a combinatorial optimization problem and solving it with sophisticated, mature solvers. It also determines the interactions between indexes in order to suggest an effective materialization strategy for the selected indexes. The tool is interactive: it allows the database administrator (DBA) to suggest a set of candidate features and shows their benefits and interactions visually. For the demonstration we use large real-world scientific datasets and query workloads.
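
To make the "combinatorial optimization" framing of index selection concrete, here is a minimal sketch. It is not the tool's actual model or solver: the cost tables, the storage budget, the index names, and the function `best_index_set` are all hypothetical, and plain exhaustive enumeration stands in for the mature solvers the abstract mentions. It assumes each query simply uses the cheapest index available to it.

```python
from itertools import combinations


def best_index_set(base_cost, cost_with, size, budget):
    """Exhaustively pick the index subset that minimizes total workload cost.

    base_cost: {query: cost with no index}            (hypothetical numbers)
    cost_with: {(query, index): cost using that index} (hypothetical numbers)
    size:      {index: storage size}
    budget:    maximum total storage for the chosen indexes
    Returns (total_cost, chosen_indexes).
    """
    candidates = sorted(size)
    # Start from the empty configuration: every query pays its base cost.
    best = (sum(base_cost.values()), ())
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            # Skip configurations that exceed the storage budget.
            if sum(size[i] for i in subset) > budget:
                continue
            # Assumption: each query uses its cheapest available access path.
            total = sum(
                min([base] + [cost_with[(q, i)] for i in subset if (q, i) in cost_with])
                for q, base in base_cost.items()
            )
            if total < best[0]:
                best = (total, subset)
    return best


if __name__ == "__main__":
    base = {"q1": 100, "q2": 80}
    with_index = {("q1", "ix_a"): 20, ("q1", "ix_b"): 60,
                  ("q2", "ix_b"): 30, ("q2", "ix_c"): 10}
    sizes = {"ix_a": 4, "ix_b": 3, "ix_c": 5}
    print(best_index_set(base, with_index, sizes, budget=8))
```

Enumeration is exponential in the number of candidate indexes, which is why a realistic designer would hand a formulation like this to a dedicated solver rather than loop over subsets.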

    Depth Estimation for Ranking Query Optimization

    A relational ranking query uses a scoring function to limit the results of a conventional query to a small number of the most relevant answers. The increasing popularity of this query paradigm has led to the introduction of specialized rank join operators that integrate the selection of top tuples with join processing. These operators access just “enough” of the input in order to generate just “enough” output and can offer significant speed-ups for query evaluation. The number of input tuples that an operator accesses is called the input depth of the operator, and this is the driving cost factor in rank join processing. This introduces the important problem of depth estimation, which is crucial for the costing of rank join operators during query compilation and thus for their integration in optimized physical plans. We introduce an estimation methodology, termed DEEP, for approximating the input depths of rank join operators in a physical execution plan. At the core of DEEP lies a general, principled framework that formalizes depth computation in terms of the joint distribution of scores in the base tables. This framework results in a systematic estimation methodology that takes the characteristics of the data directly into account and thus enables more accurate estimates. We develop novel estimation algorithms that provide an efficient realization of the formal DEEP framework, and describe their integration on top of the statistics module of an existing query optimizer. We validate the performance of DEEP with an extensive experimental study on data sets of varying characteristics. The results verify the effectiveness of DEEP as an estimation method and demonstrate its advantages over previously proposed techniques.

    Evaluating rank joins with optimal cost

    In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top K scores. In practice, the inputs can often be accessed in ranked order and the scoring function is monotonic. These conditions allow for efficient algorithms that solve the rank join problem without reading all of the input. In this paper, we present a thorough analysis of such rank join algorithms. A strong point of our analysis is that it is based on a more general problem statement than previous work, making it more relevant to the execution model that is employed by database systems. One of our results indicates that the well-known HRJN algorithm has shortcomings, because it does not stop reading its input as soon as possible. We find that it is NP-hard to overcome this weakness in the general case, but cases of limited query complexity are tractable. We prove the latter with an algorithm that infers provably tight bounds on the potential benefit of reading more input in order to stop as soon as possible. As a result, the algorithm achieves a cost that is within a constant factor of optimal.
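
As an illustration of how ranked inputs and a monotonic scoring function let a rank join stop early, the sketch below implements a simple two-input, HRJN-style operator with the usual corner-bound threshold. It is not the paper's algorithm and does not compute the tight bounds the abstract describes. The function names, the round-robin reading order, and the choice of summation as the scoring function are assumptions made for this example. The second return value is the input depth reached on each side.

```python
import heapq
import math


def threshold(inputs, pos):
    """Upper bound on the score of any join result that uses an unseen tuple."""
    bounds = []
    for i in (0, 1):
        j = 1 - i
        if pos[i] < len(inputs[i]):  # input i still has unseen tuples
            # Score of the last tuple read from i bounds all unseen tuples of i.
            last_i = inputs[i][pos[i] - 1][1] if pos[i] > 0 else math.inf
            # Best score seen on the other side (unknown until first read).
            top_j = inputs[j][0][1] if pos[j] > 0 else math.inf
            bounds.append(last_i + top_j)
    return max(bounds, default=-math.inf)


def rank_join(left, right, k):
    """Top-k equi-join of two inputs given as (key, score) lists sorted by
    descending score; a result's score is the sum of its two input scores."""
    if k <= 0:
        return [], (0, 0)
    inputs = (left, right)
    seen = ({}, {})   # per input: key -> scores read so far
    pos = [0, 0]      # per input: number of tuples read (the input depth)
    results = []      # join results found so far, as (score, key)
    turn = 0
    while pos[0] < len(left) or pos[1] < len(right):
        if pos[turn] >= len(inputs[turn]):
            turn = 1 - turn  # this side is exhausted; read the other one
        key, score = inputs[turn][pos[turn]]
        pos[turn] += 1
        seen[turn].setdefault(key, []).append(score)
        # Probe the tuples already read from the other input.
        for other in seen[1 - turn].get(key, []):
            results.append((score + other, key))
        turn = 1 - turn
        top = heapq.nlargest(k, results)
        # Stop once the k-th best result cannot be beaten by unseen tuples.
        if len(top) == k and top[-1][0] >= threshold(inputs, pos):
            return top, tuple(pos)
    return heapq.nlargest(k, results), tuple(pos)


if __name__ == "__main__":
    left = [("a", 90), ("b", 80), ("c", 10)]
    right = [("b", 95), ("a", 50), ("c", 20)]
    print(rank_join(left, right, 1))
```

On this toy input the top-1 answer is found after reading two tuples from each side, so the low-scoring `c` tuples are never accessed.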

    A Benchmark For Online Index Selection

    Online approaches to physical design tuning have received considerable attention in the recent literature, with a focus on the problem of online index selection. However, it is difficult to draw conclusions on the relative merits of the proposed techniques, as they have been evaluated in isolation using different methodologies. In this paper, we make two concrete contributions to address this issue. First, we propose a benchmark for evaluating the performance of an online tuning algorithm in a principled fashion. Second, using the benchmark, we present a comparison of two representative online tuning algorithms that are implemented in the same database system. The results provide interesting insights on the behavior of these algorithms and validate the usefulness of the proposed benchmark.