
    TOPYDE: A Tool for Physical Database Design

    We describe a tool for physical database design based on a combination of theoretical and pragmatic approaches. The tool takes as input a relational schema, the workload defined on the schema, and some additional database characteristics, and produces as output a physical schema. For the time being, the tool is tuned towards Ingres.

    On the selection of secondary indices in relational databases

    An important problem in the physical design of databases is the selection of secondary indices. In general, this problem cannot be solved optimally because of the complexity of the selection process. Heuristics such as the well-known ADD and DROP algorithms are often used instead. This paper shows that frequently used cost functions can be classified as super- or submodular functions. For these functions, several mathematical properties have been derived which reduce the complexity of the index selection problem. These properties will be used to develop a tool for physical database design, and they also give a mathematical foundation for the success of the aforementioned ADD and DROP algorithms.
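
As an illustration of the ADD heuristic mentioned in the abstract above, here is a minimal sketch in its generic greedy form: start with no indices and keep adding the single index that lowers total cost the most, stopping when no addition helps. The function names (`add_heuristic`, `workload_cost`), the index names, and all cost figures are hypothetical and chosen only for this example; they are not taken from the paper or its tool.

```python
from typing import Callable, FrozenSet, Set, Tuple

# Hypothetical toy workload: per-query cost with a full scan or with a given index.
QUERY_COSTS = [
    {"scan": 100, "idx_a": 10},
    {"scan": 80, "idx_a": 60, "idx_b": 30},
    {"scan": 50, "idx_c": 45},
]
# Hypothetical per-index maintenance (update) cost.
MAINTENANCE = {"idx_a": 20, "idx_b": 25, "idx_c": 15}


def workload_cost(indices: FrozenSet[str]) -> int:
    """Total cost: cheapest access path per query plus index maintenance."""
    query_part = sum(
        min([q["scan"]] + [q[i] for i in indices if i in q]) for q in QUERY_COSTS
    )
    return query_part + sum(MAINTENANCE[i] for i in indices)


def add_heuristic(
    candidates: Set[str], cost: Callable[[FrozenSet[str]], int]
) -> Tuple[FrozenSet[str], int]:
    """Greedy ADD: repeatedly add the index giving the largest cost reduction."""
    chosen: FrozenSet[str] = frozenset()
    best = cost(chosen)
    while True:
        best_candidate = None
        for c in candidates - chosen:
            trial = cost(chosen | {c})
            if trial < best:  # strictly better than anything seen so far
                best, best_candidate = trial, c
        if best_candidate is None:  # no single addition improves the cost
            return chosen, best
        chosen = chosen | {best_candidate}


chosen, total = add_heuristic(set(MAINTENANCE), workload_cost)
print(sorted(chosen), total)
```

On this toy workload the heuristic first picks `idx_a`, then `idx_b`, and rejects `idx_c` because its maintenance cost outweighs its benefit. DROP is the mirror image: start with all candidates and greedily remove indices.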

    Old Techniques for New Join Algorithms: A Case Study in RDF Processing

    Recently there has been significant interest in designing specialized RDF engines, as traditional query processing mechanisms incur orders-of-magnitude performance gaps on many RDF workloads. At the same time, researchers have released new worst-case optimal join algorithms which can be asymptotically better than the join algorithms in traditional engines. In this paper we apply worst-case optimal join algorithms to a standard RDF workload, the LUBM benchmark, for the first time. We do so using two worst-case optimal engines: (1) LogicBlox, a commercial database engine, and (2) EmptyHeaded, our prototype research engine with enhanced worst-case optimal join algorithms. We show that without any added optimizations both LogicBlox and EmptyHeaded outperform two state-of-the-art specialized RDF engines, RDF-3X and TripleBit, by up to 6x on cyclic join queries, the queries where traditional optimizers are suboptimal. On the remaining, less complex queries in the LUBM benchmark, we show that three classic query optimization techniques enable EmptyHeaded to compete with RDF engines, even when there is no asymptotic advantage to the worst-case optimal approach. We validate that our design has merit as EmptyHeaded outperforms MonetDB by three orders of magnitude and LogicBlox by two orders of magnitude, while remaining within an order of magnitude of RDF-3X and TripleBit.
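
To make the idea of a worst-case optimal join concrete, the sketch below evaluates the cyclic triangle query R(a,b), S(b,c), T(a,c) one attribute at a time, intersecting the candidate values for each attribute instead of joining two relations at a time. This is a simplified illustration in the style of generic attribute-at-a-time joins, written for this listing; it is not the algorithm used by LogicBlox or EmptyHeaded, and the names `index_by_first` and `triangles` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[int, int]


def index_by_first(pairs: Iterable[Pair]) -> Dict[int, Set[int]]:
    """Map each first component to the set of second components."""
    idx: Dict[int, Set[int]] = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return idx


def triangles(R: Iterable[Pair], S: Iterable[Pair], T: Iterable[Pair]) -> List[Tuple[int, int, int]]:
    """Return all (a, b, c) with (a,b) in R, (b,c) in S and (a,c) in T."""
    r, s, t = index_by_first(R), index_by_first(S), index_by_first(T)
    out = []
    for a in r.keys() & t.keys():        # a must appear in both R and T
        for b in s.keys() & r[a]:        # b must follow a in R and start a pair in S
            for c in s[b] & t[a]:        # c must follow b in S and a in T
                out.append((a, b, c))
    return sorted(out)


edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(triangles(edges, edges, edges))
```

A pairwise plan would first materialize all two-hop paths, which can be much larger than the final result; the intersection-based evaluation never builds that intermediate relation.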

    Staircase Join: Teach a Relational DBMS to Watch its (Axis) Steps

    Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This text applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made tree aware, i.e., tree properties like subtree size, intersection of paths, inclusion or disjointness of subtrees are made explicit. We propose a local change to the database kernel, the staircase join, which encapsulates the tree knowledge needed to improve XPath performance. Staircase join operates on an XML encoding which makes this knowledge available at the cost of simple integer operations (e.g., +, <=). Finally, we report on quite promising experiments with a staircase-join-enhanced main-memory database kernel.
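
The following sketch illustrates how tree awareness can be reduced to integer comparisons for the XPath descendant axis. It assumes a pre/post-order rank encoding in which each node is a pair (pre, post) and the document is stored in preorder; the abstract itself only says that the encoding exposes tree properties through simple integer operations, so the concrete encoding, the function names (`prune_context`, `descendant_step`), and the sample tree are assumptions made for illustration.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (preorder rank, postorder rank)


def prune_context(context: List[Node]) -> List[Node]:
    """Drop context nodes that lie inside the subtree of another context node.

    Their descendants are already covered, so scanning them again would only
    produce duplicates. The survivors form a 'staircase' of increasing post ranks.
    """
    kept: List[Node] = []
    for pre, post in sorted(context):
        if not kept or post > kept[-1][1]:
            kept.append((pre, post))
    return kept


def descendant_step(doc: List[Node], context: List[Node]) -> List[int]:
    """Return the pre ranks of all descendants of the context nodes.

    Assumes doc[i] is the node with pre rank i. For each staircase step the
    scan stops at the first node whose post rank exceeds the context node's,
    because every later node in preorder is outside that subtree as well.
    """
    result: List[int] = []
    for pre_c, post_c in prune_context(context):
        i = pre_c + 1
        while i < len(doc) and doc[i][1] < post_c:
            result.append(doc[i][0])
            i += 1
    return result


# Sample tree a(b(c, d), e(f)) as (pre, post) pairs in preorder.
doc = [(0, 5), (1, 2), (2, 0), (3, 1), (4, 4), (5, 3)]
context = [(1, 2), (2, 0), (4, 4)]  # nodes b, c, e
print(descendant_step(doc, context))
```

Here the context node c is pruned because it lies inside the subtree of b, and the result comes out duplicate-free and in document order without a separate sort or duplicate-elimination step.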

    MonetDB/XQuery: a fast XQuery processor powered by a relational engine

    Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state of the art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system are evaluated on the XMark benchmark up to data sizes of 11GB. The performance section also provides an extensive benchmark comparison of all major XMark results published previously, which confirms that the goal of purely relational XQuery processing, namely speed and scalability, was met.
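
As a rough illustration of what a range-based encoding of an XML document into a relational table can look like, the sketch below shreds a small tree into rows of (pre, size, level, tag), so that the descendants of a node are exactly the rows whose pre rank falls in the range pre+1 .. pre+size. The abstract does not spell out the column layout, so this particular schema, the function names (`shred`, `descendants`), and the nested-tuple input format are assumptions for the sake of the example.

```python
from typing import List, Optional, Tuple

Tree = Tuple[str, list]            # (tag, list of child trees)
Row = Tuple[int, int, int, str]    # (pre, size, level, tag)


def shred(tree: Tree, level: int = 0, rows: Optional[list] = None) -> List[Row]:
    """Flatten a tree into preorder rows carrying subtree size and depth."""
    if rows is None:
        rows = []
    pre = len(rows)
    rows.append(None)              # reserve the slot; size is known after the children
    tag, children = tree
    for child in children:
        shred(child, level + 1, rows)
    size = len(rows) - pre - 1     # number of nodes strictly below this one
    rows[pre] = (pre, size, level, tag)
    return rows


def descendants(rows: List[Row], pre: int) -> List[str]:
    """Tags of all descendants of the node with the given pre rank (a range scan)."""
    size = rows[pre][1]
    return [tag for (_, _, _, tag) in rows[pre + 1 : pre + size + 1]]


xml_tree = ("a", [("b", [("c", []), ("d", [])]), ("e", [("f", [])])])
table = shred(xml_tree)
for row in table:
    print(row)
print(descendants(table, 1))
```

With this layout, a descendant lookup becomes a contiguous range scan over the table, which is the kind of access pattern a relational engine handles well.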