110 research outputs found

    Relational Approach to Logical Query Optimization of XPath

    Get PDF
    To be able to handle the ever growing volumes of XML documents, effective and efficient data management solutions are needed. Managing XML data in a relational DBMS has great potential. Recently, effective relational storage schemes and index structures have been proposed as well as special-purpose join operators to speed up querying of XML data using XPath/XQuery. In this paper, we address the topic of query plan construction and logical query optimization. The claim of this paper is that standard relational algebra extended with special-purpose join operators suffices for logical query optimization. We focus on the XPath accelerator storage scheme and associated staircase join operators, but the approach can be generalized easily

    Staircase Join: Teach a Relational DBMS to Watch its (Axis) Steps

    Get PDF
    Relational query processors derive much of their effectiveness from the awareness of specific table properties like sort order, size, or absence of duplicate tuples. This text applies (and adapts) this successful principle to database-supported XML and XPath processing: the relational system is made tree aware, i.e., tree properties like subtree size, intersection of paths, inclusion or disjointness of subtrees are made explicit. We propose a local change to the database kernel, the staircase join, which encapsulates the necessary tree knowledge needed to improve XPath performance. Staircase join operates on an XML encoding which makes this knowledge available at the cost of simple integer operations (e.g., +, <=). We finally report on quite promising experiments with a staircase join enhanced main-memory database kernel

    Pathfinder: XQuery - The Relational Way

    Get PDF
    Relational query processors are probably the best understood (as well as the best engineered) query engines available today. Although carefully tuned to process instances of the relational model (tables of tuples), these processors can also provide a foundation for the evaluation of "alien" (non-relational) query languages: if a relational encoding of the alien data model and its associated query language is given, the RDBMS may act like a special-purpose processor for the new language

    An Injection with Tree Awareness: Adding Staircase Join to PostgreSQL

    Get PDF
    The syntactic wellformedness constraints of XML (opening and closing tags nest properly) imply that XML processors face the challenge to efficiently handle data that takes the shape of ordered, unranked trees. Although RDBMSs have originally been designed to manage table-shaped data, we propose their use as XML and XPath processors. In our setup, the database system employs a relational XML document encoding, the XPath accelerator [1], which maps information about the XML node hierarchy to a table, thus making it possible to evaluate XPath expressions on SQL hosts.\ud \ud Conventional RDBMSs, nevertheless, remain ignorant of many interesting properties of the encoded tree data, and were thus found to make no or poor use of these properties. This is why we devised a new join algorithm, staircase join [2], which incorporates the tree-specific knowledge required for an efficient SQL-based evaluation of XPath expressions. In a sense, this demonstration delivers the promise we have made at VLDB 2003 [2]: a notion of tree awareness can be injected into a conventional disk-based RDBMS kernel in terms of staircase join. The demonstration features a side-by-side comparison of both, an original and a staircase-join enhanced instance of PostgreSQL [4]. The required changes to\ud PostgreSQL were local, the achieved eect, however, is significant: the demonstration proves that this injection of tree awareness turns PostgreSQL into a high-performance XML processor that closely adheres to the XPath semantics

    MonetDB/XQuery: a fast XQuery processor powered by a relational engine

    Get PDF
    Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state-of-the-art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system is evaluated on the XMark benchmark up to data sizes of 11GB. The performance section also provides an extensive benchmark comparison of all major XMark results published previously, which confirm that the goal of purely relational XQuery processing, namely speed and scalability, was met

    MonetDB/XQuery - Consistent & Efficient Updates on the Pre/Post Plane

    Get PDF
    Relational XQuery processors aim at leveraging mature relational DBMS query processing technology to provide scalability and efficiency. To achieve this goal, various storage schemes have been proposed to encode the tree structure of XML documents in flat relational tables. Basically, two classes can be identified: (1) encodings using fixed-length surrogates, like the preorder ranks in the pre/post encoding [5] or the equivalent pre/size/level encoding [8], and (2) encodings using variable-length surrogates, like, e.g., ORDPATH [9] or P-PBiTree [12]. Recent research [1] showed a clear advantage of the former for efficient evaluation of XPath location steps, exploiting techniques like cheap node order tests, positional lookup, and node skipping in staircase join [7]. However, once updates are involved, variable-length surrogates are often considered the better choice, mainly as a straightforward implementation of structural XML updates using fixed-length surrogates faces two performance bottlenecks: (i) high physical cost (the preorder ranks of all nodes following the update position must be modified—on average 50% of the document), and (ii) low transaction concurrency (updating the size of all ancestor nodes causes lock contention on the document root)

    The relational XQuery puzzle: a look-back on the pieces found so far

    Get PDF
    Given the tremendous versatility of relational database implementations toward awide range of database problems, it seems only natural to consider them as back-ends for XML data processing. Yet, the assumptions behind the language XQuery are considerably different to those in traditional RDBMSs. The underlying data model is a tree, data and results carry an intrinsic order, queries are described using explicit iteration and, after all, problems are everything else but regular. Solving the relational XQuery puzzle, therefore, has challenged anumber of research groups over the past years. The purpose of this article is to summarize and assess some of the results that have been obtained during this period to solve the puzzle. Our main focus is on the Pathfinder XQuery compiler, afull reference implementation of apurely relational XQuery processor. As we dissect its components, we relate them to other work in the field and also point to open problems and limitations in the context of relational XQuery processin

    On Region Algebras, XML Databases, and Information Retrieval

    Get PDF
    This paper describes some new ideas on developing a logical algebra for databases that manage textual data and support information retrieval functionality. We describe a first prototype of such a system

    Automatic mapping of XML documents into relational database

    Get PDF
    Extensible Markup Language (XML) nowadays is one of the most important standard media used for exchanging and representing data through the Internet. Storing, updating and retrieving the huge amount of web services data such as XML is an attractive area of research for researchers and database vendors. In this thesis, we propose and develop a new mapping model, called MAXDOR, for storing, rebuilding, updating and querying XML documents using a relational database without making use of any XML schemas in the mapping process. The model addressed the problem of solving the structural hole between ordered hierarchical XML and unordered tabular relational database to enable us to use relational database systems for storing, updating and querying XML data. A multiple link list is used to maintain XML document structure, manage the process of updating document contents and retrieve document contents efficiently. Experiments are done to evaluate MAXDOR model. MAXDOR will be compared with other well-known models available in the literature(Tatarinov et al., 2002) and (Torsten et al., 2004) using total expected value of rebuilding XML document execution time and insertion of token execution time.EThOS - Electronic Theses Online ServiceGBUnited Kingdo
    corecore