
    Regular Expression Subtyping for XML Query and Update Languages

    XML database query languages such as XQuery employ regular expression types with structural subtyping. Subtyping systems typically have two presentations, which should be equivalent: a declarative version in which the subsumption rule may be used anywhere, and an algorithmic version in which the use of subsumption is limited in order to make typechecking syntax-directed and decidable. However, the XQuery standard type system circumvents this issue by using imprecise typing rules for iteration constructs and defining only algorithmic typechecking, and another extant proposal provides more precise types for iteration constructs but ignores subtyping. In this paper, we consider a core XQuery-like language with a subsumption rule and prove the completeness of algorithmic typechecking; this is straightforward for XQuery proper but requires some care in the presence of more precise iteration typing disciplines. We extend this result to an XML update language we have introduced in earlier work.
    Comment: ESOP 2008. Companion technical report with proof

    Comprehending Ringads for Phil Wadler, on the occasion of his 60th birthday

    List comprehensions are a widely used programming construct, in languages such as Haskell and Python and in technologies such as Microsoft's Language Integrated Query. They generalize from lists to arbitrary monads, yielding a lightweight idiom of imperative programming in a pure functional language. When the monad has the additional structure of a so-called ringad, corresponding to 'empty' and 'union' operations, then it can be seen as some kind of collection type, and the comprehension notation can also be extended to incorporate aggregations. Ringad comprehensions represent a convenient notation for expressing database queries. The ringad structure alone does not provide a good explanation or an efficient implementation of relational joins; but by allowing heterogeneous comprehensions, involving both bag and indexed table ringads, we show how to accommodate these too.
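
As a rough illustration of the idea in this abstract (not code from the paper), the Python sketch below uses a list comprehension as a small database query and then aggregates the result. The `orders` data and all names are hypothetical; Python lists stand in for a collection type whose 'empty' is `[]` and whose 'union' is concatenation.

```python
# Hypothetical sample data: (customer, amount) pairs.
orders = [("alice", 30), ("bob", 15), ("alice", 20)]

# A comprehension used as a query: select the amounts ordered by "alice".
alice_amounts = [amount for (customer, amount) in orders if customer == "alice"]

# An aggregation over the same comprehension.
alice_total = sum(amount for (customer, amount) in orders if customer == "alice")

# The 'empty' and 'union' operations for lists, assumed here to be [] and +.
empty = []
union = alice_amounts + empty
```

Here the selection and the aggregation share one comprehension shape; only the outer operation (`list` construction versus `sum`) changes.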

    Relational Algebra by Way of Adjunctions

    Bulk types such as sets, bags, and lists are monads, and therefore support a notation for database queries based on comprehensions. This fact is the basis of much work on database query languages. The monadic structure easily explains most of standard relational algebra (specifically, selections and projections), allowing for an elegant mathematical foundation for those aspects of database query language design. Most, but not all: monads do not immediately offer an explanation of relational join or grouping, and hence important foundations for those crucial aspects of relational algebra are missing. The best they can offer is cartesian product followed by selection. Adjunctions come to the rescue: like any monad, bulk types also arise from certain adjunctions; we show that by paying due attention to other important adjunctions, we can elegantly explain the rest of standard relational algebra. In particular, graded monads provide a mathematical foundation for indexing and grouping, which leads directly to an efficient implementation, even of joins.
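
The contrast drawn in this abstract between "cartesian product followed by selection" and an index-based join can be sketched in Python as follows. This is a minimal illustration under assumed names (`join_by_product`, `index_by`, `join_by_index`, and the sample tables are all hypothetical), not the construction from the paper.

```python
from collections import defaultdict

def join_by_product(left, right, key_left, key_right):
    """Join as cartesian product followed by selection: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if key_left(l) == key_right(r)]

def index_by(rows, key):
    """Group rows by key, giving a table indexed by the join key."""
    index = defaultdict(list)
    for row in rows:
        index[key(row)].append(row)
    return index

def join_by_index(left, right, key_left, key_right):
    """Join by looking up each left row in an index built over the right rows."""
    right_index = index_by(right, key_right)
    return [(l, r) for l in left for r in right_index.get(key_left(l), [])]

# Hypothetical sample tables: (customer_id, name) and (customer_id, amount).
customers = [(1, "alice"), (2, "bob"), (3, "carol")]
invoices = [(1, 30), (2, 15), (1, 20)]

by_product = join_by_product(customers, invoices, lambda c: c[0], lambda i: i[0])
by_index = join_by_index(customers, invoices, lambda c: c[0], lambda i: i[0])
```

Both functions return the same pairs in the same order; the indexed version avoids scanning every right-hand row for every left-hand row.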

    Efficient Semi-structured Queries in Scala using XQuery Shipping

    This project proposes a new approach to interacting with database systems through programming languages. A formal query language can be integrated within modern programming languages, and semi-structured queries can be evaluated using automatic transformation and query shipping. The focus of this project is on XML queries and the Scala programming language. In particular, this project optimizes the XML-based expressions of Scala using XQuery transformation and shipping. In this work, Scala sequence comprehensions are extended to cover the full functionality of XQuery FLWOR expressions, and XQuery sequence comparisons are introduced in Scala to facilitate query generation. This report presents a formalization of the transformation rules between Scala and XQuery and describes a Scala implementation. Various use cases are provided to facilitate understanding and use of this new Scala library.

    Efficient main memory-based XML stream processing

    Applications that process XML documents as files or streams are naturally main-memory based. This makes main memory the bottleneck for scalability. This doctoral thesis addresses this problem and presents a toolkit for effective buffer management in main memory-based XML stream processors. XML document projection is an established technique for reducing the buffer requirements of main memory-based XML processors, where only data relevant to query evaluation is loaded into main memory buffers. We present a novel implementation of this task, where we use string matching algorithms designed for efficient keyword search in flat strings to navigate in tree-structured data. We then introduce an extension of the XQuery language, called FluX, that supports event-based query processing. Purely event-based queries of this language can be executed on streaming XML data in a very direct way. We develop an algorithm to efficiently rewrite XQueries into FluX. This algorithm is capable of exploiting order constraints derived from schemata to reduce the amount of buffering in query evaluation. During streaming query evaluation, we continuously purge buffers of data that is no longer relevant. By combining static query analysis with a dynamic analysis of the buffer contents, we effectively reduce the size of memory buffers. We have confirmed the efficacy of these techniques by extensive experiments and by publication at international venues. To compare our contributions to related work in a systematic manner, we contribute an abstract framework for XML stream processing. This framework gives us a broader view of the factors influencing main memory consumption.