11,404 research outputs found

    Towards Practical Typechecking for Macro Tree Transducers

    Get PDF
    Macro tree transducers (mtt) are an important model that both covers many useful XML transformations and allows decidable exact typechecking. This paper reports our first step toward an implementation of mtt typechecker that has a practical efficiency. Our approach is to represent an input type obtained from a backward inference as an alternating tree automaton, in a style similar to Tozawa's XSLT0 typechecking. In this approach, typechecking reduces to checking emptiness of an alternating tree automaton. We propose several optimizations (Cartesian factorization, state partitioning) on the backward inference process in order to produce much smaller alternating tree automata than the naive algorithm, and we present our efficient algorithm for checking emptiness of alternating tree automata, where we exploit the explicit representation of alternation for local optimizations. Our preliminary experiments confirm that our algorithm has a practical performance that can typecheck simple transformations with respect to the full XHTML in a reasonable time

    Streaming Tree Transducers

    Get PDF
    Theory of tree transducers provides a foundation for understanding expressiveness and complexity of analysis problems for specification languages for transforming hierarchically structured data such as XML documents. We introduce streaming tree transducers as an analyzable, executable, and expressive model for transforming unranked ordered trees in a single pass. Given a linear encoding of the input tree, the transducer makes a single left-to-right pass through the input, and computes the output in linear time using a finite-state control, a visibly pushdown stack, and a finite number of variables that store output chunks that can be combined using the operations of string-concatenation and tree-insertion. We prove that the expressiveness of the model coincides with transductions definable using monadic second-order logic (MSO). Existing models of tree transducers either cannot implement all MSO-definable transformations, or require regular look ahead that prohibits single-pass implementation. We show a variety of analysis problems such as type-checking and checking functional equivalence are solvable for our model.Comment: 40 page

    Unranked Tree Rewriting and Effective Closures of Languages

    Get PDF
    International audienceWe consider rewriting systems for unranked ordered trees, where the number of chil- dren of a node is not determined by its label, and is not a priori bounded. The rewriting systems are defined such that variables in the rewrite rules can be substituted by hedges (sequences of trees) instead of just trees. Consequently, this notion of rewriting subsumes both standard term rewriting and word rewriting.We present some properties of preservation for classes of unranked tree languages, including hedge automata languages and various context-free extensions. Finally, ap- plications to static type checking for XML transformations and to the verification of read/write access control policies for XML updates are mentioned

    XRound : A reversible template language and its application in model-based security analysis

    Get PDF
    Successful analysis of the models used in Model-Driven Development requires the ability to synthesise the results of analysis and automatically integrate these results with the models themselves. This paper presents a reversible template language called XRound which supports round-trip transformations between models and the logic used to encode system properties. A template processor that supports the language is described, and the use of the template language is illustrated by its application in an analysis workbench, designed to support analysis of security properties of UML and MOF-based models. As a result of using reversible templates, it is possible to seamlessly and automatically integrate the results of a security analysis with a model. (C) 2008 Elsevier B.V. All rights reserved

    XQuery Streaming by Forest Transducers

    Full text link
    Streaming of XML transformations is a challenging task and only very few systems support streaming. Research approaches generally define custom fragments of XQuery and XPath that are amenable to streaming, and then design custom algorithms for each fragment. These languages have several shortcomings. Here we take a more principles approach to the problem of streaming XQuery-based transformations. We start with an elegant transducer model for which many static analysis problems are well-understood: the Macro Forest Transducer (MFT). We show that a large fragment of XQuery can be translated into MFTs --- indeed, a fragment of XQuery, that can express important features that are missing from other XQuery stream engines, such as GCX: our fragment of XQuery supports XPath predicates and let-statements. We then rely on a streaming execution engine for MFTs, one which uses a well-founded set of optimizations from functional programming, such as strictness analysis and deforestation. Our prototype achieves time and memory efficiency comparable to the fastest known engine for XQuery streaming, GCX. This is surprising because our engine relies on the OCaml built in garbage collector and does not use any specialized buffer management, while GCX's efficiency is due to clever and explicit buffer management.Comment: Full version of the paper in the Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE 2014

    Rewrite based Verification of XML Updates

    Get PDF
    We consider problems of access control for update of XML documents. In the context of XML programming, types can be viewed as hedge automata, and static type checking amounts to verify that a program always converts valid source documents into also valid output documents. Given a set of update operations we are particularly interested by checking safety properties such as preservation of document types along any sequence of updates. We are also interested by the related policy consistency problem, that is detecting whether a sequence of authorized operations can simulate a forbidden one. We reduce these questions to type checking problems, solved by computing variants of hedge automata characterizing the set of ancestors and descendants of the initial document type for the closure of parameterized rewrite rules

    Mapping and Displaying Structural Transformations between XML and PDF

    Get PDF
    Documents are often marked up in XML-based tagsets to delineate major structural components such as headings, paragraphs, figure captions and so on, without much regard to their eventual displayed appearance. And yet these same abstract documents, after many transformations and 'typesetting' processes, often emerge in the popular format of Adobe PDF, either for dissemination or archiving. Until recently PDF has been a totally display-based document representation, relying on the underlying PostScript semantics of PDF. Early versions of PDF had no mechanism for retaining any form of abstract document structure but recent releases have now introduced an internal structure tree to create the so called 'Tagged PDF'. This paper describes the development of a plugin for Adobe Acrobat which creates a two-window display. In one window is shown an XML document original and in the other its Tagged PDF counterpart is seen, with an internal structure tree that, in some sense, matches the one seen in XML. If a component is highlighted in either window then the corresponding structured item, with any attendant text, is also highlighted in the other window. Important applications of correctly Tagged PDF include making PDF documents reflow intelligently on small screen devices and enabling them to be read out in correct reading order, via speech synthesiser software, for the visually impaired. By tracing structure transformation from source document to destination one can implement the repair of damaged PDF structure or the adaptation of an existing structure tree to an incrementally updated document

    Recovering Grammar Relationships for the Java Language Specification

    Get PDF
    Grammar convergence is a method that helps discovering relationships between different grammars of the same language or different language versions. The key element of the method is the operational, transformation-based representation of those relationships. Given input grammars for convergence, they are transformed until they are structurally equal. The transformations are composed from primitive operators; properties of these operators and the composed chains provide quantitative and qualitative insight into the relationships between the grammars at hand. We describe a refined method for grammar convergence, and we use it in a major study, where we recover the relationships between all the grammars that occur in the different versions of the Java Language Specification (JLS). The relationships are represented as grammar transformation chains that capture all accidental or intended differences between the JLS grammars. This method is mechanized and driven by nominal and structural differences between pairs of grammars that are subject to asymmetric, binary convergence steps. We present the underlying operator suite for grammar transformation in detail, and we illustrate the suite with many examples of transformations on the JLS grammars. We also describe the extraction effort, which was needed to make the JLS grammars amenable to automated processing. We include substantial metadata about the convergence process for the JLS so that the effort becomes reproducible and transparent
    • …
    corecore