48 research outputs found
Streaming Tree Transducers
Theory of tree transducers provides a foundation for understanding
expressiveness and complexity of analysis problems for specification languages
for transforming hierarchically structured data such as XML documents. We
introduce streaming tree transducers as an analyzable, executable, and
expressive model for transforming unranked ordered trees in a single pass.
Given a linear encoding of the input tree, the transducer makes a single
left-to-right pass through the input, and computes the output in linear time
using a finite-state control, a visibly pushdown stack, and a finite number of
variables that store output chunks that can be combined using the operations of
string-concatenation and tree-insertion. We prove that the expressiveness of
the model coincides with transductions definable using monadic second-order
logic (MSO). Existing models of tree transducers either cannot implement all
MSO-definable transformations, or require regular look ahead that prohibits
single-pass implementation. We show a variety of analysis problems such as
type-checking and checking functional equivalence are solvable for our model.Comment: 40 page
Reducing Transducer Equivalence to Register Automata Problems Solved by "Hilbert Method"
In the past decades, classical results from algebra, including Hilbert\u27s Basis Theorem, had various applications in formal languages, including a proof of the Ehrenfeucht Conjecture, decidability of HDT0L sequence equivalence, and decidability of the equivalence problem for functional tree-to-string transducers.
In this paper, we study the scope of the algebraic methods mentioned above, particularily as applied to the functionality problem for register automata, and equivalence for functional register automata. We provide two results, one positive, one negative. The positive result is that functionality and equivalence are decidable for MSO transformations on unordered forests. The negative result comes from a try to extend this method to decide functionality and equivalence on macro tree transducers. We reduce macro tree transducers equivalence to an equivalence problem for some class of register automata naturally relevant to our method. We then prove this latter problem to be undecidable
Programming Using Automata and Transducers
Automata, the simplest model of computation, have proven to be an effective tool in reasoning about programs that operate over strings. Transducers augment automata to produce outputs and have been used to model string and tree transformations such as natural language translations. The success of these models is primarily due to their closure properties and decidable procedures, but good properties come at the price of limited expressiveness. Concretely, most models only support finite alphabets and can only represent small classes of languages and transformations. We focus on addressing these limitations and bridge the gap between the theory of automata and transducers and complex real-world applications: Can we extend automata and transducer models to operate over structured and infinite alphabets? Can we design languages that hide the complexity of these formalisms? Can we define executable models that can process the input efficiently? First, we introduce succinct models of transducers that can operate over large alphabets and design BEX, a language for analysing string coders. We use BEX to prove the correctness of UTF and BASE64 encoders and decoders. Next, we develop a theory of tree transducers over infinite alphabets and design FAST, a language for analysing tree-manipulating programs. We use FAST to detect vulnerabilities in HTML sanitizers, check whether augmented reality taggers conflict, and optimize and analyze functional programs that operate over lists and trees. Finally, we focus on laying the foundations of stream processing of hierarchical data such as XML files and program traces. We introduce two new efficient and executable models that can process the input in a left-to-right linear pass: symbolic visibly pushdown automata and streaming tree transducers. Symbolic visibly pushdown automata are closed under Boolean operations and can specify and efficiently monitor complex properties for hierarchical structures over infinite alphabets. Streaming tree transducers can express and efficiently process complex XML transformations while enjoying decidable procedures
Linear High-Order Deterministic Tree Transducers with Regular Look-Ahead
We introduce the notion of high-order deterministic top-down tree transducers (HODT) whose outputs correspond to single-typed lambda-calculus formulas. These transducers are natural generalizations of known models of top-tree transducers such as: Deterministic Top-Down Tree Transducers, Macro Tree Transducers, Streaming Tree Transducers... We focus on the linear restriction of high order tree transducers with look-ahead (HODTR_lin), and prove this corresponds to tree to tree functional transformations defined by Monadic Second Order (MSO) logic. We give a specialized procedure for the composition of those transducers that uses a flow analysis based on coherence spaces and allows us to preserve the linearity of transducers. This procedure has a better complexity than classical algorithms for composition of other equivalent tree transducers, but raises the order of transducers. However, we also indicate that the order of a HODTR_lin can always be bounded by 3, and give a procedure that reduces the order of a HODTR_lin to 3. As those resulting HODTR_lin can then be transformed into other equivalent models, this gives an important insight on composition algorithm for other classes of transducers. Finally, we prove that those results partially translate to the case of almost linear HODTR: the class corresponds to the class of tree transformations performed by MSO with unfolding (not closed by composition), and provide a mechanism to reduce the order to 3 in this case
Implicit automata in typed -calculi II: streaming transducers vs categorical semantics
We characterize regular string transductions as programs in a linear
-calculus with additives. One direction of this equivalence is proved
by encoding copyless streaming string transducers (SSTs), which compute regular
functions, into our -calculus. For the converse, we consider a
categorical framework for defining automata and transducers over words, which
allows us to relate register updates in SSTs to the semantics of the linear
-calculus in a suitable monoidal closed category. To illustrate the
relevance of monoidal closure to automata theory, we also leverage this notion
to give abstract generalizations of the arguments showing that copyless SSTs
may be determinized and that the composition of two regular functions may be
implemented by a copyless SST. Our main result is then generalized from strings
to trees using a similar approach. In doing so, we exhibit a connection between
a feature of streaming tree transducers and the multiplicative/additive
distinction of linear logic.
Keywords: MSO transductions, implicit complexity, Dialectica categories,
Church encodingsComment: 105 pages, 24 figure
Recommended from our members
Which Classes of Origin Graphs Are Generated by Transducers.
We study various models of transducers equipped with origin information. We consider the semantics of these models as particular graphs, called origin graphs, and we characterise the families of such graphs recognised by streaming string transducers