63 research outputs found
Visibly Pushdown Transducers with Well-nested Outputs
Visibly pushdown transducers (VPTs) are visibly pushdown automata extended with outputs. They have been introduced to model transformations of nested words, i.e. words with a call/return structure. When outputs are also structured and well-nested words, VPTs are a natural formalism to express tree transformations evaluated in streaming. We prove the class of VPTs with well-nested outputs to be decidable in PTIME. Moreover, we show that this class is closed under composition and that its type-checking against visibly pushdown languages is decidable.
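As an illustration of what "well nested" means for a word with a call/return structure, here is a minimal sketch. The function name `is_well_nested` and the token sets are hypothetical and do not come from the paper; the check only tracks nesting depth, since matching in a nested word is determined by position and not by label.

```python
def is_well_nested(word, calls, returns):
    """Return True if every call has a matching return and vice versa."""
    depth = 0
    for symbol in word:
        if symbol in calls:
            depth += 1
        elif symbol in returns:
            if depth == 0:
                return False  # a return with no pending call
            depth -= 1
        # any other symbol is internal and leaves the depth unchanged
    return depth == 0  # no call may remain open at the end


# Hypothetical alphabet: XML-like opening and closing tags plus text tokens.
CALLS = {"<a>", "<b>"}
RETURNS = {"</a>", "</b>"}

print(is_well_nested(["<a>", "x", "<b>", "</b>", "</a>"], CALLS, RETURNS))
print(is_well_nested(["<a>", "</a>", "</b>"], CALLS, RETURNS))
```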
On Functionality of Visibly Pushdown Transducers
Visibly pushdown transducers form a subclass of pushdown transducers that
(strictly) extends finite state transducers with a stack. Like visibly pushdown
automata, the input symbols determine the stack operations. In this paper, we
prove that functionality is decidable in PSpace for visibly pushdown
transducers. The proof is done via a pumping argument: if a word with two
outputs has a sufficiently large nesting depth, there exists a nested word with
two outputs whose nesting depth is strictly smaller. The proof uses techniques of
word combinatorics. As a consequence of decidability of functionality, we also
show that equivalence of functional visibly pushdown transducers is
Exptime-Complete.
Comment: 20 pages
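To make the phrase "the input symbols determine the stack operations" concrete, the following is a minimal sketch of a deterministic (hence functional) visibly pushdown transducer. The alphabet, states, stack symbols, and transition tables are all hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical visibly pushdown alphabet: "(" is a call, ")" is a return,
# and every other symbol is internal.
CALLS = {"("}
RETURNS = {")"}

# Call transitions push: (state, symbol) -> (next_state, stack_symbol, output)
CALL_DELTA = {
    ("even", "("): ("odd", "E", "["),
    ("odd", "("): ("even", "O", "{"),
}
# Return transitions pop: (state, symbol, stack_symbol) -> (next_state, output)
RETURN_DELTA = {
    ("odd", ")", "E"): ("even", "]"),
    ("even", ")", "O"): ("odd", "}"),
}
# Internal transitions leave the stack alone: (state, symbol) -> (next_state, output)
INTERNAL_DELTA = {
    ("even", "a"): ("even", "a"),
    ("odd", "a"): ("odd", "a"),
}


def run_vpt(word, initial="even", finals=("even",)):
    """Return the output for `word`, or None if the input is rejected."""
    state, stack, output = initial, [], []
    for symbol in word:
        if symbol in CALLS:
            step = CALL_DELTA.get((state, symbol))
            if step is None:
                return None
            state, pushed, out = step
            stack.append(pushed)
        elif symbol in RETURNS:
            if not stack:
                return None  # return on an empty stack: reject in this sketch
            step = RETURN_DELTA.get((state, symbol, stack.pop()))
            if step is None:
                return None
            state, out = step
        else:
            step = INTERNAL_DELTA.get((state, symbol))
            if step is None:
                return None
            state, out = step
        output.append(out)
    # Accept only well-nested inputs that end in a final state.
    return "".join(output) if state in finals and not stack else None


print(run_vpt("(a(a)a)"))
```

Because each table maps a configuration to at most one move, this transducer produces at most one output per input. A nondeterministic VPT would map to sets of moves, and functionality would then be a property to decide instead of one that holds by construction.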
Streamability of nested word transductions
We consider the problem of evaluating in streaming (i.e., in a single
left-to-right pass) a nested word transduction with a limited amount of memory.
A transduction T is said to be height bounded memory (HBM) if it can be
evaluated with a memory that depends only on the size of T and on the height of
the input word. We show that it is decidable in coNPTime whether a nested word
transduction defined by a visibly pushdown transducer (VPT) is HBM. In
this case, the required amount of memory may depend exponentially on the height
of the word. We exhibit a sufficient, decidable condition for a VPT to be
evaluated with a memory that depends quadratically on the height of the word.
This condition defines a class of transductions that strictly contains all
determinizable VPTs.
Two-Way Visibly Pushdown Automata and Transducers
Automata-logic connections are pillars of the theory of regular languages.
Such connections are harder to obtain for transducers, but important results
have been obtained recently for word-to-word transformations, showing that the
three following models are equivalent: deterministic two-way transducers,
monadic second-order (MSO) transducers, and deterministic one-way automata
equipped with a finite number of registers. Nested words are words with a
nesting structure, which makes it possible to model unranked trees as their depth-first-search
linearisations. In this paper, we consider transformations from nested words to
words, which in particular can produce unranked trees if output words have a
nesting structure. The model of visibly pushdown transducers can describe
such transformations, and we propose a simple deterministic extension of this
model with two-way moves that has the following properties: i) it is a simple
computational model, that naturally has a good evaluation complexity; ii) it is
expressive: it subsumes nested word-to-word MSO transducers, and the exact
expressiveness of MSO transducers is recovered using a simple syntactic
restriction; iii) it has good algorithmic/closure properties: the model is
closed under composition with an unambiguous one-way letter-to-letter transducer,
which gives closure under regular look-around, and has a decidable equivalence
problem.
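The depth-first-search linearisation of an unranked tree into a nested word, mentioned in the abstract above, can be sketched as follows. The tree encoding as `(label, children)` pairs and the function name `linearise` are assumptions made for this example only.

```python
def linearise(tree):
    """Depth-first linearisation of an unranked tree into a nested word.

    A tree is a pair (label, children); each node contributes a call
    symbol on entry and a matching return symbol on exit.
    """
    label, children = tree
    word = [("call", label)]
    for child in children:
        word.extend(linearise(child))
    word.append(("return", label))
    return word


# A root "a" with two leaf children "b" and "c".
example = ("a", [("b", []), ("c", [])])
for position in linearise(example):
    print(position)
```

The resulting word is well nested by construction, and its nesting depth equals the height of the tree.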
Streaming Tree Transducers
The theory of tree transducers provides a foundation for understanding
expressiveness and complexity of analysis problems for specification languages
for transforming hierarchically structured data such as XML documents. We
introduce streaming tree transducers as an analyzable, executable, and
expressive model for transforming unranked ordered trees in a single pass.
Given a linear encoding of the input tree, the transducer makes a single
left-to-right pass through the input, and computes the output in linear time
using a finite-state control, a visibly pushdown stack, and a finite number of
variables that store output chunks that can be combined using the operations of
string-concatenation and tree-insertion. We prove that the expressiveness of
the model coincides with transductions definable using monadic second-order
logic (MSO). Existing models of tree transducers either cannot implement all
MSO-definable transformations, or require regular look ahead that prohibits
single-pass implementation. We show a variety of analysis problems such as
type-checking and checking functional equivalence are solvable for our model.
Comment: 40 pages
Transducer-Based Rewriting Games for Active XML
Context-free games are two-player rewriting games that are played on nested
strings representing XML documents with embedded function symbols. These games
were introduced to model rewriting processes for intensional documents in the
Active XML framework, where input documents are to be rewritten into a given
target schema by calls to external services.
This paper studies the setting where dependencies between inputs and outputs
of service calls are modelled by transducers, which has not been examined
previously. It defines transducer models operating on nested words and studies
their properties, as well as the computational complexity of the winning
problem for transducer-based context-free games in several scenarios. While the
complexity of this problem is quite high in most settings (ranging from
NP-complete to undecidable), some tractable restrictions are also identified.
Comment: Extended version of MFCS 2016 conference paper
Streaming Enumeration on Nested Documents
Some of the most relevant document schemas used online, such as XML and JSON, have a nested format. In the last decade, the task of extracting data from nested documents over streams has become especially relevant. We focus on the streaming evaluation of queries with outputs of varied sizes over nested documents. We model queries of this kind as Visibly Pushdown Transducers (VPT), a computational model that extends visibly pushdown automata with outputs and has the same expressive power as MSO over nested documents. Since processing a document through a VPT can generate a massive number of results, we are interested in reading the input in a streaming fashion and enumerating the outputs one after another as efficiently as possible, namely, with constant delay. This paper presents an algorithm that enumerates these elements with constant delay after processing the document stream in a single pass. Furthermore, we show that this algorithm is worst-case optimal in terms of update-time per symbol and memory usage.
Programming Using Automata and Transducers
Automata, the simplest model of computation, have proven to be an effective tool in reasoning about programs that operate over strings. Transducers augment automata to produce outputs and have been used to model string and tree transformations such as natural language translations. The success of these models is primarily due to their closure properties and decidable procedures, but good properties come at the price of limited expressiveness. Concretely, most models only support finite alphabets and can only represent small classes of languages and transformations. We focus on addressing these limitations and bridge the gap between the theory of automata and transducers and complex real-world applications: Can we extend automata and transducer models to operate over structured and infinite alphabets? Can we design languages that hide the complexity of these formalisms? Can we define executable models that can process the input efficiently? First, we introduce succinct models of transducers that can operate over large alphabets and design BEX, a language for analysing string coders. We use BEX to prove the correctness of UTF and BASE64 encoders and decoders. Next, we develop a theory of tree transducers over infinite alphabets and design FAST, a language for analysing tree-manipulating programs. We use FAST to detect vulnerabilities in HTML sanitizers, check whether augmented reality taggers conflict, and optimize and analyze functional programs that operate over lists and trees. Finally, we focus on laying the foundations of stream processing of hierarchical data such as XML files and program traces. We introduce two new efficient and executable models that can process the input in a left-to-right linear pass: symbolic visibly pushdown automata and streaming tree transducers. Symbolic visibly pushdown automata are closed under Boolean operations and can specify and efficiently monitor complex properties for hierarchical structures over infinite alphabets. 
Streaming tree transducers can express and efficiently process complex XML transformations while enjoying decidable procedures.
Equivalence of Deterministic Nested Word to Word Transducers
We study the equivalence problem of deterministic nested word to word transducers and show it to be surprisingly robust. Modulo polynomial time reductions, it can be identified with four equivalence problems for diverse classes of deterministic non-copying order-preserving transducers. In particular, we present polynomial time back-and-forth reductions to the morphism equivalence problem on context-free languages, which is known to be solvable in polynomial time.