1,412 research outputs found
Tree transducers, L systems, and two-way machines
A relationship between parallel rewriting systems and two-way machines is investigated. Restrictions on the “copying power” of these devices endow them with rich structuring and give insight into the issues of determinism, parallelism, and copying. Among the parallel rewriting systems considered are the top-down tree transducer; the generalized syntax-directed translation scheme and the ETOL system, and among the two-way machines are the tree-walking automaton, the two-way finite-state transducer, and (generalizations of) the one-way checking stack automaton. The. relationship of these devices to macro grammars is also considered. An effort is made .to provide a systematic survey of a number of existing results
Linear Bounded Composition of Tree-Walking Tree Transducers: Linear Size Increase and Complexity
Compositions of tree-walking tree transducers form a hierarchy with respect
to the number of transducers in the composition. As main technical result it is
proved that any such composition can be realized as a linear bounded
composition, which means that the sizes of the intermediate results can be
chosen to be at most linear in the size of the output tree. This has
consequences for the expressiveness and complexity of the translations in the
hierarchy. First, if the computed translation is a function of linear size
increase, i.e., the size of the output tree is at most linear in the size of
the input tree, then it can be realized by just one, deterministic,
tree-walking tree transducer. For compositions of deterministic transducers it
is decidable whether or not the translation is of linear size increase. Second,
every composition of deterministic transducers can be computed in deterministic
linear time on a RAM and in deterministic linear space on a Turing machine,
measured in the sum of the sizes of the input and output tree. Similarly, every
composition of nondeterministic transducers can be computed in simultaneous
polynomial time and linear space on a nondeterministic Turing machine. Their
output tree languages are deterministic context-sensitive, i.e., can be
recognized in deterministic linear space on a Turing machine. The membership
problem for compositions of nondeterministic translations is nondeterministic
polynomial time and deterministic linear space. The membership problem for the
composition of a nondeterministic and a deterministic tree-walking tree
translation (for a nondeterministic IO macro tree translation) is log-space
reducible to a context-free language, whereas the membership problem for the
composition of a deterministic and a nondeterministic tree-walking tree
translation (for a nondeterministic OI macro tree translation) is possibly
NP-complete
Macro tree transducers
Macro tree transducers are a combination of top-down tree transducers and macro grammars. They serve as a model for syntax-directed semantics in which context information can be handled. In this paper the formal model of macro tree transducers is studied by investigating typical automata theoretical topics like composition, decomposition, domains, and ranges of the induced translation classes. The extension with regular look-ahead is considered
Three hierarchies of transducers
Composition of top-down tree transducers yields a proper hierarchy of transductions and of output languages. The same is true for ETOL systems (viewed as transducers) and for two-way generalized sequential machines
Stream Processing using Grammars and Regular Expressions
In this dissertation we study regular expression based parsing and the use of
grammatical specifications for the synthesis of fast, streaming
string-processing programs.
In the first part we develop two linear-time algorithms for regular
expression based parsing with Perl-style greedy disambiguation. The first
algorithm operates in two passes in a semi-streaming fashion, using a constant
amount of working memory and an auxiliary tape storage which is written in the
first pass and consumed by the second. The second algorithm is a single-pass
and optimally streaming algorithm which outputs as much of the parse tree as is
semantically possible based on the input prefix read so far, and resorts to
buffering as many symbols as is required to resolve the next choice. Optimality
is obtained by performing a PSPACE-complete pre-analysis on the regular
expression.
In the second part we present Kleenex, a language for expressing
high-performance streaming string processing programs as regular grammars with
embedded semantic actions, and its compilation to streaming string transducers
with worst-case linear-time performance. Its underlying theory is based on
transducer decomposition into oracle and action machines, and a finite-state
specialization of the streaming parsing algorithm presented in the first part.
In the second part we also develop a new linear-time streaming parsing
algorithm for parsing expression grammars (PEG) which generalizes the regular
grammars of Kleenex. The algorithm is based on a bottom-up tabulation algorithm
reformulated using least fixed points and evaluated using an instance of the
chaotic iteration scheme by Cousot and Cousot
FliPpr: A Prettier Invertible Printing System
When implementing a programming language, we often write
a parser and a pretty-printer. However, manually writing both programs
is not only tedious but also error-prone; it may happen that a pretty-printed
result is not correctly parsed. In this paper, we propose FliPpr,
which is a program transformation system that uses program inversion
to produce a CFG parser from a pretty-printer. This novel approach
has the advantages of fine-grained control over pretty-printing, and easy
reuse of existing efficient pretty-printer and parser implementations
- …