Automata with Nested Pebbles Capture First-Order Logic with Transitive Closure
String languages recognizable in (deterministic) log-space are characterized
either by two-way (deterministic) multi-head automata, or following Immerman,
by first-order logic with (deterministic) transitive closure. Here we elaborate
this result, and match the number of heads to the arity of the transitive
closure. More precisely, first-order logic with k-ary deterministic transitive
closure has the same power as deterministic automata walking on their input
with k heads, additionally using a finite set of nested pebbles. This result is
valid for strings, ordered trees, and in general for families of graphs having
a fixed automaton that can be used to traverse the nodes of each of the graphs
in the family. Other examples of such families are grids, toruses, and
rectangular mazes. For nondeterministic automata, the logic is restricted to
positive occurrences of transitive closure.
The special case of k=1 for trees shows that single-head deterministic
tree-walking automata with nested pebbles are characterized by first-order
logic with unary deterministic transitive closure. This refines our earlier
result that placed these automata between first-order and monadic second-order
logic on trees. Comment: Paper for Logical Methods in Computer Science, 27 pages, 1 figur
Reasoning about Regular Properties: A Comparative Study
Several new algorithms for deciding emptiness of Boolean combinations of
regular languages and of languages of alternating automata (AFA) have been
proposed recently, especially in the context of analysing regular expressions
and in string constraint solving. The new algorithms demonstrated a significant
potential, but they have never been systematically compared, either with each
other or with state-of-the-art implementations of existing
(non)deterministic automata-based methods. In this paper, we provide the first
such comparison as well as an overview of the existing algorithms and their
implementations. We collect a diverse benchmark mostly originating in or
related to practical problems from string constraint solving, analysing LTL
properties, and regular model checking, and evaluate collected implementations
on it. The results reveal the best tools and hint at what the best algorithms
and implementation techniques are. Roughly, although some advanced algorithms
are fast, such as antichain algorithms and reductions to IC3/PDR, they are not
as overwhelmingly dominant as sometimes presented and there is no clear winner.
The simplest NFA-based technology may actually be the best choice, depending on
the problem source and implementation style. Our findings should be highly
relevant for development of these techniques as well as for related fields such
as string constraint solving.
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of an IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers. Comment: master thesi
Generating Semantic Graph Corpora with Graph Expansion Grammar
We introduce Lovelace, a tool for creating corpora of semantic graphs. The
system uses graph expansion grammar as a representational language, thus
allowing users to craft a grammar that describes a corpus with desired
properties. When given such a grammar as input, the system generates a set of
output graphs that are well-formed according to the grammar, i.e., a graph
bank. The generation process can be controlled via a number of configurable
parameters that allow the user to, for example, specify a range of desired
output graph sizes. Central use cases are the creation of synthetic data to
augment existing corpora, and as a pedagogical tool for teaching formal
language theory. Comment: In Proceedings NCMA 2023, arXiv:2309.0733