State Elimination Ordering Strategies: Some Experimental Results
Recently, the problem of obtaining a short regular expression equivalent to a
given finite automaton has been intensively investigated. Algorithms for
converting finite automata to regular expressions have an exponential blow-up
in the worst case. To overcome this, simple heuristic methods have been
proposed.
In this paper we analyse some of the heuristics presented in the literature
and propose new ones. We also present some experimental comparative results
based on uniformly random generated deterministic finite automata.
Comment: In Proceedings DCFS 2010, arXiv:1008.127
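To illustrate why the elimination order matters, here is a minimal sketch of state elimination in Python. It assumes a DFA given as a dictionary `{(state, symbol): state}`. The function name `dfa_to_regex` and the ordering rule (eliminate the state with the smallest product of incoming and outgoing edges first) are illustrative assumptions, not necessarily one of the heuristics analysed in the paper.

```python
import re

def dfa_to_regex(transitions, start, finals):
    """State elimination on a DFA given as {(state, symbol): state}.

    Returns a pattern in Python `re` syntax, or None for the empty language.
    """
    INIT, FINAL = object(), object()   # fresh source and sink states
    edges = {}                         # edges[p][q] = regex labelling p -> q

    def add(p, q, r):
        # Parallel edges are merged by union.
        row = edges.setdefault(p, {})
        row[q] = f"{row[q]}|{r}" if q in row else r

    for (p, a), q in transitions.items():
        add(p, q, re.escape(a))
    add(INIT, start, "")
    for f in finals:
        add(f, FINAL, "")

    remaining = ({p for p, _ in transitions} | set(transitions.values())
                 | {start} | set(finals))

    def weight(s):
        # Assumed heuristic: number of new edges created by removing s.
        ins = sum(1 for p in edges if p != s and s in edges[p])
        outs = sum(1 for q in edges.get(s, {}) if q != s)
        return ins * outs

    while remaining:
        s = min(remaining, key=lambda x: (weight(x), str(x)))
        remaining.remove(s)
        outs = edges.pop(s, {})
        loop = outs.pop(s, None)
        star = f"(?:{loop})*" if loop is not None else ""
        for p in list(edges):
            if s in edges[p]:
                r_in = edges[p].pop(s)
                # Bypass s: p -> s -> q becomes p -> q.
                for q, r_out in outs.items():
                    add(p, q, f"(?:{r_in}){star}(?:{r_out})")
    return edges.get(INIT, {}).get(FINAL)


# DFA over {a, b} accepting words with an even number of a's.
even_a = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
pattern = dfa_to_regex(even_a, 0, {0})
```

The resulting pattern can be checked with `re.fullmatch(pattern, word)`. Replacing `weight` with a different key changes the size of the output expression but not the language it denotes.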
Digraph Complexity Measures and Applications in Formal Language Theory
We investigate structural complexity measures on digraphs, in particular the
cycle rank. This concept is intimately related to a classical topic in formal
language theory, namely the star height of regular languages. We explore this
connection, and obtain several new algorithmic insights regarding both cycle
rank and star height. Among other results, we show that computing the cycle
rank is NP-complete, even for sparse digraphs of maximum outdegree 2.
Notwithstanding, we provide both a polynomial-time approximation algorithm and
an exponential-time exact algorithm for this problem. The former algorithm
yields an O((log n)^(3/2))-approximation in polynomial time, whereas the
latter yields the optimum solution, and runs in time and space O*(1.9129^n) on
digraphs of maximum outdegree at most two. Regarding the star height problem,
we identify a subclass of the regular languages for which we can precisely
determine the computational complexity of the star height problem. Namely, the
star height problem for bideterministic languages is NP-complete, and this
holds already for binary alphabets. Then we translate the algorithmic results
concerning cycle rank to the bideterministic star height problem, thus giving a
polynomial-time approximation as well as a reasonably fast exact exponential
algorithm for bideterministic star height.
Comment: 19 pages, 1 figure
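For readers unfamiliar with cycle rank, the following sketch computes it directly from the standard recursive definition: an acyclic digraph has rank 0, a strongly connected digraph with at least one cycle has rank 1 plus the minimum rank over all single-vertex deletions, and otherwise the rank is the maximum over the strongly connected components. This is a brute-force illustration that is exponential in the number of vertices. It is not the O*(1.9129^n) algorithm from the paper, and the function name `cycle_rank` is an assumption.

```python
from functools import lru_cache

def cycle_rank(vertices, edges):
    """Cycle rank of a digraph; edges is an iterable of (u, v) pairs."""
    edge_set = frozenset(edges)

    def reach(src, vs, forward):
        # Vertices of vs reachable from src (or reaching src if not forward).
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            for a, b in edge_set:
                x, y = (a, b) if forward else (b, a)
                if x == u and y in vs and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen

    @lru_cache(maxsize=None)
    def rank(vs):
        best, todo = 0, set(vs)
        while todo:
            v = next(iter(todo))
            # Strongly connected component of v inside the induced subgraph.
            comp = frozenset(reach(v, vs, True) & reach(v, vs, False))
            todo -= comp
            if len(comp) > 1 or (v, v) in edge_set:
                best = max(best, 1 + min(rank(comp - {u}) for u in comp))
        return best

    return rank(frozenset(vertices))


triangle = [(0, 1), (1, 2), (2, 0)]                          # one directed cycle
clique = [(u, v) for u in range(3) for v in range(3) if u != v]
```

Here `cycle_rank(range(3), triangle)` is 1, since deleting any vertex leaves an acyclic graph, while the bidirected triangle `clique` needs two nested deletions.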
Joining Extractions of Regular Expressions
Regular expressions with capture variables, also known as "regex formulas,"
extract relations of spans (interval positions) from text. These relations can
be further manipulated via Relational Algebra as studied in the context of
document spanners, Fagin et al.'s formal framework for information extraction.
We investigate the complexity of querying text by Conjunctive Queries (CQs) and
Unions of CQs (UCQs) on top of regex formulas. We show that the lower bounds
(NP-completeness and W[1]-hardness) from the relational world also hold in our
setting; in particular, hardness already holds for single-character text! Yet, the
upper bounds from the relational world do not carry over. Unlike the relational
world, acyclic CQs, and even gamma-acyclic CQs, are hard to compute. The source
of hardness is that it may be intractable to instantiate the relation defined
by a regex formula, simply because it has an exponential number of tuples. Yet,
we are able to establish general upper bounds. In particular, UCQs can be
evaluated with polynomial delay, provided that every CQ has a bounded number of
atoms (while unions and projection can be arbitrary). Furthermore, UCQ
evaluation is solvable with FPT (Fixed-Parameter Tractable) delay when the
parameter is the size of the UCQ.
Symbolic Algorithms for Language Equivalence and Kleene Algebra with Tests
We first propose algorithms for checking language equivalence of finite
automata over a large alphabet. We use symbolic automata, where the transition
function is compactly represented using (multi-terminal) binary decision
diagrams (BDDs). The key idea is to compute a bisimulation by exploring
reachable pairs symbolically, so as to avoid redundancies. This idea can be
combined with already existing optimisations, and we show in particular a nice
integration with the disjoint sets forest data-structure from Hopcroft and
Karp's standard algorithm. Then we consider Kleene algebra with tests (KAT), an
algebraic theory that can be used for verification in various domains ranging
from compiler optimisation to network programming analysis. This theory is
decidable by reduction to language equivalence of automata on guarded strings,
a particular kind of automata that have exponentially large alphabets. We
propose several methods for constructing symbolic automata from KAT
expressions, based either on Brzozowski's derivatives or standard automata
constructions. All in all, this results in efficient algorithms for deciding
equivalence of KAT expressions.
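As background for the disjoint-sets idea mentioned above, here is a minimal sketch of Hopcroft and Karp's equivalence check for two complete DFAs over an explicitly listed alphabet. It is not the symbolic BDD-based algorithm of the paper. The function name `equivalent` and the `(delta, start, finals)` tuple encoding are assumptions made for illustration.

```python
def equivalent(dfa1, dfa2, alphabet):
    """Language equivalence of two complete DFAs.

    Each DFA is a tuple (delta, start, finals) with delta[(state, symbol)] = state.
    """
    dfas = {1: dfa1, 2: dfa2}
    parent = {}                       # disjoint-sets forest over tagged states

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def accepting(tagged):
        i, s = tagged
        return s in dfas[i][2]

    def step(tagged, a):
        i, s = tagged
        return (i, dfas[i][0][(s, a)])

    # States are tagged with 1 or 2 so the two automata stay disjoint.
    todo = [((1, dfa1[1]), (2, dfa2[1]))]
    while todo:
        p, q = todo.pop()
        rp, rq = find(p), find(q)
        if rp == rq:
            continue                  # already known to be related
        if accepting(p) != accepting(q):
            return False              # a distinguishing word exists
        parent[rp] = rq               # merge the two classes
        for a in alphabet:
            todo.append((step(p, a), step(q, a)))
    return True


# Even number of a's, written with 2 states and with 4 states; and a's mod 3.
A = ({(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, 0, {0})
B = ({**{(i, "a"): (i + 1) % 4 for i in range(4)},
      **{(i, "b"): i for i in range(4)}}, 0, {0, 2})
C = ({**{(i, "a"): (i + 1) % 3 for i in range(3)},
      **{(i, "b"): i for i in range(3)}}, 0, {0})
```

Because merged classes are skipped, each successful union reduces the number of classes by one, so the loop performs at most as many unions as there are states in total. With an exponentially large alphabet, the inner `for a in alphabet` loop becomes the bottleneck, which is the cost the symbolic representation is meant to avoid.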
From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity
The equivalence of finite automata and regular expressions dates back to the
seminal paper of Kleene on events in nerve nets and finite automata from 1956.
In the present paper we tour a fragment of the literature and summarize results
on upper and lower bounds on the conversion of finite automata to regular
expressions and vice versa. We also briefly recall the known bounds for the
removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free
nondeterministic devices. Moreover, we report on recent results on the average
case descriptional complexity bounds for the conversion of regular expressions
to finite automata and brand new developments on the state elimination
algorithm that converts finite automata to regular expressions.
Comment: In Proceedings AFL 2014, arXiv:1405.527
Two-variable Logic with Counting and a Linear Order
We study the finite satisfiability problem for the two-variable fragment of
first-order logic extended with counting quantifiers (C2) and interpreted over
linearly ordered structures. We show that the problem is undecidable in the
case of two linear orders (in the presence of two other binary symbols). In the
case of one linear order it is NEXPTIME-complete, even in the presence of the
successor relation. Surprisingly, the complexity of the problem explodes when
we add one more binary symbol: C2 with one linear order and in the presence of
other binary predicate symbols is equivalent, under elementary reductions, to
the emptiness problem for multicounter automata.
On Varieties of Automata Enriched with an Algebraic Structure (Extended Abstract)
The Eilenberg correspondence, based on the concept of syntactic monoids, relates
varieties of regular languages with pseudovarieties of finite monoids. Various
modifications of this correspondence relate more general classes of regular
languages with classes of more complex algebraic objects. Such generalized
varieties also have natural counterparts formed by classes of finite automata
equipped with a certain additional algebraic structure. In this survey, we
overview several variants of such varieties of enriched automata.
Comment: In Proceedings AFL 2014, arXiv:1405.527