Quotient Complexity of Regular Languages
The past research on the state complexity of operations on regular languages
is examined, and a new approach based on an old method (derivatives of regular
expressions) is presented. Since state complexity is a property of a language,
it is appropriate to define it in formal-language terms as the number of
distinct quotients of the language, and to call it "quotient complexity". The
problem of finding the quotient complexity of a language f(K,L) is considered,
where K and L are regular languages and f is a regular operation, for example,
union or concatenation. Since quotients can be represented by derivatives, one
can find a formula for the typical quotient of f(K,L) in terms of the quotients
of K and L. To obtain an upper bound on the number of quotients of f(K,L) all
one has to do is count how many such quotients are possible, and this makes
automaton constructions unnecessary. The advantages of this point of view are
illustrated by many examples. Moreover, new general observations are presented
to help in the estimation of the upper bounds on quotient complexity of regular
operations.
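
As a small worked instance of the counting argument described above, consider union. The notation is an assumption made here, since the abstract fixes no symbols: $K_w$ denotes the quotient of $K$ by the word $w$, and $\kappa(L)$ denotes the quotient complexity of $L$.

```latex
\[
  K_w = \{\, x \mid wx \in K \,\}, \qquad
  (K \cup L)_w = K_w \cup L_w .
\]
% Every quotient of K \cup L is determined by a pair (K_w, L_w).
% If \kappa(K) = m and \kappa(L) = n, there are at most mn such pairs, so
\[
  \kappa(K \cup L) \le mn .
\]
```

The bound follows from counting pairs of quotients alone; no product automaton has to be built.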
Position Automaton Construction for Regular Expressions with Intersection
Positions and derivatives are two essential notions in the conversion methods from regular expressions to equivalent finite automata. Partial-derivative-based methods have recently been extended to regular expressions with intersection. In this paper, we present a position automaton construction for those expressions. This construction generalizes the notion of position, making it compatible with intersection. The resulting automaton is homogeneous and has the partial derivative automaton as its quotient.
POSIX lexing with derivatives of regular expressions (proof pearl)
Brzozowski introduced the notion of derivatives for regular expressions. They can be used for a very simple regular expression matching algorithm. Sulzmann and Lu cleverly extended this algorithm in order to deal with POSIX matching, which is the underlying disambiguation strategy for regular expressions needed in lexers. Sulzmann and Lu have made available on-line what they call a “rigorous proof” of the correctness of their algorithm w.r.t. their specification; regrettably, it appears to us to have unfillable gaps. In the first part of this paper we give our inductive definition of what a POSIX value is and show (i) that such a value is unique (for given regular expression and string being matched) and (ii) that Sulzmann and Lu’s algorithm always generates such a value (provided that the regular expression matches the string). We also prove the correctness of an optimised version of the POSIX matching algorithm. Our definitions and proof are much simpler than those by Sulzmann and Lu and can be easily formalised in Isabelle/HOL. In the second part we analyse the correctness argument by Sulzmann and Lu and explain why the gaps in this argument cannot be filled easily.
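
The "very simple" derivative-based matcher mentioned in this abstract can be sketched as follows. This is only the plain yes/no matcher in the style of Brzozowski, not the Sulzmann and Lu POSIX value construction. The class and function names (`Empty`, `Eps`, `Chr`, `Alt`, `Seq`, `Star`, `nullable`, `deriv`, `matches`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Regular-expression syntax tree (names are assumptions made for this sketch).
@dataclass(frozen=True)
class Empty:          # matches no string
    pass

@dataclass(frozen=True)
class Eps:            # matches only the empty string
    pass

@dataclass(frozen=True)
class Chr:            # matches the single character c
    c: str

@dataclass(frozen=True)
class Alt:            # l | r
    l: object
    r: object

@dataclass(frozen=True)
class Seq:            # l followed by r
    l: object
    r: object

@dataclass(frozen=True)
class Star:           # zero or more repetitions of r
    r: object

def nullable(r):
    """Return True iff r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, Alt):
        return nullable(r.l) or nullable(r.r)
    return nullable(r.l) and nullable(r.r)  # Seq

def deriv(r, c):
    """Derivative of r with respect to the character c."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Chr):
        return Eps() if r.c == c else Empty()
    if isinstance(r, Alt):
        return Alt(deriv(r.l, c), deriv(r.r, c))
    if isinstance(r, Seq):
        left = Seq(deriv(r.l, c), r.r)
        # If the left part can be empty, c may also be consumed by the right part.
        return Alt(left, deriv(r.r, c)) if nullable(r.l) else left
    return Seq(deriv(r.r, c), r)  # Star

def matches(r, s):
    """Take derivatives character by character, then test nullability."""
    for c in s:
        r = deriv(r, c)
    return nullable(r)
```

For example, with `r = Seq(Star(Alt(Chr("a"), Chr("b"))), Chr("c"))`, the call `matches(r, "abac")` succeeds while `matches(r, "ab")` does not. The sketch performs no simplification, so intermediate expressions can grow with the input length.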
Reordering Derivatives of Trace Closures of Regular Languages
We provide syntactic derivative-like operations, defined by recursion on regular expressions, in the styles of both Brzozowski and Antimirov, for trace closures of regular languages. Just as the Brzozowski and Antimirov derivative operations for regular languages, these syntactic reordering derivative operations yield deterministic and nondeterministic automata respectively. But trace closures of regular languages are in general not regular, hence these automata cannot generally be finite. Still, as we show, for star-connected expressions, the Antimirov and Brzozowski automata, suitably quotiented, are finite. We also define a refined version of the Antimirov reordering derivative operation where parts-of-derivatives (states of the automaton) are nonempty lists of regular expressions rather than single regular expressions. We define the uniform scattering rank of a language and show that, for a regexp whose language has finite uniform scattering rank, the truncation of the (generally infinite) refined Antimirov automaton, obtained by removing long states, is finite without any quotienting, but still accepts the trace closure. We also show that star-connected languages have finite uniform scattering rank.
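
For orientation, the classical Antimirov partial derivative for ordinary regular languages, which this abstract takes as its starting point, can be sketched as below. This is not the reordering derivative for trace closures defined in the paper. The tuple encoding and the names `pderiv`, `concat`, `nfa_states`, and `accepts` are assumptions made for illustration.

```python
# Regexes as tuples (an encoding assumed for this sketch):
# ("empty",), ("eps",), ("chr", c), ("alt", r, s), ("seq", r, s), ("star", r)
EPS = ("eps",)

def nullable(r):
    """Return True iff r matches the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # seq

def concat(r, s):
    """Concatenation that drops a leading eps, keeping states small."""
    return s if r == EPS else ("seq", r, s)

def pderiv(r, c):
    """Set of partial derivatives of r with respect to the character c."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return set()
    if tag == "chr":
        return {EPS} if r[1] == c else set()
    if tag == "alt":
        return pderiv(r[1], c) | pderiv(r[2], c)
    if tag == "seq":
        out = {concat(p, r[2]) for p in pderiv(r[1], c)}
        return out | pderiv(r[2], c) if nullable(r[1]) else out
    return {concat(p, r) for p in pderiv(r[1], c)}  # star

def nfa_states(r, alphabet):
    """States of the nondeterministic automaton reachable from r."""
    seen, todo = {r}, [r]
    while todo:
        q = todo.pop()
        for c in alphabet:
            for p in pderiv(q, c):
                if p not in seen:
                    seen.add(p)
                    todo.append(p)
    return seen

def accepts(r, s):
    """Run the automaton on s, tracking the set of current states."""
    current = {r}
    for c in s:
        current = {p for q in current for p in pderiv(q, c)}
    return any(nullable(q) for q in current)
```

Each state is a single regular expression, and a set of such states plays the role of one Brzozowski derivative. For ordinary regular expressions the reachable state set is finite, which is the property that fails in general for trace closures.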
- …