On the State Complexity of Partial Derivative Automata For Regular Expressions with Intersection
Extended regular expressions (with complement and intersection) are used in many applications due to their succinctness. In particular, regular expressions extended with intersection only (also called semi-extended) can already be exponentially smaller than standard regular expressions or equivalent nondeterministic finite automata (NFA). For practical purposes it is important to study the average behaviour of conversions between these models. In this paper, we focus on the conversion of regular expressions with intersection to nondeterministic finite automata, using partial derivatives and the notion of support. First, we give a tight upper bound of 2^O(n) for the worst-case number of states of the resulting partial derivative automaton, where n is the size of the expression. Using the framework of analytic combinatorics, we then establish an upper bound of (1.056 + o(1))^n for its asymptotic average-state complexity, which is significantly smaller than the one for the worst case. (c) IFIP International Federation for Information Processing 2016
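The construction named in this abstract can be illustrated with a minimal sketch of Antimirov-style partial derivatives extended to intersection, where the partial derivatives of an intersection are the pairwise intersections of the partial derivatives of its operands. The tuple encoding and the names `pd`, `pd_automaton` and `accepts` are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

# Expressions are nested tuples (an illustrative encoding, not the paper's).
EMPTY, EPS = ("empty",), ("eps",)

def sym(a): return ("sym", a)
def alt(r, s): return ("alt", r, s)
def star(r): return ("star", r)
def inter(r, s): return ("and", r, s)

def cat(r, s):
    # Smart constructor: eps . s is simplified to s.
    return s if r == EPS else ("cat", r, s)

def nullable(r):
    """True if the language of r contains the empty word."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # "cat" and "and"

def pd(a, r):
    """Set of partial derivatives of r with respect to the letter a."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return set()
    if tag == "sym":
        return {EPS} if r[1] == a else set()
    if tag == "alt":
        return pd(a, r[1]) | pd(a, r[2])
    if tag == "cat":
        out = {cat(t, r[2]) for t in pd(a, r[1])}
        if nullable(r[1]):
            out |= pd(a, r[2])
        return out
    if tag == "star":
        return {cat(t, r) for t in pd(a, r[1])}
    if tag == "and":
        return {inter(t, u) for t, u in product(pd(a, r[1]), pd(a, r[2]))}
    raise ValueError(tag)

def pd_automaton(r, alphabet):
    """NFA whose states are the expressions reachable from r via pd."""
    states, todo, delta = {r}, [r], {}
    while todo:
        q = todo.pop()
        for a in alphabet:
            targets = pd(a, q)
            delta[(q, a)] = targets
            for t in targets - states:
                states.add(t)
                todo.append(t)
    finals = {q for q in states if nullable(q)}
    return states, delta, finals

def accepts(r, alphabet, word):
    """Simulate the partial derivative automaton of r on word."""
    _, delta, finals = pd_automaton(r, alphabet)
    current = {r}
    for ch in word:
        current = set().union(*(delta.get((q, ch), set()) for q in current))
    return bool(current & finals)

# Words over {a, b} that both end with 'a' and start with 'a'.
ANY = star(alt(sym("a"), sym("b")))
r = inter(cat(ANY, sym("a")), cat(sym("a"), ANY))
print(accepts(r, "ab", "aba"))  # True
print(accepts(r, "ab", "ab"))   # False
```

The sketch applies only the simplification eps . s = s; the bounds quoted in the abstract concern the paper's own construction and are not something this code measures.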
From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity
The equivalence of finite automata and regular expressions dates back to the
seminal paper of Kleene on events in nerve nets and finite automata from 1956.
In the present paper we tour a fragment of the literature and summarize results
on upper and lower bounds on the conversion of finite automata to regular
expressions and vice versa. We also briefly recall the known bounds for the
removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free
nondeterministic devices. Moreover, we report on recent results on the average
case descriptional complexity bounds for the conversion of regular expressions
to finite automata and brand new developments on the state elimination
algorithm that converts finite automata to regular expressions. Comment: In Proceedings AFL 2014, arXiv:1405.527
Partial Derivative Automaton for Regular Expressions with Shuffle
We generalize the partial derivative automaton to regular expressions with
shuffle and study its size in the worst and in the average case. The number of
states of the partial derivative automata is in the worst case at most 2^m,
where m is the number of letters in the expression, while asymptotically and on
average it is no more than (4/3)^m.
Testing the Equivalence of Regular Languages
The minimal deterministic finite automaton is generally used to determine
regular language equality. Antimirov and Mosses proposed a rewrite system for
deciding regular expression equivalence, of which Almeida et al. presented an
improved variant. Hopcroft and Karp proposed an almost linear algorithm for
testing the equivalence of two deterministic finite automata that avoids
minimisation. In this paper we improve the best-case running time, present an
extension of this algorithm to non-deterministic finite automata, and establish
a relationship between this algorithm and the one proposed in Almeida et al. We
also present some experimental comparative results. All these algorithms are
closely related to the recent coalgebraic approach to automata proposed by
Rutten.
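The minimisation-free equivalence test attributed above to Hopcroft and Karp can be sketched with a union-find structure: the two start states are merged, and merging is propagated along transitions until either a final state is paired with a non-final one or no new merges occur. This is a minimal sketch of the textbook idea, assuming complete DFAs given as hypothetical `(start, delta, finals)` triples; it is not the improved variant or the NFA extension described in the abstract, and it omits union by rank, so it does not by itself attain the almost linear bound.

```python
def hk_equivalent(d1, d2, alphabet):
    """Decide whether two complete DFAs accept the same language.

    Each DFA is a triple (start, delta, finals), where delta maps
    (state, letter) to a state. States are tagged with 0 or 1 so the
    two state sets are disjoint.
    """
    dfas = (d1, d2)
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def step(tagged, a):
        i, q = tagged
        return (i, dfas[i][1][(q, a)])

    def is_final(tagged):
        i, q = tagged
        return q in dfas[i][2]

    p, q = (0, d1[0]), (1, d2[0])
    parent[find(p)] = find(q)          # assume the start states are equivalent
    stack = [(p, q)]
    while stack:
        p, q = stack.pop()
        if is_final(p) != is_final(q):
            return False               # a distinguishing word exists
        for a in alphabet:
            rp, rq = find(step(p, a)), find(step(q, a))
            if rp != rq:
                parent[rp] = rq        # merge the two classes
                stack.append((rp, rq))
    return True

# Even number of a's, as a 2-state DFA and as a 4-state DFA.
even2 = (0, {(0, "a"): 1, (1, "a"): 0}, {0})
even4 = (0, {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 0}, {0, 2})
print(hk_equivalent(even2, even4, "a"))  # True
```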
Symbolic Solving of Extended Regular Expression Inequalities
This paper presents a new solution to the containment problem for extended
regular expressions, which extend basic regular expressions with intersection
and complement operators, and considers regular expressions on infinite
alphabets based on potentially infinite character sets. Standard approaches to
deciding containment do not take extended operators or character sets into
account. The algorithm avoids the translation to an expression-equivalent
automaton and provides a purely symbolic term rewriting system for solving
regular expression inequalities.
We give a new symbolic decision procedure for the containment problem based
on Brzozowski's regular expression derivatives and Antimirov's rewriting
approach to check containment. We generalize Brzozowski's syntactic derivative
operator to two derivative operators that work with respect to (potentially
infinite) representable character sets. Comment: Technical Report
Regular Expressions and Transducers over Alphabet-invariant and User-defined Labels
We are interested in regular expressions and transducers that represent word
relations in an alphabet-invariant way---for example, the set of all word pairs
u,v where v is a prefix of u independently of what the alphabet is. Current
software systems of formal language objects do not have a mechanism to define
such objects. We define transducers in which transition labels involve what we
call set specifications, some of which are alphabet invariant. In fact, we give
a broader definition of automata-type objects, called labelled graphs, where
each transition label can be any string, as long as that string represents a
subset of a certain monoid. Then, the behaviour of the labelled graph is a
subset of that monoid. We do the same for regular expressions. We obtain
extensions of a few classic algorithmic constructions on ordinary regular
expressions and transducers at the broad level of labelled graphs and in such a
way that the computational efficiency of the extended constructions is not
sacrificed. For regular expressions with set specs we obtain the corresponding
partial derivative automata. For transducers with set specs we obtain further
algorithms that can be applied to questions about independent regular
languages, in particular the witness version of the independent property
satisfaction question.
Derivative Based Extended Regular Expression Matching Supporting Intersection, Complement and Lookarounds
Regular expressions are widely used in software. Various regular expression
engines support different combinations of extensions to classical regular
constructs such as Kleene star, concatenation, and nondeterministic choice
(union in terms of match semantics). The extensions include, for example,
anchors, lookarounds, counters, and backreferences. The properties of
combinations of such extensions have been the subject of active recent research.
In the current paper we present a symbolic derivatives based approach to
finding matches to regular expressions that, in addition to the classical
regular constructs, also support complement, intersection and lookarounds (both
negative and positive lookaheads and lookbacks). The theory of computing
symbolic derivatives and determining nullability given an input string is
presented that shows that such a combination of extensions yields a match
semantics that corresponds to an effective Boolean algebra, which in turn opens
up possibilities of applying various Boolean logic rewrite rules to optimize
the search for matches.
In addition to the theoretical framework, we present an implementation of the
combination of extensions to demonstrate the efficacy of the approach,
accompanied by practical examples.
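The Boolean flavour of derivative-based matching can be seen in a minimal sketch using classical Brzozowski derivatives over single characters: the derivative distributes over union, intersection and complement, and a word matches when the expression left after consuming it is nullable. The tuple encoding and the names `deriv` and `matches` are assumptions for illustration. The sketch leaves out the symbolic character sets and lookarounds treated in the paper and performs no simplification, so expressions grow as input is consumed.

```python
# Expressions are nested tuples (an illustrative encoding, not the paper's).
EMPTY, EPS = ("empty",), ("eps",)

def nullable(r):
    """True if the language of r contains the empty word."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "not":
        return not nullable(r[1])
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # "cat" and "and"

def deriv(a, r):
    """Brzozowski derivative of r with respect to the character a."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag in ("alt", "and"):
        # The derivative distributes over union and intersection.
        return (tag, deriv(a, r[1]), deriv(a, r[2]))
    if tag == "not":
        # The derivative commutes with complement.
        return ("not", deriv(a, r[1]))
    if tag == "star":
        return ("cat", deriv(a, r[1]), r)
    if tag == "cat":
        left = ("cat", deriv(a, r[1]), r[2])
        return ("alt", left, deriv(a, r[2])) if nullable(r[1]) else left
    raise ValueError(tag)

def matches(r, word):
    """Whole-word match: derive by each character, then test nullability."""
    for ch in word:
        r = deriv(ch, r)
    return nullable(r)

# Words over {a, b} that end with 'a' and do not contain "bb".
A, B = ("sym", "a"), ("sym", "b")
ANY = ("star", ("alt", A, B))
has_bb = ("cat", ("cat", ANY, ("cat", B, B)), ANY)
r = ("and", ("not", has_bb), ("cat", ANY, A))
print(matches(r, "aba"))   # True
print(matches(r, "abba"))  # False
```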