2,800 research outputs found

    On the State Complexity of Partial Derivative Automata For Regular Expressions with Intersection

    Get PDF
    Extended regular expressions (with complement and intersection) are used in many applications due to their succinctness. In particular, regular expressions extended with intersection only (also called semi-extended) can already be exponentially smaller than standard regular expressions or equivalent nondeterministic finite automata (NFA). For practical purposes it is important to study the average behaviour of conversions between these models. In this paper, we focus on the conversion of regular expressions with intersection to nondeterministic finite automata, using partial derivatives and the notion of support. First, we give a tight upper bound of 2O(n) for the worst-case number of states of the resulting partial derivative automaton, where n is the size of the expression. Using the framework of analytic combinatorics, we then establish an upper bound of (1.056 + o(1))n for its asymptotic average-state complexity, which is significantly smaller than the one for the worst case. (c) IFIP International Federation for Information Processing 2016

    From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity

    Full text link
    The equivalence of finite automata and regular expressions dates back to the seminal paper of Kleene on events in nerve nets and finite automata from 1956. In the present paper we tour a fragment of the literature and summarize results on upper and lower bounds on the conversion of finite automata to regular expressions and vice versa. We also briefly recall the known bounds for the removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free nondeterministic devices. Moreover, we report on recent results on the average case descriptional complexity bounds for the conversion of regular expressions to finite automata and brand new developments on the state elimination algorithm that converts finite automata to regular expressions.Comment: In Proceedings AFL 2014, arXiv:1405.527

    Partial Derivative Automaton for Regular Expressions with Shuffle

    Get PDF
    We generalize the partial derivative automaton to regular expressions with shuffle and study its size in the worst and in the average case. The number of states of the partial derivative automata is in the worst case at most 2^m, where m is the number of letters in the expression, while asymptotically and on average it is no more than (4/3)^m

    Testing the Equivalence of Regular Languages

    Full text link
    The minimal deterministic finite automaton is generally used to determine regular languages equality. Antimirov and Mosses proposed a rewrite system for deciding regular expressions equivalence of which Almeida et al. presented an improved variant. Hopcroft and Karp proposed an almost linear algorithm for testing the equivalence of two deterministic finite automata that avoids minimisation. In this paper we improve the best-case running time, present an extension of this algorithm to non-deterministic finite automata, and establish a relationship between this algorithm and the one proposed in Almeida et al. We also present some experimental comparative results. All these algorithms are closely related with the recent coalgebraic approach to automata proposed by Rutten

    Symbolic Solving of Extended Regular Expression Inequalities

    Get PDF
    This paper presents a new solution to the containment problem for extended regular expressions that extends basic regular expressions with intersection and complement operators and consider regular expressions on infinite alphabets based on potentially infinite character sets. Standard approaches deciding the containment do not take extended operators or character sets into account. The algorithm avoids the translation to an expression-equivalent automaton and provides a purely symbolic term rewriting systems for solving regular expressions inequalities. We give a new symbolic decision procedure for the containment problem based on Brzozowski's regular expression derivatives and Antimirov's rewriting approach to check containment. We generalize Brzozowski's syntactic derivative operator to two derivative operators that work with respect to (potentially infinite) representable character sets.Comment: Technical Repor

    Regular Expressions and Transducers over Alphabet-invariant and User-defined Labels

    Full text link
    We are interested in regular expressions and transducers that represent word relations in an alphabet-invariant way---for example, the set of all word pairs u,v where v is a prefix of u independently of what the alphabet is. Current software systems of formal language objects do not have a mechanism to define such objects. We define transducers in which transition labels involve what we call set specifications, some of which are alphabet invariant. In fact, we give a more broad definition of automata-type objects, called labelled graphs, where each transition label can be any string, as long as that string represents a subset of a certain monoid. Then, the behaviour of the labelled graph is a subset of that monoid. We do the same for regular expressions. We obtain extensions of a few classic algorithmic constructions on ordinary regular expressions and transducers at the broad level of labelled graphs and in such a way that the computational efficiency of the extended constructions is not sacrificed. For regular expressions with set specs we obtain the corresponding partial derivative automata. For transducers with set specs we obtain further algorithms that can be applied to questions about independent regular languages, in particular the witness version of the independent property satisfaction question

    Derivative Based Extended Regular Expression Matching Supporting Intersection, Complement and Lookarounds

    Full text link
    Regular expressions are widely used in software. Various regular expression engines support different combinations of extensions to classical regular constructs such as Kleene star, concatenation, nondeterministic choice (union in terms of match semantics). The extensions include e.g. anchors, lookarounds, counters, backreferences. The properties of combinations of such extensions have been subject of active recent research. In the current paper we present a symbolic derivatives based approach to finding matches to regular expressions that, in addition to the classical regular constructs, also support complement, intersection and lookarounds (both negative and positive lookaheads and lookbacks). The theory of computing symbolic derivatives and determining nullability given an input string is presented that shows that such a combination of extensions yields a match semantics that corresponds to an effective Boolean algebra, which in turn opens up possibilities of applying various Boolean logic rewrite rules to optimize the search for matches. In addition to the theoretical framework we present an implementation of the combination of extensions to demonstrate the efficacy of the approach accompanied with practical examples
    corecore