2,898 research outputs found

    State Elimination Ordering Strategies: Some Experimental Results

    Full text link
    Recently, the problem of obtaining a short regular expression equivalent to a given finite automaton has been intensively investigated. Algorithms for converting finite automata to regular expressions have an exponential blow-up in the worst-case. To overcome this, simple heuristic methods have been proposed. In this paper we analyse some of the heuristics presented in the literature and propose new ones. We also present some experimental comparative results based on uniform random generated deterministic finite automata.Comment: In Proceedings DCFS 2010, arXiv:1008.127

    Digraph Complexity Measures and Applications in Formal Language Theory

    Full text link
    We investigate structural complexity measures on digraphs, in particular the cycle rank. This concept is intimately related to a classical topic in formal language theory, namely the star height of regular languages. We explore this connection, and obtain several new algorithmic insights regarding both cycle rank and star height. Among other results, we show that computing the cycle rank is NP-complete, even for sparse digraphs of maximum outdegree 2. Notwithstanding, we provide both a polynomial-time approximation algorithm and an exponential-time exact algorithm for this problem. The former algorithm yields an O((log n)^(3/2))- approximation in polynomial time, whereas the latter yields the optimum solution, and runs in time and space O*(1.9129^n) on digraphs of maximum outdegree at most two. Regarding the star height problem, we identify a subclass of the regular languages for which we can precisely determine the computational complexity of the star height problem. Namely, the star height problem for bideterministic languages is NP-complete, and this holds already for binary alphabets. Then we translate the algorithmic results concerning cycle rank to the bideterministic star height problem, thus giving a polynomial-time approximation as well as a reasonably fast exact exponential algorithm for bideterministic star height.Comment: 19 pages, 1 figur

    Joining Extractions of Regular Expressions

    Get PDF
    Regular expressions with capture variables, also known as "regex formulas," extract relations of spans (interval positions) from text. These relations can be further manipulated via Relational Algebra as studied in the context of document spanners, Fagin et al.'s formal framework for information extraction. We investigate the complexity of querying text by Conjunctive Queries (CQs) and Unions of CQs (UCQs) on top of regex formulas. We show that the lower bounds (NP-completeness and W[1]-hardness) from the relational world also hold in our setting; in particular, hardness hits already single-character text! Yet, the upper bounds from the relational world do not carry over. Unlike the relational world, acyclic CQs, and even gamma-acyclic CQs, are hard to compute. The source of hardness is that it may be intractable to instantiate the relation defined by a regex formula, simply because it has an exponential number of tuples. Yet, we are able to establish general upper bounds. In particular, UCQs can be evaluated with polynomial delay, provided that every CQ has a bounded number of atoms (while unions and projection can be arbitrary). Furthermore, UCQ evaluation is solvable with FPT (Fixed-Parameter Tractable) delay when the parameter is the size of the UCQ

    Symbolic Algorithms for Language Equivalence and Kleene Algebra with Tests

    Get PDF
    We first propose algorithms for checking language equivalence of finite automata over a large alphabet. We use symbolic automata, where the transition function is compactly represented using a (multi-terminal) binary decision diagrams (BDD). The key idea consists in computing a bisimulation by exploring reachable pairs symbolically, so as to avoid redundancies. This idea can be combined with already existing optimisations, and we show in particular a nice integration with the disjoint sets forest data-structure from Hopcroft and Karp's standard algorithm. Then we consider Kleene algebra with tests (KAT), an algebraic theory that can be used for verification in various domains ranging from compiler optimisation to network programming analysis. This theory is decidable by reduction to language equivalence of automata on guarded strings, a particular kind of automata that have exponentially large alphabets. We propose several methods allowing to construct symbolic automata out of KAT expressions, based either on Brzozowski's derivatives or standard automata constructions. All in all, this results in efficient algorithms for deciding equivalence of KAT expressions

    From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity

    Full text link
    The equivalence of finite automata and regular expressions dates back to the seminal paper of Kleene on events in nerve nets and finite automata from 1956. In the present paper we tour a fragment of the literature and summarize results on upper and lower bounds on the conversion of finite automata to regular expressions and vice versa. We also briefly recall the known bounds for the removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free nondeterministic devices. Moreover, we report on recent results on the average case descriptional complexity bounds for the conversion of regular expressions to finite automata and brand new developments on the state elimination algorithm that converts finite automata to regular expressions.Comment: In Proceedings AFL 2014, arXiv:1405.527

    Two-variable Logic with Counting and a Linear Order

    Get PDF
    We study the finite satisfiability problem for the two-variable fragment of first-order logic extended with counting quantifiers (C2) and interpreted over linearly ordered structures. We show that the problem is undecidable in the case of two linear orders (in the presence of two other binary symbols). In the case of one linear order it is NEXPTIME-complete, even in the presence of the successor relation. Surprisingly, the complexity of the problem explodes when we add one binary symbol more: C2 with one linear order and in the presence of other binary predicate symbols is equivalent, under elementary reductions, to the emptiness problem for multicounter automata

    On Varieties of Automata Enriched with an Algebraic Structure (Extended Abstract)

    Full text link
    Eilenberg correspondence, based on the concept of syntactic monoids, relates varieties of regular languages with pseudovarieties of finite monoids. Various modifications of this correspondence related more general classes of regular languages with classes of more complex algebraic objects. Such generalized varieties also have natural counterparts formed by classes of finite automata equipped with a certain additional algebraic structure. In this survey, we overview several variants of such varieties of enriched automata.Comment: In Proceedings AFL 2014, arXiv:1405.527
    • …
    corecore