210,910 research outputs found

Boundedness in languages of infinite words

Author: Bojańczyk Mikołaj
Colcombet Thomas
Publication date: 25/10/2017

We define a new class of languages of $\omega$-words, strictly extending $\omega$-regular languages. One way to present this new class is by a type of regular expressions. The new expressions are an extension of $\omega$-regular expressions where two new variants of the Kleene star $L^*$ are added: $L^B$ and $L^S$. These new exponents are used to say that parts of the input word have bounded size, and that parts of the input can have arbitrarily large sizes, respectively. For instance, the expression $(a^Bb)^\omega$ represents the language of infinite words over the letters $a,b$ where there is a common bound on the number of consecutive letters $a$. The expression $(a^Sb)^\omega$ represents a similar language, but this time the distance between consecutive $b$'s is required to tend to infinity. We develop a theory for these languages, with a focus on decidability and closure. We define an equivalent automaton model, extending Büchi automata. The main technical result is a complementation lemma that works for languages where only one type of exponent (either $L^B$ or $L^S$) is used. We use the closure and decidability results to obtain partial decidability results for the logic MSOLB, a logic obtained by extending monadic second-order logic with new quantifiers that speak about the size of sets.
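
The quantity that the two exponents constrain can be illustrated on finite data. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, and the membership test only covers ultimately periodic words $u v^\omega$, assuming the reading of $(a^Bb)^\omega$ given in the abstract (infinitely many $b$'s, with a common bound on the $a$-blocks between them).

```python
def a_block_lengths(prefix: str) -> list[int]:
    """Lengths of the runs of 'a' that are closed by a 'b' in a finite prefix.

    A trailing run of 'a' that is not yet closed by a 'b' is ignored.
    """
    lengths, run = [], 0
    for letter in prefix:
        if letter == "a":
            run += 1
        elif letter == "b":
            lengths.append(run)
            run = 0
        else:
            raise ValueError(f"unexpected letter {letter!r}")
    return lengths


def in_bounded_language(u: str, v: str) -> bool:
    """Decide whether the ultimately periodic word u v^omega is in (a^B b)^omega.

    Assumption: membership means infinitely many b's with bounded a-blocks.
    For u v^omega the block lengths are eventually periodic, hence bounded,
    so only the presence of a 'b' in the repeated part v matters.
    """
    if not v:
        raise ValueError("the repeated part v must be non-empty")
    if not set(u + v) <= {"a", "b"}:
        raise ValueError("words must be over the letters a, b")
    return "b" in v


if __name__ == "__main__":
    print(a_block_lengths("abaabaaab"))      # block lengths seen so far
    print(in_bounded_language("aaab", "ab"))
```

A finite prefix can never decide boundedness of an infinite word on its own; the first function only exposes the block lengths. By the same periodicity argument, no ultimately periodic word has block lengths that tend to infinity, so none belongs to $(a^Sb)^\omega$ under this reading.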

arXiv.org e-Print Archive

Episciences.org

The practical efficiency of regular expression membership algorithms

Author: Gray Justin P.
Publication venue: Halifax, N.S. : Saint Mary's University
Publication date: 28/04/2022

1 online resource (71 pages): graphs, charts. Includes abstract. Includes bibliographical references (pages 69-71).

Regular expressions encode text patterns and define languages of symbolic words. The membership problem decides if a given word is an element of the language described by a given regular expression. This problem has various well-studied algorithms, but current research only shows asymptotic complexity and performance with respect to samples of randomly generated regular expressions. Our research aims to answer how the algorithms perform when practical, real-world regular expressions are run on a representative test set of words. A set of compatible regular expressions has been collected from public GitHub repositories. Each compatible expression (i.e., one with no backreferences or improper formatting) is then converted into an equivalent unambiguous mathematical representation. For each distinct expression, we have tested Thompson, Glushkov, position, follow, and partial derivative NFA constructions, as well as partial derivatives and exponential backtracking directly on the regular expression tree. These algorithms have been implemented in a modified version of Python's FAdo library and include UNIX-inspired extensions such as character classes, the wild dot, and UTF-8 support. We find that efficiently constructing a small NFA is the best approach to this problem; using follow and PDDAG algorithms are experimentally shown as the best
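
One of the NFA constructions named in this abstract can be sketched in a few lines. The code below is an illustrative position automaton in the style of Glushkov, followed by a set-based simulation for membership. It does not use FAdo's API; the tuple encoding of expressions and the names `glushkov` and `accepts` are assumptions made for this sketch only.

```python
# Expressions are nested tuples (a hypothetical encoding for this sketch):
#   ("eps",)  ("sym", "a")  ("alt", r, s)  ("cat", r, s)  ("star", r)

def glushkov(regex):
    """Return (symbols, follow, nullable, first, last) for the expression.

    symbols[i] is the letter at position i; follow[i] is the set of
    positions that may come directly after position i in some word.
    """
    symbols, follow = [], []

    def walk(node):
        kind = node[0]
        if kind == "eps":
            return True, set(), set()
        if kind == "sym":
            symbols.append(node[1])
            follow.append(set())
            pos = len(symbols) - 1
            return False, {pos}, {pos}
        if kind == "alt":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            for p in l1:                      # last of r is followed by first of s
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == "star":
            _, f, l = walk(node[1])
            for p in l:                       # loop back from last to first
                follow[p] |= f
            return True, f, l
        raise ValueError(f"unknown node kind {kind!r}")

    nullable, first, last = walk(regex)
    return symbols, follow, nullable, first, last


def accepts(regex, word):
    """Membership test by simulating the position automaton on word."""
    symbols, follow, nullable, first, last = glushkov(regex)
    if not word:
        return nullable
    current = {p for p in first if symbols[p] == word[0]}
    for letter in word[1:]:
        current = {q for p in current for q in follow[p] if symbols[q] == letter}
    return bool(current & last)


if __name__ == "__main__":
    a, b = ("sym", "a"), ("sym", "b")
    # (a|b)*abb
    r = ("cat", ("cat", ("cat", ("star", ("alt", a, b)), a), b), b)
    print(accepts(r, "aabb"), accepts(r, "ab"))
```

The automaton has one state per letter occurrence plus an initial state, and the simulation never backtracks, which is the kind of trade-off the thesis measures against backtracking on the expression tree.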

Saint Mary's University, Halifax: Institutional Repository

Widths of regular and context-free languages

Author: Mestel David
Publication date: 01/01/2019

Given a partially-ordered finite alphabet $\Sigma$ and a language $L \subseteq \Sigma^*$, how large can an antichain in $L$ be (where $L$ is given the lexicographic ordering)? More precisely, since $L$ will in general be infinite, we should ask about the rate of growth of maximum antichains consisting of words of length $n$. This fundamental property of partial orders is known as the width, and in a companion work we show that the problem of computing the information leakage permitted by a deterministic interactive system modeled as a finite-state transducer can be reduced to the problem of computing the width of a certain regular language. In this paper, we show that if $L$ is regular then there is a dichotomy between polynomial and exponential antichain growth. We give a polynomial-time algorithm to distinguish the two cases, and to compute the order of polynomial growth, with the language specified as an NFA. For context-free languages we show that there is a similar dichotomy, but now the problem of distinguishing the two cases is undecidable. Finally, we generalise the lexicographic order to tree languages, and show that for regular tree languages there is a trichotomy between polynomial, exponential and doubly exponential antichain growth.

Comment: 22 pages

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Open Repository and Bibliography - Luxembourg

Large Aperiodic Semigroups

Author: A. Kisielewicz
G. Gomes
J. Brzozowski
J.E. Pin
J.M. Howie
M. Schützenberger
R. Diestel
S. Yu
Publication date: 01/01/2014

The syntactic complexity of a regular language is the size of its syntactic semigroup. This semigroup is isomorphic to the transition semigroup of the minimal deterministic finite automaton accepting the language, that is, to the semigroup generated by transformations induced by non-empty words on the set of states of the automaton. In this paper we search for the largest syntactic semigroup of a star-free language having $n$ left quotients; equivalently, we look for the largest transition semigroup of an aperiodic finite automaton with $n$ states. We introduce two new aperiodic transition semigroups. The first is generated by transformations that change only one state; we call such transformations and resulting semigroups unitary. In particular, we study complete unitary semigroups which have a special structure, and we show that each maximal unitary semigroup is complete. For $n \ge 4$ there exists a complete unitary semigroup that is larger than any aperiodic semigroup known to date. We then present even larger aperiodic semigroups, generated by transformations that map a non-empty subset of states to a single state; we call such transformations and semigroups semiconstant. In particular, we examine semiconstant tree semigroups which have a structure based on full binary trees. The semiconstant tree semigroups are at present the best candidates for largest aperiodic semigroups. We also prove that $2^n-1$ is an upper bound on the state complexity of reversal of star-free languages, and resolve an open problem about a special case of state complexity of concatenation of star-free languages.

Comment: 22 pages, 1 figure, 2 tables
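
The definition of syntactic complexity used here can be made concrete with a small computation. The sketch below is an illustration under stated assumptions, not the paper's method: a transformation on states $0,\dots,n-1$ is encoded as a tuple, the generators are the transformations induced by the letters of a DFA that is assumed to be minimal, and the function names are hypothetical.

```python
def compose(t, g):
    """Transformation of the word w followed by the letter a: apply t, then g."""
    return tuple(g[t[q]] for q in range(len(t)))


def transition_semigroup(generators):
    """All transformations induced by non-empty words over the generators."""
    gens = [tuple(g) for g in generators]
    seen = set(gens)
    frontier = list(gens)
    while frontier:
        t = frontier.pop()
        for g in gens:
            composed = compose(t, g)
            if composed not in seen:
                seen.add(composed)
                frontier.append(composed)
    return seen


def is_aperiodic(semigroup):
    """True if every element t satisfies t^k == t^(k+1) for some k."""
    for t in semigroup:
        powers = [t]
        while True:
            nxt = compose(powers[-1], t)
            if nxt == powers[-1]:
                break                 # powers stabilise: trivial cycle
            if nxt in powers:
                return False          # powers enter a cycle of length > 1
            powers.append(nxt)
    return True


if __name__ == "__main__":
    # Two letters acting on states 0, 1, 2 (an example chosen for this sketch).
    a = (1, 2, 2)   # a moves every state one step up, saturating at 2
    b = (0, 0, 1)   # b moves every state one step down, saturating at 0
    s = transition_semigroup([a, b])
    print(len(s), is_aperiodic(s))
```

If the generators come from a minimal DFA, `len(s)` is the syntactic complexity of the accepted language. A cyclic generator such as `(1, 2, 0)` fails the aperiodicity check, since its powers cycle with period 3.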

arXiv.org e-Print Archive

CiteSeerX

University of Waterloo's Institutional Repository

Crossref

Two-Sided Derivatives for Regular Expressions and for Hairpin Expressions

Author: F. Manea
J.-M. Champarnaud
J.M. Sempere
L. Kari
P. Caron
S. Kleene
S. Lombardy
V. Antimirov
V. Diekert
Publication date: 01/01/2013

The aim of this paper is to design the polynomial construction of a finite recognizer for hairpin completions of regular languages. This is achieved by considering completions as new expression operators and by applying derivation techniques to the associated extended expressions called hairpin expressions. More precisely, we extend partial derivation of regular expressions to two-sided partial derivation of hairpin expressions and we show how to deduce a recognizer for a hairpin expression from its two-sided derived term automaton, providing an alternative proof of the fact that hairpin completions of regular languages are linear context-free.

Comment: 28 pages

arXiv.org e-Print Archive

HAL - Normandie Université

Crossref

From Finite Automata to Regular Expressions and Back--A Summary on Descriptional Complexity

Author: Gruber Hermann
Holzer Markus
Publication venue: 'Open Publishing Association'
Publication date: 01/05/2014

The equivalence of finite automata and regular expressions dates back to the seminal paper of Kleene on events in nerve nets and finite automata from 1956. In the present paper we tour a fragment of the literature and summarize results on upper and lower bounds on the conversion of finite automata to regular expressions and vice versa. We also briefly recall the known bounds for the removal of spontaneous transitions (epsilon-transitions) on non-epsilon-free nondeterministic devices. Moreover, we report on recent results on the average case descriptional complexity bounds for the conversion of regular expressions to finite automata and brand new developments on the state elimination algorithm that converts finite automata to regular expressions.

Comment: In Proceedings AFL 2014, arXiv:1405.527

arXiv.org e-Print Archive

Directory of Open Access Journals

Partial Derivative Automaton for Regular Expressions with Shuffle

Author: Broda Sabine
Machiavelo António
Moreira Nelma
Reis Rogério
Publication date: 01/01/2015

We generalize the partial derivative automaton to regular expressions with shuffle and study its size in the worst and in the average case. The number of states of the partial derivative automaton is in the worst case at most $2^m$, where $m$ is the number of letters in the expression, while asymptotically and on average it is no more than $(4/3)^m$.
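
The worst-case bound can be observed on a small instance. The sketch below is a simplified illustration, not the authors' implementation: it uses Antimirov-style partial derivatives with the usual shuffle rule (derive one operand and keep the other), a hypothetical tuple encoding of expressions, and only one simplification (an empty-word prefix is dropped from a concatenation), so state counts for other expressions may differ from those obtained with the identities used in the paper.

```python
# Expressions are nested tuples (a hypothetical encoding for this sketch):
#   ("eps",) ("sym", "a") ("alt", r, s) ("cat", r, s) ("star", r) ("shuffle", r, s)
EPS = ("eps",)


def nullable(r):
    """True if the empty word belongs to the language of r."""
    kind = r[0]
    if kind in ("eps", "star"):
        return True
    if kind == "sym":
        return False
    if kind == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])      # cat and shuffle


def cat(r, s):
    """Concatenation that drops a leading empty word."""
    return s if r == EPS else ("cat", r, s)


def pd(r, a):
    """Set of partial derivatives of r with respect to the letter a."""
    kind = r[0]
    if kind == "eps":
        return set()
    if kind == "sym":
        return {EPS} if r[1] == a else set()
    if kind == "alt":
        return pd(r[1], a) | pd(r[2], a)
    if kind == "star":
        return {cat(d, r) for d in pd(r[1], a)}
    if kind == "cat":
        out = {cat(d, r[2]) for d in pd(r[1], a)}
        return (out | pd(r[2], a)) if nullable(r[1]) else out
    if kind == "shuffle":
        left = {("shuffle", d, r[2]) for d in pd(r[1], a)}
        right = {("shuffle", r[1], d) for d in pd(r[2], a)}
        return left | right
    raise ValueError(f"unknown node kind {kind!r}")


def pd_states(r, alphabet):
    """States of the partial derivative automaton reachable from r."""
    states, frontier = {r}, [r]
    while frontier:
        q = frontier.pop()
        for a in alphabet:
            for t in pd(q, a):
                if t not in states:
                    states.add(t)
                    frontier.append(t)
    return states


def accepts(r, word):
    """Membership test: follow partial derivatives letter by letter."""
    current = {r}
    for a in word:
        current = {t for q in current for t in pd(q, a)}
    return any(nullable(q) for q in current)


if __name__ == "__main__":
    a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
    r = ("shuffle", a, ("shuffle", b, c))   # m = 3 letters
    print(len(pd_states(r, "abc")), accepts(r, "bca"))
```

For the shuffle of $m$ distinct letters, each reachable state records which letters have already been read, so this sketch reaches $2^m$ states on that family, matching the stated worst-case bound.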

arXiv.org e-Print Archive

Repositório Aberto da Universidade do Porto