11 research outputs found

On lexicographic enumeration of regular and context-free languages

Author: Mäkinen Erkki
Publication date: 01/01/1997

We show that it is possible to efficiently enumerate the words of a regular language in lexicographic order. The time needed for generating the next word is O(n) when enumerating words of length n. We also define a class of context-free languages for which efficient enumeration is possible.
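As an illustration of the task described in this abstract, the following is a minimal sketch that lists the length-n words of a regular language in lexicographic order. It is not the algorithm from the paper and makes no attempt to match its O(n) bound per word. It assumes the language is given as a deterministic finite automaton encoded as a transition dictionary; the function name `enumerate_words` and the example automaton `NO_BB` are hypothetical names chosen for this sketch.

```python
from typing import Dict, Iterator, List, Set, Tuple


def enumerate_words(
    delta: Dict[Tuple[int, str], int],
    start: int,
    accepting: Set[int],
    n: int,
) -> Iterator[str]:
    """Yield every length-n word accepted by the DFA, in lexicographic order."""
    alphabet = sorted({a for (_, a) in delta})
    states = {start} | {q for (q, _) in delta} | set(delta.values())

    # live[k] = states from which an accepting state is reachable in exactly k steps.
    live: List[Set[int]] = [set(accepting) & states]
    for _ in range(n):
        prev = live[-1]
        live.append(
            {q for q in states if any(delta.get((q, a)) in prev for a in alphabet)}
        )

    def extend(q: int, prefix: List[str]) -> Iterator[str]:
        remaining = n - len(prefix)
        if remaining == 0:
            yield "".join(prefix)
            return
        # Trying letters in sorted order gives lexicographic output order;
        # the live-state check prunes branches that cannot reach acceptance.
        for a in alphabet:
            nxt = delta.get((q, a))
            if nxt is not None and nxt in live[remaining - 1]:
                prefix.append(a)
                yield from extend(nxt, prefix)
                prefix.pop()

    if start in live[n]:
        yield from extend(start, [])


# Hypothetical example: words over {a, b} that never contain "bb".
# State 0 = last letter was not b, state 1 = last letter was b.
NO_BB = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0}

print(list(enumerate_words(NO_BB, 0, {0, 1}, 3)))
# → ['aaa', 'aab', 'aba', 'baa', 'bab']
```

The precomputed `live` table means the search never enters a branch that cannot be completed to an accepted word, so every step of the depth-first walk contributes to some output.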

University of Szeged

Acta Cybernetica : Volume 13. Number 1.

Publication date: 01/01/1997

University of Szeged

Grammars for Document Spanners

Author: Peterfreund Liat
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 24th International Conference on Database Theory (ICDT 2021)
Publication date: 01/01/2021

We propose a new grammar-based language for defining information-extractors from documents (text) that is built upon the well-studied framework of document spanners for extracting structured data from text. While previously studied formalisms for document spanners are mainly based on regular expressions, we use an extension of context-free grammars, called extraction grammars, to define the new class of context-free spanners. Extraction grammars are simply context-free grammars extended with variables that capture interval positions of the document, namely spans. While regular expressions are efficient for tokenizing and tagging, context-free grammars are also efficient for capturing structural properties. Indeed, we show that context-free spanners are strictly more expressive than their regular counterparts. We reason about the expressive power of our new class and present a pushdown-automata model that captures it. We show that extraction grammars can be evaluated with polynomial data complexity. Nevertheless, as the degree of the polynomial depends on the query, we present an enumeration algorithm for unambiguous extraction grammars that, after quintic preprocessing, outputs the results sequentially, without repetitions, with a constant delay between every two consecutive ones.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Dagstuhl Research Online Publication Server

Enumerating Regular Languages with Bounded Delay

Author: Amarilli Antoine
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 40th International Symposium on Theoretical Aspects of Computer Science (STACS 2023)
Publication date: 01/01/2023

Dagstuhl Research Online Publication Server

Detecting palindromes, patterns, and borders in regular languages

Authors: Anderson Terry; Loftus John; Rampersad Narad; Santean Nicolae; Shallit Jeffrey
Publication date: 09/06/2008

Given a language L and a nondeterministic finite automaton M, we consider whether we can determine efficiently (in the size of M) if M accepts at least one word in L, or infinitely many words. Given that M accepts at least one word in L, we consider how long a shortest word can be. The languages L that we examine include the palindromes, the non-palindromes, the k-powers, the non-k-powers, the powers, the non-powers (also called primitive words), the words matching a general pattern, the bordered words, and the unbordered words.

Comment: Full version of a paper submitted to LATA 2008. This is a new version with John Loftus added as a co-author and containing new results on unbordered words.
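To make the first question concrete for the case where L is the set of palindromes, here is a minimal sketch of one standard way to test whether an NFA accepts at least one palindrome. It is not taken from the paper. It searches pairs of states (p, q) such that some word x leads from a start state to p while the reversed word leads from q to a final state. The encoding of the automaton as a set of (source, letter, target) triples and the function name `accepts_palindrome` are assumptions made for this sketch; epsilon transitions are not handled.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def accepts_palindrome(
    transitions: Iterable[Tuple[int, str, int]],
    starts: Set[int],
    finals: Set[int],
) -> bool:
    """Return True if the NFA accepts at least one palindrome."""
    fwd: Dict[int, List[Tuple[str, int]]] = {}  # p -> [(letter, successor)]
    bwd: Dict[int, List[Tuple[str, int]]] = {}  # q -> [(letter, predecessor)]
    for p, a, q in transitions:
        fwd.setdefault(p, []).append((a, q))
        bwd.setdefault(q, []).append((a, p))

    # Invariant for a pair (p, q): there is a word x with start --x--> p
    # and q --reverse(x)--> final. Initially x is the empty word.
    queue = deque((s, f) for s in starts for f in finals)
    seen = set(queue)
    while queue:
        p, q = queue.popleft()
        if p == q:
            return True  # even-length palindrome x + reverse(x)
        if any(target == q for _, target in fwd.get(p, [])):
            return True  # odd-length palindrome x + c + reverse(x)
        # Extend x by one letter on the left half and mirror it on the right half.
        for a, p2 in fwd.get(p, []):
            for b, q2 in bwd.get(q, []):
                if a == b and (p2, q2) not in seen:
                    seen.add((p2, q2))
                    queue.append((p2, q2))
    return False


# Hypothetical examples: a chain automaton accepting only "ab",
# and one accepting only "aba".
only_ab = {(0, "a", 1), (1, "b", 2)}
only_aba = {(0, "a", 1), (1, "b", 2), (2, "a", 3)}

print(accepts_palindrome(only_ab, {0}, {2}))   # → False
print(accepts_palindrome(only_aba, {0}, {3}))  # → True
```

The search visits each pair of states at most once, so the number of pairs explored is at most quadratic in the number of states of M.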

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Evaluation and Enumeration Problems for Regular Path Queries

Authors: Martens Wim; Trautner Tina
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 21st International Conference on Database Theory (ICDT 2018)
Publication date: 01/01/2018

Regular path queries (RPQs) are a central component of graph databases. We investigate decision and enumeration problems concerning the evaluation of RPQs under several semantics that have recently been considered: arbitrary paths, shortest paths, and simple paths. Whereas arbitrary and shortest paths can be enumerated in polynomial delay, the situation is much more intricate for simple paths. For instance, already the question of whether a given graph contains a simple path of a certain length has cases with highly non-trivial solutions and cases that are long-standing open problems. We study RPQ evaluation for simple paths from a parameterized complexity perspective and define a class of simple transitive expressions that is prominent in practice and for which we can prove a dichotomy for the evaluation problem. We observe that, even though simple path semantics is intractable for RPQs in general, it is feasible for the vast majority of RPQs that are used in practice. At the heart of our study on simple paths is a result of independent interest: the two disjoint paths problem in directed graphs is W[1]-hard if parameterized by the length of one of the two paths.

Dagstuhl Research Online Publication Server

On the structural and combinatorial properties in 2-swap word permutation graphs

Authors: Adamson Duncan; Flaherty Nathan; Potapov Igor; Spirakis Paul G
Publication date: 04/07/2023

In this paper, we study the graph induced by the 2-swap permutation on words with a fixed Parikh vector. A 2-swap is defined as a pair of positions $s = (i, j)$ where the word $w$ induced by the swap $s$ on $v$ is $v[1] v[2] \dots v[i - 1] v[j] v[i + 1] \dots v[j - 1] v[i] v[j + 1] \dots v[n]$. With these permutations, we define the Configuration Graph $G(P)$ defined over a given Parikh vector. Each vertex in $G(P)$ corresponds to a unique word with the Parikh vector $P$, with an edge between any pair of words $v$ and $w$ if there exists a swap $s$ such that $v \circ s = w$. We provide several key combinatorial properties of this graph, including the exact diameter of this graph, the clique number of the graph, and the relationships between subgraphs within this graph. Additionally, we show that for every vertex in the graph, there exists a Hamiltonian path starting at this vertex. Finally, we provide an algorithm enumerating these paths from a given input word of length $n$ with a delay of at most $O(\log n)$ between outputting edges, requiring $O(n \log n)$ preprocessing.
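The definitions in this abstract can be illustrated with a short sketch that applies a 2-swap to a word and lists the neighbours of a word in the configuration graph. It is a brute-force illustration only, not the enumeration algorithm from the paper. The function names `apply_swap`, `parikh_vector`, and `neighbours` are hypothetical, positions are 1-indexed as in the abstract, and the sketch assumes the graph has no self-loops, so swaps of two equal letters are ignored.

```python
from collections import Counter
from itertools import combinations
from typing import Set, Tuple


def apply_swap(v: str, s: Tuple[int, int]) -> str:
    """Return the word obtained from v by exchanging positions i and j (1-indexed)."""
    i, j = s
    chars = list(v)
    chars[i - 1], chars[j - 1] = chars[j - 1], chars[i - 1]
    return "".join(chars)


def parikh_vector(w: str) -> Counter:
    """Count the occurrences of each letter; a 2-swap never changes this."""
    return Counter(w)


def neighbours(v: str) -> Set[str]:
    """Words adjacent to v in the configuration graph of v's Parikh vector."""
    result = set()
    for i, j in combinations(range(1, len(v) + 1), 2):
        w = apply_swap(v, (i, j))
        if w != v:  # assumption: no self-loops, so skip swaps of equal letters
            result.add(w)
    return result


print(apply_swap("abc", (1, 3)))   # → cba
print(sorted(neighbours("aab")))   # → ['aba', 'baa']
```

Because a swap only rearranges letters, every neighbour returned has the same Parikh vector as the input word, which is why all of them are vertices of the same graph.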

University of Liverpool Repository