3,039 research outputs found

Minimization of deterministic top-down tree automata

Author: Fülöp Zoltán
Vágvölgyi Sándor
Publication date: 01/01/2017

We consider offline sensing unranked top-down tree automata in which the state transitions are computed by bimachines. We give a polynomial-time algorithm for minimizing such tree automata when they are state-separated.

University of Szeged

Determinization and Minimization of Automata for Nested Words Revisited

Author: Niehren Joachim
Sakho Momar
Publication venue: 'MDPI AG'
Publication date: 24/02/2021

We consider the problem of determinizing and minimizing automata for nested words in practice. For this we compile the nested regular expressions (NREs) from the usual XPath benchmark to nested word automata (NWAs). The determinization of these NWAs, however, fails to produce reasonably small automata. In the best case, huge deterministic NWAs are produced after a few hours, even for relatively small NREs of the benchmark. We propose a different approach to the determinization of automata for nested words. For this, we introduce stepwise hedge automata (SHAs) that naturally generalize both (stepwise) tree automata and finite word automata. We then show how to determinize SHAs, yielding reasonably small deterministic automata for the NREs from the XPath benchmark. The size of deterministic SHAs can be reduced further by a novel minimization algorithm for a subclass of SHAs. In order to understand why the new approach to determinization and minimization works so nicely, we investigate the relationship between NWAs and SHAs further. Clearly, deterministic SHAs can be compiled to deterministic NWAs in linear time, and conversely, NWAs can be compiled to nondeterministic SHAs in polynomial time. Therefore, we can use SHAs as intermediates for determinizing NWAs, while avoiding the huge size increase of the usual determinization algorithm for NWAs. Notably, the NWAs obtained from the SHAs perform bottom-up and left-to-right computations only, but no top-down computations. This NWA behavior can be distinguished syntactically by the (weak) single-entry property, suggesting a close relationship between SHAs and single-entry NWAs. In particular, it turns out that the usual determinization algorithm for NWAs behaves well for single-entry NWAs, while it quickly explodes without the single-entry property. Furthermore, it is known that the class of deterministic multi-module single-entry NWAs enjoys unique minimization. The subclass of deterministic SHAs to which our novel minimization algorithm applies is different, though, in that we do not impose multiple modules. As further optimizations for reducing the sizes of the constructed SHAs, we propose schema-based cleaning and symbolic representations based on apply-else rules, which can be maintained by determinization. We implemented the optimizations and report the experimental results for the automata constructed for the XPathMark benchmark.

Multidisciplinary Digital Publishing Institute

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Operational State Complexity of Deterministic Unranked Tree Automata

Author: Giovanni Pighizzini
Ian McQuillan
Kai Salomaa
Xiaoxue Piao
Publication venue: 'Open Publishing Association'
Publication date: 01/08/2010

We consider the state complexity of basic operations on tree languages recognized by deterministic unranked tree automata. For the operations of union and intersection the upper and lower bounds of both weakly and strongly deterministic tree automata are obtained. For tree concatenation we establish a tight upper bound that is of a different order than the known state complexity of concatenation of regular string languages. We show that $(n+1)\bigl((m+1)2^n - 2^{n-1}\bigr) - 1$ vertical states are sufficient, and necessary in the worst case, to recognize the concatenation of tree languages recognized by (strongly or weakly) deterministic automata with, respectively, $m$ and $n$ vertical states. Comment: In Proceedings DCFS 2010, arXiv:1008.127
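To get a feel for how fast the concatenation bound quoted in this abstract grows, the expression can simply be evaluated for small values. The following is a minimal sketch: the function name `concat_vertical_state_bound` is hypothetical, and the restriction to m, n >= 1 is an assumption made here to keep the arithmetic in integers, not a condition taken from the paper.

```python
def concat_vertical_state_bound(m: int, n: int) -> int:
    """Evaluate (n+1) * ((m+1) * 2^n - 2^(n-1)) - 1, the bound quoted in the abstract.

    m and n are the numbers of vertical states of the two input automata.
    The check m, n >= 1 is an assumption of this sketch.
    """
    if m < 1 or n < 1:
        raise ValueError("this sketch assumes m >= 1 and n >= 1")
    return (n + 1) * ((m + 1) * 2**n - 2 ** (n - 1)) - 1


if __name__ == "__main__":
    # Print the bound for a few small automata sizes.
    for m in range(1, 4):
        for n in range(1, 4):
            print(f"m={m}, n={n}: {concat_vertical_state_bound(m, n)}")
```

For instance, `concat_vertical_state_bound(2, 2)` evaluates to 29.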

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Practical experiments with regular approximation of context-free languages

Author: Nederhof Mark-Jan
Publication date: 25/10/1999

Several methods are discussed that construct a finite automaton given a context-free grammar, including both methods that lead to subsets and those that lead to supersets of the original context-free language. Some of these methods of regular approximation are new, and some others are presented here in a more refined form with respect to existing literature. Practical experiments with the different methods of regular approximation are performed for spoken-language input: hypotheses from a speech recognizer are filtered through a finite automaton. Comment: 28 pages. To appear in Computational Linguistics 26(1), March 200

arXiv.org e-Print Archive

CiteSeerX

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Regular Languages meet Prefix Sorting

Author: Alanko Jarno
D'Agostino Giovanna
Policriti Alberto
Prezza Nicola
Publication date: 09/07/2019

Indexing strings via prefix (or suffix) sorting is, arguably, one of the most successful algorithmic techniques developed in the last decades. Can indexing be extended to languages? The main contribution of this paper is to initiate the study of the sub-class of regular languages accepted by an automaton whose states can be prefix-sorted. Starting from the recent notion of Wheeler graph [Gagie et al., TCS 2017], which naturally extends the concept of prefix sorting to labeled graphs, we investigate the properties of Wheeler languages, that is, regular languages admitting an accepting Wheeler finite automaton. Interestingly, we characterize this family as the natural extension of regular languages endowed with the co-lexicographic ordering: when sorted, the strings belonging to a Wheeler language are partitioned into a finite number of co-lexicographic intervals, each formed by elements from a single Myhill-Nerode equivalence class. Moreover: (i) We show that every Wheeler NFA (WNFA) with $n$ states admits an equivalent Wheeler DFA (WDFA) with at most $2n-1-|\Sigma|$ states that can be computed in $O(n^3)$ time. This is in sharp contrast with general NFAs. (ii) We describe a quadratic algorithm to prefix-sort a proper superset of the WDFAs, an $O(n\log n)$-time online algorithm to sort acyclic WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs. By contribution (i), our algorithms can also be used to index any WNFA at the moderate price of doubling the automaton's size. (iii) We provide a minimization theorem that characterizes the smallest WDFA recognizing the same language as any input WDFA. The corresponding constructive algorithm runs in optimal linear time in the acyclic case, and in $O(n\log n)$ time in the general case. (iv) We show how to compute the smallest WDFA equivalent to any acyclic DFA in nearly-optimal time. Comment: added minimization theorems; uploaded submitted version; New version with new results (W-MH theorem, linear determinization), added author: Giovanna D'Agostin
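The co-lexicographic ordering mentioned in this abstract compares strings from their last character backwards. As a minimal sketch of that ordering only (not of the paper's sorting or indexing algorithms), it can be obtained by sorting on the reversed strings; the helper name `colex_sorted` and the sample strings are illustrative assumptions.

```python
from typing import Iterable, List


def colex_sorted(strings: Iterable[str]) -> List[str]:
    """Sort strings co-lexicographically, i.e. by comparing them right to left.

    Sorting on the reversed string gives exactly this order, because
    Python compares the reversed strings character by character from
    what was originally the last character.
    """
    return sorted(strings, key=lambda s: s[::-1])


if __name__ == "__main__":
    sample = ["ab", "ba", "aa", "b", "a"]
    print(colex_sorted(sample))  # ['a', 'aa', 'ba', 'b', 'ab']
```

In the sorted output, strings ending in `a` come before strings ending in `b`, and ties are broken by the preceding characters.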

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma