Search CORE

5,320 research outputs found

Regular Languages meet Prefix Sorting

Author: Alanko Jarno
D'Agostino Giovanna
Policriti Alberto
Prezza Nicola
Publication venue
Publication date: 09/07/2019
Field of study

Indexing strings via prefix (or suffix) sorting is, arguably, one of the most successful algorithmic techniques developed in the last decades. Can indexing be extended to languages? The main contribution of this paper is to initiate the study of the sub-class of regular languages accepted by an automaton whose states can be prefix-sorted. Starting from the recent notion of Wheeler graph [Gagie et al., TCS 2017]-which extends naturally the concept of prefix sorting to labeled graphs-we investigate the properties of Wheeler languages, that is, regular languages admitting an accepting Wheeler finite automaton. Interestingly, we characterize this family as the natural extension of regular languages endowed with the co-lexicographic ordering: when sorted, the strings belonging to a Wheeler language are partitioned into a finite number of co-lexicographic intervals, each formed by elements from a single Myhill-Nerode equivalence class. Moreover: (i) We show that every Wheeler NFA (WNFA) with

n

states admits an equivalent Wheeler DFA (WDFA) with at most

2n-1-|\Sigma|

states that can be computed in

O(n^3)

time. This is in sharp contrast with general NFAs. (ii) We describe a quadratic algorithm to prefix-sort a proper superset of the WDFAs, a

O(n\log n)

-time online algorithm to sort acyclic WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs. By contribution (i), our algorithms can also be used to index any WNFA at the moderate price of doubling the automaton's size. (iii) We provide a minimization theorem that characterizes the smallest WDFA recognizing the same language of any input WDFA. The corresponding constructive algorithm runs in optimal linear time in the acyclic case, and in

O(n\log n)

time in the general case. (iv) We show how to compute the smallest WDFA equivalent to any acyclic DFA in nearly-optimal time.Comment: added minimization theorems; uploaded submitted version; New version with new results (W-MH theorem, linear determinization), added author: Giovanna D'Agostin

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- LUISS Libera Università Internazionale degli Studi Sociali Guido Carli di Roma

JGraphT -- A Java library for graph data structures and algorithms

Author: Kinable Joris
Michail Dimitrios
Naveh Barak
Sichi John V
Publication venue
Publication date: 03/02/2020
Field of study

Mathematical software and graph-theoretical algorithmic packages to efficiently model, analyze and query graphs are crucial in an era where large-scale spatial, societal and economic network data are abundantly available. One such package is JGraphT, a programming library which contains very efficient and generic graph data-structures along with a large collection of state-of-the-art algorithms. The library is written in Java with stability, interoperability and performance in mind. A distinctive feature of this library is the ability to model vertices and edges as arbitrary objects, thereby permitting natural representations of many common networks including transportation, social and biological networks. Besides classic graph algorithms such as shortest-paths and spanning-tree algorithms, the library contains numerous advanced algorithms: graph and subgraph isomorphism; matching and flow problems; approximation algorithms for NP-hard problems such as independent set and TSP; and several more exotic algorithms such as Berge graph detection. Due to its versatility and generic design, JGraphT is currently used in large-scale commercial, non-commercial and academic research projects. In this work we describe in detail the design and underlying structure of the library, and discuss its most important features and algorithms. A computational study is conducted to evaluate the performance of JGraphT versus a number of similar libraries. Experiments on a large number of graphs over a variety of popular algorithms show that JGraphT is highly competitive with other established libraries such as NetworkX or the BGL.Comment: Major Revisio

arXiv.org e-Print Archive

Pure OAI Repository

VPSPACE and a transfer theorem over the complex field

Author: Koiran Pascal
Perifel Sylvain
Publication venue
Publication date: 01/01/2006
Field of study

We extend the transfer theorem of [KP2007] to the complex field. That is, we investigate the links between the class VPSPACE of families of polynomials and the Blum-Shub-Smale model of computation over C. Roughly speaking, a family of polynomials is in VPSPACE if its coefficients can be computed in polynomial space. Our main result is that if (uniform, constant-free) VPSPACE families can be evaluated efficiently then the class PAR of decision problems that can be solved in parallel polynomial time over the complex field collapses to P. As a result, one must first be able to show that there are VPSPACE families which are hard to evaluate in order to separate P from NP over C, or even from PAR.Comment: 14 page

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Space efficient merging of de Bruijn graphs and Wheeler graphs

Author: Egidi Lavinia
Louza Felipe A.
Manzini Giovanni
Publication venue
Publication date: 12/07/2021
Field of study

The merging of succinct data structures is a well established technique for the space efficient construction of large succinct indexes. In the first part of the paper we propose a new algorithm for merging succinct representations of de Bruijn graphs. Our algorithm has the same asymptotic cost of the state of the art algorithm for the same problem but it uses less than half of its working space. A novel important feature of our algorithm, not found in any of the existing tools, is that it can compute the Variable Order succinct representation of the union graph within the same asymptotic time/space bounds. In the second part of the paper we consider the more general problem of merging succinct representations of Wheeler graphs, a recently introduced graph family which includes as special cases de Bruijn graphs and many other known succinct indexes based on the BWT or one of its variants. We show that Wheeler graphs merging is in general a much more difficult problem, and we provide a space efficient algorithm for the slightly simplified problem of determining whether the union graph has an ordering that satisfies the Wheeler conditions.Comment: 24 pages, 10 figures. arXiv admin note: text overlap with arXiv:1902.0288

arXiv.org e-Print Archive

Archivio Istituzionale della Ricerca- Università del Piemonte Orientale

Logic and Automata

Author
Publication venue: 'Amsterdam University Press'
Publication date
Field of study

Mathematical logic and automata theory are two scientific disciplines with a fundamentally close relationship. The authors of Logic and Automata take the occasion of the sixtieth birthday of Wolfgang Thomas to present a tour d'horizon of automata theory and logic. The twenty papers in this volume cover many different facets of logic and automata theory, emphasizing the connections to other disciplines such as games, algorithms, and semigroup theory, as well as discussing current challenges in the field

OAPEN Library

On Indexing and Compressing Finite Automata

Author: Cotumaccio Nicola
Prezza Nicola
Publication venue
Publication date: 15/07/2020
Field of study

An index for a finite automaton is a powerful data structure that supports locating paths labeled with a query pattern, thus solving pattern matching on the underlying regular language. In this paper, we solve the long-standing problem of indexing arbitrary finite automata. Our solution consists in finding a partial co-lexicographic order of the states and proving, as in the total order case, that states reached by a given string form one interval on the partial order, thus enabling indexing. We provide a lower bound stating that such an interval requires

O(p)

words to be represented,

p

being the order's width (i.e. the size of its largest antichain). Indeed, we show that

p

determines the complexity of several fundamental problems on finite automata: (i) Letting

\sigma

be the alphabet size, we provide an encoding for NFAs using

\lceil\log \sigma\rceil + 2\lceil\log p\rceil + 2

bits per transition and a smaller encoding for DFAs using

\lceil\log \sigma\rceil + \lceil\log p\rceil + 2

bits per transition. This is achieved by generalizing the Burrows-Wheeler transform to arbitrary automata. (ii) We show that indexed pattern matching can be solved in

\tilde O(m\cdot p^2)

query time on NFAs. (iii) We provide a polynomial-time algorithm to index DFAs, while matching the optimal value for

p

. On the other hand, we prove that the problem is NP-hard on NFAs. (iv) We show that, in the worst case, the classic powerset construction algorithm for NFA determinization generates an equivalent DFA of size

2^p(n-p+1)-1

, where

n

is the number of NFA's states

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

An Abstract Machine for Unification Grammars

Author: Wintner Shuly
Publication venue
Publication date: 01/01/1997
Field of study

This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with well defined meaning. It enables, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of amalia results in a much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language.Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros fil

arXiv.org e-Print Archive

CiteSeerX

Recommended from our members

Proceedings of QG2010: The Third Workshop on Question Generation

Author: Boyer Kristy Elizabeth
Piwek Paul
Publication venue: questiongeneration.org
Publication date: 18/06/2010
Field of study

These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge". QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010)

Open Research Online (The Open University)

Order-Related Problems Parameterized by Width

Author: Arrighi Emmanuel
Publication venue: The University of Bergen
Publication date: 01/01/2022
Field of study

In the main body of this thesis, we study two different order theoretic problems. The first problem, called Completion of an Ordering, asks to extend a given finite partial order to a complete linear order while respecting some weight constraints. The second problem is an order reconfiguration problem under width constraints. While the Completion of an Ordering problem is NP-complete, we show that it lies in FPT when parameterized by the interval width of ρ. This ordering problem can be used to model several ordering problems stemming from diverse application areas, such as graph drawing, computational social choice, and computer memory management. Each application yields a special partial order ρ. We also relate the interval width of ρ to parameterizations for these problems that have been studied earlier in the context of these applications, sometimes improving on parameterized algorithms that have been developed for these parameterizations before. This approach also gives some practical sub-exponential time algorithms for ordering problems. In our second main result, we combine our parameterized approach with the paradigm of solution diversity. The idea of solution diversity is that instead of aiming at the development of algorithms that output a single optimal solution, the goal is to investigate algorithms that output a small set of sufficiently good solutions that are sufficiently diverse from one another. In this way, the user has the opportunity to choose the solution that is most appropriate to the context at hand. It also displays the richness of the solution space. There, we show that the considered diversity version of the Completion of an Ordering problem is fixed-parameter tractable with respect to natural paramaters that capture the notion of diversity and the notion of sufficiently good solutions. We apply this algorithm in the study of the Kemeny Rank Aggregation class of problems, a well-studied class of problems lying in the intersection of order theory and social choice theory. Up to this point, we have been looking at problems where the goal is to find an optimal solution or a diverse set of good solutions. In the last part, we shift our focus from finding solutions to studying the solution space of a problem. There we consider the following order reconfiguration problem: Given a graph G together with linear orders τ and τ ′ of the vertices of G, can one transform τ into τ ′ by a sequence of swaps of adjacent elements in such a way that at each time step the resulting linear order has cutwidth (pathwidth) at most w? We show that this problem always has an affirmative answer when the input linear orders τ and τ ′ have cutwidth (pathwidth) at most w/2. Using this result, we establish a connection between two apparently unrelated problems: the reachability problem for two-letter string rewriting systems and the graph isomorphism problem for graphs of bounded cutwidth. This opens an avenue for the study of the famous graph isomorphism problem using techniques from term rewriting theory. In addition to the main part of this work, we present results on two unrelated problems, namely on the Steiner Tree problem and on the Intersection Non-emptiness problem from automata theory.Doktorgradsavhandlin

University of Bergen

NORA - Norwegian Open Research Archives