5,320 research outputs found

    Regular Languages meet Prefix Sorting

    Full text link
    Indexing strings via prefix (or suffix) sorting is, arguably, one of the most successful algorithmic techniques developed in the last decades. Can indexing be extended to languages? The main contribution of this paper is to initiate the study of the sub-class of regular languages accepted by an automaton whose states can be prefix-sorted. Starting from the recent notion of Wheeler graph [Gagie et al., TCS 2017]-which extends naturally the concept of prefix sorting to labeled graphs-we investigate the properties of Wheeler languages, that is, regular languages admitting an accepting Wheeler finite automaton. Interestingly, we characterize this family as the natural extension of regular languages endowed with the co-lexicographic ordering: when sorted, the strings belonging to a Wheeler language are partitioned into a finite number of co-lexicographic intervals, each formed by elements from a single Myhill-Nerode equivalence class. Moreover: (i) We show that every Wheeler NFA (WNFA) with nn states admits an equivalent Wheeler DFA (WDFA) with at most 2n1Σ2n-1-|\Sigma| states that can be computed in O(n3)O(n^3) time. This is in sharp contrast with general NFAs. (ii) We describe a quadratic algorithm to prefix-sort a proper superset of the WDFAs, a O(nlogn)O(n\log n)-time online algorithm to sort acyclic WDFAs, and an optimal linear-time offline algorithm to sort general WDFAs. By contribution (i), our algorithms can also be used to index any WNFA at the moderate price of doubling the automaton's size. (iii) We provide a minimization theorem that characterizes the smallest WDFA recognizing the same language of any input WDFA. The corresponding constructive algorithm runs in optimal linear time in the acyclic case, and in O(nlogn)O(n\log n) time in the general case. (iv) We show how to compute the smallest WDFA equivalent to any acyclic DFA in nearly-optimal time.Comment: added minimization theorems; uploaded submitted version; New version with new results (W-MH theorem, linear determinization), added author: Giovanna D'Agostin

    JGraphT -- A Java library for graph data structures and algorithms

    Full text link
    Mathematical software and graph-theoretical algorithmic packages to efficiently model, analyze and query graphs are crucial in an era where large-scale spatial, societal and economic network data are abundantly available. One such package is JGraphT, a programming library which contains very efficient and generic graph data-structures along with a large collection of state-of-the-art algorithms. The library is written in Java with stability, interoperability and performance in mind. A distinctive feature of this library is the ability to model vertices and edges as arbitrary objects, thereby permitting natural representations of many common networks including transportation, social and biological networks. Besides classic graph algorithms such as shortest-paths and spanning-tree algorithms, the library contains numerous advanced algorithms: graph and subgraph isomorphism; matching and flow problems; approximation algorithms for NP-hard problems such as independent set and TSP; and several more exotic algorithms such as Berge graph detection. Due to its versatility and generic design, JGraphT is currently used in large-scale commercial, non-commercial and academic research projects. In this work we describe in detail the design and underlying structure of the library, and discuss its most important features and algorithms. A computational study is conducted to evaluate the performance of JGraphT versus a number of similar libraries. Experiments on a large number of graphs over a variety of popular algorithms show that JGraphT is highly competitive with other established libraries such as NetworkX or the BGL.Comment: Major Revisio

    VPSPACE and a transfer theorem over the complex field

    Get PDF
    We extend the transfer theorem of [KP2007] to the complex field. That is, we investigate the links between the class VPSPACE of families of polynomials and the Blum-Shub-Smale model of computation over C. Roughly speaking, a family of polynomials is in VPSPACE if its coefficients can be computed in polynomial space. Our main result is that if (uniform, constant-free) VPSPACE families can be evaluated efficiently then the class PAR of decision problems that can be solved in parallel polynomial time over the complex field collapses to P. As a result, one must first be able to show that there are VPSPACE families which are hard to evaluate in order to separate P from NP over C, or even from PAR.Comment: 14 page

    Space efficient merging of de Bruijn graphs and Wheeler graphs

    Full text link
    The merging of succinct data structures is a well established technique for the space efficient construction of large succinct indexes. In the first part of the paper we propose a new algorithm for merging succinct representations of de Bruijn graphs. Our algorithm has the same asymptotic cost of the state of the art algorithm for the same problem but it uses less than half of its working space. A novel important feature of our algorithm, not found in any of the existing tools, is that it can compute the Variable Order succinct representation of the union graph within the same asymptotic time/space bounds. In the second part of the paper we consider the more general problem of merging succinct representations of Wheeler graphs, a recently introduced graph family which includes as special cases de Bruijn graphs and many other known succinct indexes based on the BWT or one of its variants. We show that Wheeler graphs merging is in general a much more difficult problem, and we provide a space efficient algorithm for the slightly simplified problem of determining whether the union graph has an ordering that satisfies the Wheeler conditions.Comment: 24 pages, 10 figures. arXiv admin note: text overlap with arXiv:1902.0288

    Logic and Automata

    Get PDF
    Mathematical logic and automata theory are two scientific disciplines with a fundamentally close relationship. The authors of Logic and Automata take the occasion of the sixtieth birthday of Wolfgang Thomas to present a tour d'horizon of automata theory and logic. The twenty papers in this volume cover many different facets of logic and automata theory, emphasizing the connections to other disciplines such as games, algorithms, and semigroup theory, as well as discussing current challenges in the field

    On Indexing and Compressing Finite Automata

    Full text link
    An index for a finite automaton is a powerful data structure that supports locating paths labeled with a query pattern, thus solving pattern matching on the underlying regular language. In this paper, we solve the long-standing problem of indexing arbitrary finite automata. Our solution consists in finding a partial co-lexicographic order of the states and proving, as in the total order case, that states reached by a given string form one interval on the partial order, thus enabling indexing. We provide a lower bound stating that such an interval requires O(p)O(p) words to be represented, pp being the order's width (i.e. the size of its largest antichain). Indeed, we show that pp determines the complexity of several fundamental problems on finite automata: (i) Letting σ\sigma be the alphabet size, we provide an encoding for NFAs using logσ+2logp+2\lceil\log \sigma\rceil + 2\lceil\log p\rceil + 2 bits per transition and a smaller encoding for DFAs using logσ+logp+2\lceil\log \sigma\rceil + \lceil\log p\rceil + 2 bits per transition. This is achieved by generalizing the Burrows-Wheeler transform to arbitrary automata. (ii) We show that indexed pattern matching can be solved in O~(mp2)\tilde O(m\cdot p^2) query time on NFAs. (iii) We provide a polynomial-time algorithm to index DFAs, while matching the optimal value for p p . On the other hand, we prove that the problem is NP-hard on NFAs. (iv) We show that, in the worst case, the classic powerset construction algorithm for NFA determinization generates an equivalent DFA of size 2p(np+1)12^p(n-p+1)-1, where nn is the number of NFA's states

    An Abstract Machine for Unification Grammars

    Full text link
    This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with well defined meaning. It enables, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of amalia results in a much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language.Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros fil

    Order-Related Problems Parameterized by Width

    Get PDF
    In the main body of this thesis, we study two different order theoretic problems. The first problem, called Completion of an Ordering, asks to extend a given finite partial order to a complete linear order while respecting some weight constraints. The second problem is an order reconfiguration problem under width constraints. While the Completion of an Ordering problem is NP-complete, we show that it lies in FPT when parameterized by the interval width of ρ. This ordering problem can be used to model several ordering problems stemming from diverse application areas, such as graph drawing, computational social choice, and computer memory management. Each application yields a special partial order ρ. We also relate the interval width of ρ to parameterizations for these problems that have been studied earlier in the context of these applications, sometimes improving on parameterized algorithms that have been developed for these parameterizations before. This approach also gives some practical sub-exponential time algorithms for ordering problems. In our second main result, we combine our parameterized approach with the paradigm of solution diversity. The idea of solution diversity is that instead of aiming at the development of algorithms that output a single optimal solution, the goal is to investigate algorithms that output a small set of sufficiently good solutions that are sufficiently diverse from one another. In this way, the user has the opportunity to choose the solution that is most appropriate to the context at hand. It also displays the richness of the solution space. There, we show that the considered diversity version of the Completion of an Ordering problem is fixed-parameter tractable with respect to natural paramaters that capture the notion of diversity and the notion of sufficiently good solutions. We apply this algorithm in the study of the Kemeny Rank Aggregation class of problems, a well-studied class of problems lying in the intersection of order theory and social choice theory. Up to this point, we have been looking at problems where the goal is to find an optimal solution or a diverse set of good solutions. In the last part, we shift our focus from finding solutions to studying the solution space of a problem. There we consider the following order reconfiguration problem: Given a graph G together with linear orders τ and τ ′ of the vertices of G, can one transform τ into τ ′ by a sequence of swaps of adjacent elements in such a way that at each time step the resulting linear order has cutwidth (pathwidth) at most w? We show that this problem always has an affirmative answer when the input linear orders τ and τ ′ have cutwidth (pathwidth) at most w/2. Using this result, we establish a connection between two apparently unrelated problems: the reachability problem for two-letter string rewriting systems and the graph isomorphism problem for graphs of bounded cutwidth. This opens an avenue for the study of the famous graph isomorphism problem using techniques from term rewriting theory. In addition to the main part of this work, we present results on two unrelated problems, namely on the Steiner Tree problem and on the Intersection Non-emptiness problem from automata theory.Doktorgradsavhandlin
    corecore