    String Comparison in V-Order: New Lexicographic Properties & On-line Applications

    V-order is a global order on strings related to Unique Maximal Factorization Families (UMFFs), which are themselves generalizations of Lyndon words. V-order has recently been proposed as an alternative to lexicographical order in the computation of suffix arrays and in the suffix-sorting induced by the Burrows-Wheeler transform. Efficient V-ordering of strings thus becomes a matter of considerable interest. In this paper we present new and surprising results on V-order in strings, then go on to explore the algorithmic consequences.

    The Word Problem for Omega-Terms over the Trotter-Weil Hierarchy

    For two given ω-terms α and β, the word problem for ω-terms over a variety V asks whether α = β in all monoids in V. We show that the word problem for ω-terms over each level of the Trotter-Weil Hierarchy is decidable. More precisely, for every fixed variety in the Trotter-Weil Hierarchy, our approach yields an algorithm in nondeterministic logarithmic space (NL). In addition, we provide deterministic polynomial time algorithms which are more efficient than straightforward translations of the NL-algorithms. As an application of our results, we show that separability by the so-called corners of the Trotter-Weil Hierarchy is witnessed by ω-terms (this property is also known as ω-reducibility). In particular, the separation problem for the corners of the Trotter-Weil Hierarchy is decidable.

    Varieties of Languages in a Category

    Eilenberg's variety theorem, a centerpiece of algebraic automata theory, establishes a bijective correspondence between varieties of languages and pseudovarieties of monoids. In the present paper this result is generalized to an abstract pair of algebraic categories: we introduce varieties of languages in a category C, and prove that they correspond to pseudovarieties of monoids in a closed monoidal category D, provided that C and D are dual on the level of finite objects. By suitable choices of these categories our result uniformly covers Eilenberg's theorem and three variants due to Pin, Polák and Reutenauer, respectively, and yields new Eilenberg-type correspondences.

    Transitive Hall sets

    We give the definition of Lazard and Hall sets in the context of transitive factorizations of free monoids. The equivalence of the two properties is proved. This allows us to build new effective bases of free partially commutative Lie algebras. The commutation graphs for which such sets exist are completely characterized, and we make explicit, in this context, the classical PBW rewriting process.

    Lightweight Lempel-Ziv Parsing

    We introduce a new approach to LZ77 factorization that uses O(n/d) words of working space and O(dn) time for any d >= 1 (for polylogarithmic alphabet sizes). We also describe carefully engineered implementations of alternative approaches to lightweight LZ77 factorization. Extensive experiments show that the new algorithm is superior in most cases, particularly at the lowest memory levels and for highly repetitive data. As a part of the algorithm, we describe new methods for computing matching statistics which may be of independent interest.
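
For orientation, the object being computed can be sketched with a naive quadratic-time routine. This is not the lightweight algorithm of the abstract, only an illustration of what an LZ77 factorization is, assuming the common definition in which each factor is either a letter not seen before or the longest prefix of the remaining text that also starts at an earlier position (overlaps allowed). The function name `lz77_factors` is hypothetical.

```python
def lz77_factors(text):
    """Naive LZ77 factorization (illustrative only, roughly quadratic time)."""
    factors = []
    i = 0
    n = len(text)
    while i < n:
        best_len = 0
        # Try every earlier start j < i; the match may overlap position i.
        for j in range(i):
            k = 0
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            best_len = max(best_len, k)
        # A letter with no earlier occurrence forms a factor of length 1.
        step = max(best_len, 1)
        factors.append(text[i:i + step])
        i += step
    return factors
```

Under this definition, `lz77_factors("abababb")` yields `['a', 'b', 'abab', 'b']`; the third factor copies from position 0 and overlaps itself.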

    Algorithms to Compute the Lyndon Array

    We first describe three algorithms for computing the Lyndon array that have been suggested in the literature, but for which no structured exposition has been given. Two of these algorithms execute in quadratic time in the worst case; the third achieves linear time, but at the expense of prior computation of both the suffix array and the inverse suffix array of x. We then go on to describe two variants of a new algorithm that avoids prior computation of global data structures and executes in worst-case O(n log n) time. Experimental evidence suggests that all but one of these five algorithms require only linear execution time in practice, with the two new algorithms faster by a small factor. We conjecture that there exists a fast and worst-case linear-time algorithm to compute the Lyndon array that is also elementary (making no use of global data structures such as the suffix array).
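
As a point of reference, the Lyndon array can be computed by a simple quadratic-time sketch. It assumes the usual definition (entry i is the length of the longest Lyndon word starting at position i) and uses the first step of Duval's factorization on each suffix. It is not claimed to match any of the five algorithms discussed in the abstract; the function name `lyndon_array` is hypothetical.

```python
def lyndon_array(x):
    """Lyndon array via Duval's first step on every suffix (worst-case quadratic)."""
    n = len(x)
    result = []
    for i in range(n):
        # Duval's scan: k trails j while x[i:j] stays a prefix of a power
        # of a Lyndon word; the scan stops at the first descent.
        k, j = i, i + 1
        while j < n and x[k] <= x[j]:
            if x[k] < x[j]:
                k = i          # x[i:j+1] is itself a Lyndon word
            else:
                k += 1         # still inside a repetition of the period
            j += 1
        # j - k is the period, i.e. the length of the longest Lyndon prefix of x[i:].
        result.append(j - k)
    return result
```

For example, `lyndon_array("abaab")` gives `[2, 1, 3, 2, 1]`: the longest Lyndon words starting at positions 0 and 2 are `ab` and `aab`.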

    Factorizing a String into Squares in Linear Time

    A square factorization of a string w is a factorization of w in which each factor is a square. Dumitran et al. [SPIRE 2015, pp. 54-66] showed how to find a square factorization of a given string of length n in O(n log n) time, and they posed the question of whether it can be done in O(n) time. In this paper, we answer their question positively, showing an O(n)-time algorithm for square factorization in the standard word RAM model with machine word size ω = Ω(log n). We also show an O(n + (n log^2 n) / ω)-time (respectively, O(n log n)-time) algorithm to find a square factorization which contains the maximum (respectively, minimum) number of squares.
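
The definition can be made concrete with a brute-force dynamic program. This sketch is far slower than the O(n log n) and O(n) algorithms mentioned above (it is cubic in the worst case) and is only meant to show what a square factorization is; the function name `square_factorization` is hypothetical.

```python
def square_factorization(w):
    """Return one factorization of w into squares, or None if none exists."""
    n = len(w)
    # reachable[i]: the prefix w[:i] can be written as a product of squares.
    reachable = [False] * (n + 1)
    reachable[0] = True
    # prev[i]: start index of the last square used to reach position i.
    prev = [-1] * (n + 1)
    for i in range(2, n + 1):
        for h in range(1, i // 2 + 1):
            start = i - 2 * h
            # w[start:i] is a square when its two halves of length h agree.
            if reachable[start] and w[start:start + h] == w[start + h:i]:
                reachable[i] = True
                prev[i] = start
                break
    if not reachable[n]:
        return None
    # Walk the back-pointers to recover the factors.
    factors = []
    i = n
    while i > 0:
        factors.append(w[prev[i]:i])
        i = prev[i]
    return factors[::-1]
```

For example, `square_factorization("aabbabab")` returns `['aa', 'bb', 'abab']`, while `square_factorization("aba")` returns `None`.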