String Comparison in V-Order: New Lexicographic Properties & On-line Applications
V-order is a global order on strings related to Unique Maximal
Factorization Families (UMFFs), which are themselves generalizations of Lyndon
words. V-order has recently been proposed as an alternative to
lexicographical order in the computation of suffix arrays and in the
suffix-sorting induced by the Burrows-Wheeler transform. Efficient V-ordering
of strings thus becomes a matter of considerable interest. In this paper we
present new and surprising results on V-order in strings, then go on to
explore the algorithmic consequences.
The Word Problem for Omega-Terms over the Trotter-Weil Hierarchy
For two given ω-terms α and β, the word problem for
ω-terms over a variety V asks whether α = β
in all monoids in V. We show that the
word problem for ω-terms over each level of the Trotter-Weil Hierarchy
is decidable. More precisely, for every fixed variety in the Trotter-Weil
Hierarchy, our approach yields an algorithm in nondeterministic logarithmic
space (NL). In addition, we provide deterministic polynomial time algorithms
which are more efficient than straightforward translations of the
NL-algorithms. As an application of our results, we show that separability by
the so-called corners of the Trotter-Weil Hierarchy is witnessed by
ω-terms (this property is also known as ω-reducibility). In
particular, the separation problem for the corners of the Trotter-Weil
Hierarchy is decidable.
Varieties of Languages in a Category
Eilenberg's variety theorem, a centerpiece of algebraic automata theory,
establishes a bijective correspondence between varieties of languages and
pseudovarieties of monoids. In the present paper this result is generalized to
an abstract pair of algebraic categories: we introduce varieties of languages
in a category C, and prove that they correspond to pseudovarieties of monoids
in a closed monoidal category D, provided that C and D are dual on the level of
finite objects. By suitable choices of these categories our result uniformly
covers Eilenberg's theorem and three variants due to Pin, Polak and Reutenauer,
respectively, and yields new Eilenberg-type correspondences.
Transitive Hall sets
We give the definition of Lazard and Hall sets in the context of transitive
factorizations of free monoids. The equivalence of the two properties is
proved. This allows us to build new effective bases of free partially commutative
Lie algebras. The commutation graphs for which such sets exist are completely
characterized, and we make explicit, in this context, the classical PBW rewriting
process.
Lightweight Lempel-Ziv Parsing
We introduce a new approach to LZ77 factorization that uses O(n/d) words of
working space and O(dn) time for any d >= 1 (for polylogarithmic alphabet
sizes). We also describe carefully engineered implementations of alternative
approaches to lightweight LZ77 factorization. Extensive experiments show that
the new algorithm is superior in most cases, particularly at the lowest memory
levels and for highly repetitive data. As a part of the algorithm, we describe
new methods for computing matching statistics which may be of independent
interest.
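For readers unfamiliar with the term, the greedy LZ77 factorization itself can be illustrated with a deliberately naive sketch. This is not the lightweight algorithm of the abstract above: it scans every earlier position for each factor and so takes far more than O(dn) time. The function name `lz77_factorize` and the output encoding (a literal character, or a `(position, length)` pair) are assumptions made for illustration only.

```python
def lz77_factorize(x: str) -> list:
    """Naive greedy LZ77 factorization (illustrative baseline only).

    Each factor is either a single character not seen before, or a pair
    (pos, length) naming the longest match that starts at an earlier position.
    Matches may overlap the current position (self-referential copies).
    """
    factors = []
    i, n = 0, len(x)
    while i < n:
        best_len, best_pos = 0, -1
        for j in range(i):  # try every earlier starting position
            length = 0
            while i + length < n and x[j + length] == x[i + length]:
                length += 1
            if length > best_len:
                best_len, best_pos = length, j
        if best_len == 0:
            factors.append(x[i])  # new character: literal factor
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors
```

On highly repetitive input such as "abababab" the parse collapses to two literals and one long self-overlapping copy, which is the kind of data the abstract singles out.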
Algorithms to Compute the Lyndon Array
We first describe three algorithms for computing the Lyndon array that have
been suggested in the literature, but for which no structured exposition has
been given. Two of these algorithms execute in quadratic time in the worst
case, the third achieves linear time, but at the expense of prior computation
of both the suffix array and the inverse suffix array of x. We then go on to
describe two variants of a new algorithm that avoids prior computation of
global data structures and executes in worst-case n log n time. Experimental
evidence suggests that all but one of these five algorithms require only linear
execution time in practice, with the two new algorithms faster by a small
factor. We conjecture that there exists a fast and worst-case linear-time
algorithm to compute the Lyndon array that is also elementary (making no use of
global data structures such as the suffix array).
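The approach based on the suffix array and inverse suffix array can be sketched as follows, assuming the standard characterization that the longest Lyndon word starting at position i ends just before the next lexicographically smaller suffix. The sketch sorts suffixes naively, so it is not linear overall; only the stack-based next-smaller-value pass is linear. The name `lyndon_array` is an assumption for illustration, not taken from the paper.

```python
def lyndon_array(x: str) -> list[int]:
    """Lyndon array via next-smaller-value on the inverse suffix array.

    lam[i] is the length of the longest Lyndon word starting at position i.
    """
    n = len(x)
    # Naive suffix sorting for clarity; a linear-time suffix array
    # construction would be substituted to obtain an O(n) algorithm.
    sa = sorted(range(n), key=lambda i: x[i:])
    isa = [0] * n
    for rank, i in enumerate(sa):
        isa[i] = rank

    lam = [0] * n
    stack = []  # positions still waiting for a smaller suffix to their right
    for j in range(n):
        while stack and isa[stack[-1]] > isa[j]:
            i = stack.pop()
            lam[i] = j - i  # suffix j is the next smaller suffix after i
        stack.append(j)
    for i in stack:
        lam[i] = n - i  # no smaller suffix to the right
    return lam
```

For example, `lyndon_array("banana")` gives `[1, 2, 1, 2, 1, 1]`: the longest Lyndon words starting at positions 1 and 3 are both "an".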
Factorizing a String into Squares in Linear Time
A square factorization of a string w is a factorization of w in which each factor is a square. Dumitran et al. [SPIRE 2015, pp. 54-66] showed how to find a square factorization of a given string of length n in O(n log n) time, and they asked whether it can be done in O(n) time. In this paper, we answer their question positively, showing an O(n)-time algorithm for square factorization in the standard word RAM model with machine word size ω = Ω(log n). We also show an O(n + (n log^2 n) / ω)-time (respectively, O(n log n)-time) algorithm to find a square factorization which contains the maximum (respectively, minimum) number of squares.
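To make the definition concrete, here is a minimal dynamic-programming sketch that finds some square factorization when one exists. It is a brute-force illustration running in roughly cubic time, not the O(n) algorithm of the abstract; the function name `square_factorization` and the choice to return `None` when no factorization exists are assumptions.

```python
def square_factorization(w: str):
    """Return a list of squares whose concatenation is w, or None.

    prev[i] stores the start of the last square in some square
    factorization of the prefix w[:i]; None means no factorization exists.
    """
    n = len(w)
    prev = [None] * (n + 1)
    prev[0] = 0  # the empty prefix is trivially factorizable
    for i in range(2, n + 1):
        for k in range(1, i // 2 + 1):  # k = half-length of the last square
            s = i - 2 * k
            if prev[s] is not None and w[s:s + k] == w[s + k:i]:
                prev[i] = s
                break
    if prev[n] is None:
        return None
    factors = []
    i = n
    while i > 0:  # walk back through the recorded square boundaries
        s = prev[i]
        factors.append(w[s:i])
        i = s
    return factors[::-1]
```

For instance, "aabbabab" splits as "aa", "bb", "abab", whereas "aba" has no square factorization.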