14 research outputs found
Algorithms to Compute the Lyndon Array
We first describe three algorithms for computing the Lyndon array that have
been suggested in the literature, but for which no structured exposition has
been given. Two of these algorithms execute in quadratic time in the worst
case, the third achieves linear time, but at the expense of prior computation
of both the suffix array and the inverse suffix array of x. We then go on to
describe two variants of a new algorithm that avoids prior computation of
global data structures and executes in worst-case n log n time. Experimental
evidence suggests that all but one of these five algorithms require only linear
execution time in practice, with the two new algorithms faster by a small
factor. We conjecture that there exists a fast and worst-case linear-time
algorithm to compute the Lyndon array that is also elementary (making no use of
global data structures such as the suffix array)
Lyndon Array Construction during Burrows-Wheeler Inversion
In this paper we present an algorithm to compute the Lyndon array of a string
of length as a byproduct of the inversion of the Burrows-Wheeler
transform of . Our algorithm runs in linear time using only a stack in
addition to the data structures used for Burrows-Wheeler inversion. We compare
our algorithm with two other linear-time algorithms for Lyndon array
construction and show that computing the Burrows-Wheeler transform and then
constructing the Lyndon array is competitive compared to the known approaches.
We also propose a new balanced parenthesis representation for the Lyndon array
that uses bits of space and supports constant time access. This
representation can be built in linear time using words of space, or in
time using asymptotically the same space as
Longest Lyndon Substring After Edit
The longest Lyndon substring of a string T is the longest substring of T which is a Lyndon word. LLS(T) denotes the length of the longest Lyndon substring of a string T. In this paper, we consider computing LLS(T\u27) where T\u27 is an edited string formed from T. After O(n) time and space preprocessing, our algorithm returns LLS(T\u27) in O(log n) time for any single character edit. We also consider a version of the problem with block edits, i.e., a substring of T is replaced by a given string of length l. After O(n) time and space preprocessing, our algorithm returns LLS(T\u27) in O(l log sigma + log n) time for any block edit where sigma is the number of distinct characters in T. We can modify our algorithm so as to output all the longest Lyndon substrings of T\u27 for both problems
Inducing the Lyndon Array
In this paper we propose a variant of the induced suffix sorting algorithm by Nong (TOIS, 2013) that computes simultaneously the Lyndon array and the suffix array of a text in O(n) time using O(n) words of working space, where n is the length of the text and is the alphabet size. Our result improves the previous best space requirement for linear time computation of the Lyndon array. In fact, all the known linear algorithms for Lyndon array computation use suffix sorting as a preprocessing step and use O(n) words of working space in addition to the Lyndon array and suffix array. Experimental results with real and synthetic datasets show that our algorithm is not only space-efficient but also fast in practice
Lyndon Arrays Simplified
A Lyndon word is a string that is lexicographically smaller than all of its proper suffixes (e.g., "airbus" is a Lyndon word; "amtrak" is not a Lyndon word because its suffix "ak" is lexicographically smaller than "amtrak"). The Lyndon array (sometimes called Lyndon table) identifies the longest Lyndon prefix of each suffix of a string. It is well known that the Lyndon array of a length-n string can be computed in O(n) time. However, most of the existing algorithms require the suffix array, which has theoretical and practical disadvantages. The only known algorithms that compute the Lyndon array in O(n) time without the suffix array (or similar data structures) do so in a particularly space efficient way (Bille et al., ICALP 2020), or in an online manner (Badkobeh et al., CPM 2022). Due to the additional goals of space efficiency and online computation, these algorithms are complicated in technical detail. Using the main ideas of the aforementioned algorithms, we provide a simpler and easier to understand algorithm that computes the Lyndon array in O(n) time
Linear Time Runs Over General Ordered Alphabets
A run in a string is a maximal periodic substring. For example, the string
contains the runs
and . There are less than runs in any
length- string, and computing all runs for a string over a linearly-sortable
alphabet takes time (Bannai et al., SODA 2015). Kosolobov
conjectured that there also exists a linear time runs algorithm for general
ordered alphabets (Inf. Process. Lett. 2016). The conjecture was almost proven
by Crochemore et al., who presented an time algorithm
(where is the extremely slowly growing inverse Ackermann function).
We show how to achieve time by exploiting combinatorial
properties of the Lyndon array, thus proving Kosolobov's conjecture.Comment: This work has been submitted to ICALP 202