47 research outputs found

    A New Class of Searchable and Provably Highly Compressible String Transformations

    Get PDF
    The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the domains of strings. However, efforts to find non-trivial alternatives of the original, now 25 years old, Burrows-Wheeler string transformation have met limited success. In this paper we bring new lymph to this area by introducing a whole new family of transformations that have all the "myriad virtues" of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear time pattern search directly on the transformed string. This new family is a special case of a more general class of transformations based on context adaptive alphabet orderings, a concept introduced here. This more general class includes also the Alternating BWT, another invertible string transforms recently introduced in connection with a generalization of Lyndon words

    Space Efficient Construction of Lyndon Arrays in Linear Time

    Get PDF

    Lyndon Array construction during Burrows-Wheeler inversion

    Get PDF
    In this paper we present an algorithm to compute the Lyndon array of a string T of length n as a byproduct of the inversion of the Burrows-Wheeler transform of T. Our algorithm runs in linear time using only a stack in addition to the data structures used for Burrows-Wheeler inversion. We compare our algorithm with two other linear-time algorithms for Lyndon array construction and show that computing the Burrows-Wheeler transform and then constructing the Lyndon array is competitive compared to the known approaches. We also propose a new balanced parenthesis representation for the Lyndon array that uses bits of space and supports constant time access. This representation can be built in linear time using words of space, or in time using asymptotically the same space as T

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 244, ESA 2022, Complete Volum
    corecore