3 research outputs found

    String Comparison in VV-Order: New Lexicographic Properties & On-line Applications

    Get PDF
    VV-order is a global order on strings related to Unique Maximal Factorization Families (UMFFs), which are themselves generalizations of Lyndon words. VV-order has recently been proposed as an alternative to lexicographical order in the computation of suffix arrays and in the suffix-sorting induced by the Burrows-Wheeler transform. Efficient VV-ordering of strings thus becomes a matter of considerable interest. In this paper we present new and surprising results on VV-order in strings, then go on to explore the algorithmic consequences

    Combinatorics of unique maximal factorization families (UMFFs)

    Get PDF
    Suppose a set W of strings contains exactly one rotation (cyclic shift) of every primitive string on some alphabet Σ. Then W is a circ-UMFF if and only if every word in Σ+ has a unique maximal factorization over W. The classic circ-UMFF is the set of Lyndon words based on lexicographic ordering (1958). Duval (1983) designed a linear sequential Lyndon factorization algorithm; a corresponding PRAM parallel algorithm was described by J. Daykin, Iliopoulos and Smyth (1994). Daykin and Daykin defined new circ-UMFFs based on various methods for totally ordering sets of strings (2003), and further described the structure of all circ-UMFFs (2008). Here we prove new combinatorial results for circ-UMFFs, and in particular for the case of Lyndon words. We introduce Acrobat and Flight Deck circ-UMFFs, and describe some of our results in terms of dictionaries. Applications of circ-UMFFs pertain to structured methods for concatenating and factoring strings over ordered alphabets, and those of Lyndon words are wide ranging and multidisciplinary

    Combinatorics of unique maximal factorization families (UMFFs)

    Get PDF
    Suppose a set W of strings contains exactly one rotation (cyclic shift) of every primitive string on some alphabet Σ. Then W is a circ-UMFF if and only if every word in Σ+^+ has a unique maximal factorization over W. The classic circ-UMFF is the set of Lyndon words based on lexicographic ordering (1958). Duval (1983) designed a linear sequential Lyndon factorization algorithm; a corresponding PRAMparallel algorithmwas described by J. Daykin, Iliopoulos and Smyth (1994). Daykin and Daykin defined new circ-UMFFs based on various methods for totally ordering sets of strings (2003), and further described the structure of all circ-UMFFs (2008). Here we prove new combinatorial results for circ-UMFFs, and in particular for the case of Lyndon words. We introduce Acrobat and Flight Deck circ-UMFFs, and describe some of our results in terms of dictionaries. Applications of circ-UMFFs pertain to structured methods for concatenating and factoring strings over ordered alphabets, and those of Lyndon words are wide ranging and multidisciplinary
    corecore