469 research outputs found

Fast Parallel Lyndon Factorization With Applications

Author: Apostolico Alberto
Crochemore Maxime
Publication venue: 'Purdue University (bepress)'
Publication date: 15/11/1989

Sorting suffixes of a text via its Lyndon Factorization

Author: Mantaci Sabrina
Restivo Antonio
Rosone Giovanna
Sciortino Marinella
Publication date: 01/01/2013

The process of sorting the suffixes of a text plays a fundamental role in Text Algorithms. Sorted suffixes are used, for instance, in the construction of the Burrows-Wheeler transform and the suffix array, both widely used in several fields of Computer Science. For this reason, much recent research has been devoted to finding new strategies for performing this sorting effectively. In this paper we introduce a new methodology in which an important role is played by the Lyndon factorization, so that the local suffixes inside factors detected by this factorization keep their mutual order when extended to the suffixes of the whole word. This property suggests a versatile technique that can easily be adapted to different implementation scenarios. Comment: Submitted to the Prague Stringology Conference 2013 (PSC 2013).
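
The order-preservation property described in this abstract can be checked by brute force on small inputs. The sketch below is only an illustration, not the sorting method of the paper: it assumes the standard character order, computes the Lyndon factors with Duval's algorithm, and compares the order of the suffixes inside each factor with the order of the corresponding suffixes of the whole word. The function names are hypothetical.

```python
def lyndon_factor_bounds(s):
    """Duval's algorithm; returns half-open (start, end) index pairs of the Lyndon factors of s."""
    bounds = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend while s[i:j] is still a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            bounds.append((i, i + j - k))
            i += j - k
    return bounds


def local_order_is_preserved(s):
    """Check that, inside every Lyndon factor, local and global suffix orders agree."""
    for start, end in lyndon_factor_bounds(s):
        positions = range(start, end)
        # Order of the suffixes truncated at the end of the factor (local suffixes).
        local = sorted(positions, key=lambda p: s[p:end])
        # Order of the same positions when the suffixes run to the end of s.
        extended = sorted(positions, key=lambda p: s[p:])
        if local != extended:
            return False
    return True


print(lyndon_factor_bounds("banana"))      # [(0, 1), (1, 3), (3, 5), (5, 6)]
print(local_order_is_preserved("banana"))  # True
```

Each factor is sorted on its own slice here, which is quadratic in the worst case; the check is meant to make the property concrete, not to be efficient.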

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Palermo

Minimal Suffix and Rotation of a Substring in Optimal Time

Author: Kociumaka Tomasz
Publication date: 01/01/2016

For a text given in advance, the substring minimal suffix queries ask to determine the lexicographically minimal non-empty suffix of a substring specified by the location of its occurrence in the text. We develop a data structure answering such queries optimally: in constant time after linear-time preprocessing. This improves upon the results of Babenko et al. (CPM 2014), whose trade-off solution is characterized by a $\Theta(n\log n)$ product of these time complexities. Next, we extend our queries to support concatenations of $O(1)$ substrings, for which the construction and query time is preserved. We apply these generalized queries to compute lexicographically minimal and maximal rotations of a given substring in constant time after linear-time preprocessing. Our data structures mainly rely on properties of Lyndon words and Lyndon factorizations. We combine them with further algorithmic and combinatorial tools, such as fusion trees and the notion of order isomorphism of strings.
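
To make the two query types concrete, here is a naive baseline that answers them by trying every suffix or rotation. It is not the constant-time data structure of the paper; the function names and the half-open `[i, j)` indexing convention are assumptions made for this sketch, and each query costs quadratic time in the substring length.

```python
def minimal_suffix(text, i, j):
    """Lexicographically minimal non-empty suffix of text[i:j] (requires i < j)."""
    sub = text[i:j]
    return min(sub[k:] for k in range(len(sub)))


def minimal_rotation(text, i, j):
    """Lexicographically minimal rotation of text[i:j] (requires i < j)."""
    sub = text[i:j]
    return min(sub[k:] + sub[:k] for k in range(len(sub)))


print(minimal_suffix("banana", 1, 5))    # an
print(minimal_rotation("banana", 0, 6))  # abanan
```

A baseline like this is mainly useful as a reference implementation when testing a faster structure.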

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Evaluation of a Permutation-Based Evolutionary Framework for Lyndon Factorizations

Author: Clare Amanda
Daykin Jacqueline
Major Lily
Mora Benjamin
Peña Gamboa Leo
Zarges Christine
Publication venue: Springer Nature
Publication date: 01/01/2020

String factorization is an important tool for partitioning data for parallel processing and other algorithmic techniques often found in the context of big data applications such as bioinformatics or compression. Duval’s well-known algorithm uniquely factors a string over an ordered alphabet into Lyndon words, i.e., patterned strings which are strictly smaller than all of their cyclic rotations. While Duval’s algorithm produces a pre-determined factorization, modern applications motivate the demand for factorizations with specific properties, e.g., those that minimize the number of factors or consist of factors with similar lengths. In this paper, we consider the problem of finding an alphabet ordering that yields a Lyndon factorization with such properties. We introduce a flexible evolutionary framework and evaluate it on biological sequence data. For the minimization case, we also propose a new problem-specific heuristic, Flexi-Duval, and a problem-specific mutation operator for Lyndon factorization. Our results show that our framework is competitive with Flexi-Duval for minimization and yields high quality and robust solutions for balancing where no problem-specific algorithm is available.
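
A minimal sketch of Duval's algorithm helps show why the alphabet ordering matters. The `alphabet` parameter below is a hypothetical addition for illustration (a string listing the characters from smallest to largest); it is not the evolutionary framework or the Flexi-Duval heuristic of the paper, only the plain factorization run under a chosen order.

```python
def duval(s, alphabet=None):
    """Return the Lyndon factorization of s as a list of substrings.

    alphabet (hypothetical parameter): optional string listing the characters
    of s from smallest to largest; by default the usual character order is used.
    """
    if alphabet is None:
        t = list(s)
    else:
        rank = {c: r for r, c in enumerate(alphabet)}
        t = [rank[c] for c in s]
    factors = []
    i, n = 0, len(t)
    while i < n:
        j, k = i + 1, i
        # Extend while t[i:j] is still a prefix of a power of a Lyndon word.
        while j < n and t[k] <= t[j]:
            k = i if t[k] < t[j] else k + 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


print(duval("banana"))                  # ['b', 'an', 'an', 'a']
print(duval("banana", alphabet="ban"))  # ['banana']
```

Under the usual order the word splits into four factors, while the order b < a < n makes the whole word a single Lyndon word, which is the kind of effect the minimization problem exploits.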

Aberystwyth Research Portal

Cronfa at Swansea University

Fast Computation of Abelian Runs

Author: Fici Gabriele
Kociumaka Tomasz
Lecroq Thierry
Lefebvre Arnaud
Prieur-Gaston Elise
Publication venue: 'Elsevier BV'
Publication date: 22/12/2015

Given a word $w$ and a Parikh vector $\mathcal{P}$, an abelian run of period $\mathcal{P}$ in $w$ is a maximal occurrence of a substring of $w$ having abelian period $\mathcal{P}$. Our main result is an online algorithm that, given a word $w$ of length $n$ over an alphabet of cardinality $\sigma$ and a Parikh vector $\mathcal{P}$, returns all the abelian runs of period $\mathcal{P}$ in $w$ in time $O(n)$ and space $O(\sigma+p)$, where $p$ is the norm of $\mathcal{P}$, i.e., the sum of its components. We also present an online algorithm that computes all the abelian runs with periods of norm $p$ in $w$ in time $O(np)$, for any given norm $p$. Finally, we give an $O(n^2)$-time offline randomized algorithm for computing all the abelian runs of $w$. Its deterministic counterpart runs in $O(n^2\log\sigma)$ time. Comment: To appear in Theoretical Computer Science.

arXiv.org e-Print Archive

HAL - Normandie Université

Archivio istituzionale della ricerca - Università di Palermo

Longest Lyndon Substring After Edit

Author: Bannai Hideo
Inenaga Shunsuke
Nakashima Yuto
Takeda Masayuki
Urabe Yuki
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Annual Symposium on Combinatorial Pattern Matching (CPM 2018)
Publication date: 01/01/2018

The longest Lyndon substring of a string T is the longest substring of T which is a Lyndon word. LLS(T) denotes the length of the longest Lyndon substring of a string T. In this paper, we consider computing LLS(T') where T' is an edited string formed from T. After O(n) time and space preprocessing, our algorithm returns LLS(T') in O(log n) time for any single character edit. We also consider a version of the problem with block edits, i.e., a substring of T is replaced by a given string of length l. After O(n) time and space preprocessing, our algorithm returns LLS(T') in O(l log sigma + log n) time for any block edit where sigma is the number of distinct characters in T. We can modify our algorithm so as to output all the longest Lyndon substrings of T' for both problems.
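
The quantity LLS(T') can be illustrated with an exhaustive baseline that recomputes the answer from scratch after an edit. This is only a sketch for small inputs, not the O(log n) query algorithm of the paper; the function names are hypothetical and the standard character order is assumed.

```python
def is_lyndon(w):
    """A non-empty word is Lyndon iff it is strictly smaller than each proper non-empty suffix."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def lls(t):
    """Length of the longest Lyndon substring of t, by exhaustive search."""
    n = len(t)
    return max(
        (j - i for i in range(n) for j in range(i + 1, n + 1) if is_lyndon(t[i:j])),
        default=0,
    )


def lls_after_edit(t, pos, ch):
    """LLS of the string obtained by replacing t[pos] with the single character ch."""
    edited = t[:pos] + ch + t[pos + 1:]
    return lls(edited)


print(lls("banana"))                     # 2
print(lls_after_edit("banana", 0, "a"))  # 5
```

In "banana" the longest Lyndon substring is "an"; replacing the first character by "a" gives "aanana", whose longest Lyndon substring is "aanan". A single edit can therefore change the answer substantially, which is why recomputation from scratch is the thing to avoid.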

Dagstuhl Research Online Publication Server