Quasiperiodicities in Fibonacci strings
We consider the problem of finding quasiperiodicities in a Fibonacci string.
A factor u of a string y is a cover of y if every letter of y falls within some
occurrence of u in y. A string v is a seed of y if it is a cover of a
superstring of y. A left seed of a string y is a prefix of y that is a cover
of a superstring of y. Similarly, a right seed of a string y is a suffix of y
that is a cover of a superstring of y. In this paper, we present some
interesting results regarding quasiperiodicities in Fibonacci strings: we
identify all covers, left/right seeds and seeds of a Fibonacci string and all
covers of a circular Fibonacci string.
Comment: In Local Proceedings of "The 38th International Conference on Current
Trends in Theory and Practice of Computer Science" (SOFSEM 2012)
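The cover definition above can be checked directly on small inputs. The following is a minimal brute-force sketch, not the paper's method; the function names and the convention f0 = "b", f1 = "a", fn = f(n-1) f(n-2) are assumptions made for illustration.

```python
def fibonacci_string(n):
    """n-th Fibonacci string under the assumed convention f0 = "b", f1 = "a"."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev  # f(k+1) = f(k) f(k-1)
    return cur


def is_cover(u, y):
    """True if every position of y lies inside some occurrence of u in y."""
    if not u or len(u) > len(y):
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    start = y.find(u)
    while start != -1:
        if start > covered_up_to:
            return False  # an uncovered gap appears before this occurrence
        covered_up_to = start + len(u)
        start = y.find(u, start + 1)  # overlapping occurrences are allowed
    return covered_up_to == len(y)


def covers(y):
    """All covers of y; a cover must occur at position 0, so prefixes suffice."""
    return [y[:k] for k in range(1, len(y) + 1) if is_cover(y[:k], y)]
```

Under this convention, `fibonacci_string(5)` is `abaababa`, and `covers` reports `aba` and the string itself.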
Identifying all abelian periods of a string in quadratic time and relevant problems
Abelian periodicity of strings has been studied extensively in recent
years. In 2006 Constantinescu and Ilie defined the abelian period of a string,
and several algorithms for the computation of all abelian periods of a string
have since been given. In contrast to the classical period of a word, its abelian
version is more flexible: factors of the word are considered the same under any
internal permutation of their letters. We show two O(|y|^2) algorithms for the
computation of all abelian periods of a string y. The first one maps each
letter to a suitable number such that each factor of the string can be
identified by the unique sum of the numbers corresponding to its letters and
hence abelian periods can be identified easily. The other one maps each letter
to a prime number such that each factor of the string can be identified by the
unique product of the numbers corresponding to its letters and so abelian
periods can be identified easily. We also define weak abelian periods on
strings and give an O(|y|log(|y|)) algorithm for their computation, together
with some other algorithms for more basic problems.
Comment: Accepted in the "International Journal of Foundations of Computer
Science"
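The prime-product idea in this abstract can be illustrated as follows. This is a hedged sketch rather than the authors' algorithm: it assumes one common formulation of an abelian period (h, p), namely a head of length h < p, at least one full block of length p, full blocks with equal letter counts, and a head and tail whose letter counts are contained in those of a block. With letters mapped to distinct primes, equal letter counts mean equal products, and containment means divisibility. All names are hypothetical, and Python's big integers mean the arithmetic is not constant-time.

```python
def first_primes(count):
    """First `count` primes by trial division (fine for small alphabets)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % q for q in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def has_abelian_period(y, h, p):
    """Check the assumed (h, p) abelian-period definition via prime products."""
    n = len(y)
    if not (0 <= h < p and h + p <= n):
        return False
    letters = sorted(set(y))
    prime_of = dict(zip(letters, first_primes(len(letters))))
    prods = [1]  # prods[i] = product of the primes of y[:i]
    for ch in y:
        prods.append(prods[-1] * prime_of[ch])

    def code(i, j):
        return prods[j] // prods[i]  # product identifying the factor y[i:j]

    block = code(h, h + p)
    i = h + p
    while i + p <= n:
        if code(i, i + p) != block:
            return False  # two full blocks differ in letter counts
        i += p
    # Head y[:h] and tail y[i:] must be "contained" in a block: divisibility.
    return block % code(0, h) == 0 and block % code(i, n) == 0
```

For example, `abaababa` splits as `a | ba | ab | ab | a`, so (1, 2) is accepted, while (0, 2) is rejected because `ab` and `aa` have different letter counts.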
Truly Subquadratic-Time Extension Queries and Periodicity Detection in Strings with Uncertainties
Strings with don't care symbols, also called partial words, and more general indeterminate strings are a natural representation of strings containing uncertain symbols. A considerable effort has been made to obtain efficient algorithms for pattern matching and periodicity detection in such strings. Among those, a number of algorithms have been proposed that behave well on random data, but still their worst-case running time is Theta(n^2). We present the first truly subquadratic-time solutions for a number of such problems on partial words that can also be adapted to indeterminate strings over a constant-sized alphabet. We show that longest common compatible prefix queries (which correspond to longest common extension queries in regular strings) can be answered on-line in O(n * sqrt(n * log(n))) time after O(n * sqrt(n * log(n)))-time preprocessing. We also present O(n * sqrt(n * log(n)))-time algorithms for computing the prefix array and two types of border array of a partial word.
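To make the notion of a longest common compatible prefix concrete, here is a naive baseline sketch. It is the quadratic-style approach the abstract improves upon, not the subquadratic method itself. The choice of `?` as the don't care symbol and the function names are assumptions.

```python
HOLE = "?"  # assumed representation of the don't care symbol


def compatible(a, b):
    """Two symbols are compatible if they are equal or either is a hole."""
    return a == b or a == HOLE or b == HOLE


def lccp(w, i, j):
    """Length of the longest common compatible prefix of w[i:] and w[j:]."""
    k = 0
    while i + k < len(w) and j + k < len(w) and compatible(w[i + k], w[j + k]):
        k += 1
    return k


def prefix_array(w):
    """Naive prefix array: entry i compares w[i:] against w itself."""
    return [lccp(w, 0, i) for i in range(len(w))]
```

Each query here scans symbol by symbol, so computing the whole prefix array this way takes quadratic time in the worst case.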
Algorithms for Longest Common Abelian Factors
In this paper we consider the problem of computing the longest common abelian
factor (LCAF) between two given strings. We present a simple O(σ n^2)
time algorithm, where n is the length of the strings and σ is the
alphabet size, and a sub-quadratic running time solution for the binary string
case, both having linear space requirement. Furthermore, we present a modified
algorithm applying some interesting tricks and experimentally show that the
resulting algorithm runs faster.
Comment: 13 pages, 4 figures
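A straightforward way to compute a longest common abelian factor is to slide a window of each candidate length over both strings and compare letter-count (Parikh) vectors. The sketch below follows that idea under our own assumptions and is not claimed to be the authors' algorithm. With hashing of count vectors it does roughly quadratic work times the alphabet size, and it keeps only one length's vectors in memory at a time.

```python
def longest_common_abelian_factor(x, y):
    """Return (length, start_in_x, start_in_y) of a longest common abelian factor."""
    alphabet = sorted(set(x) | set(y))
    index = {c: k for k, c in enumerate(alphabet)}

    def parikh_windows(s, length):
        """Yield (start, letter-count tuple) for every window of this length."""
        counts = [0] * len(alphabet)
        for c in s[:length]:
            counts[index[c]] += 1
        yield 0, tuple(counts)
        for start in range(1, len(s) - length + 1):
            counts[index[s[start - 1]]] -= 1           # letter leaving the window
            counts[index[s[start + length - 1]]] += 1  # letter entering the window
            yield start, tuple(counts)

    # Try lengths from longest to shortest; the first hit is a longest factor.
    for length in range(min(len(x), len(y)), 0, -1):
        seen = {}
        for start, vec in parikh_windows(x, length):
            seen.setdefault(vec, start)  # remember the leftmost window in x
        for start, vec in parikh_windows(y, length):
            if vec in seen:
                return length, seen[vec], start
    return 0, 0, 0
```

For instance, `aabb` and `cabc` share the abelian factor `ab` of length 2, starting at index 1 in both strings.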
Linear Algorithm for Conservative Degenerate Pattern Matching
A degenerate symbol x* over an alphabet A is a non-empty subset of A, and a
sequence of such symbols is a degenerate string. A degenerate string is said to
be conservative if its number of non-solid symbols is upper-bounded by a fixed
positive constant k. We consider here the matching problem of conservative
degenerate strings and present the first linear-time algorithm that can find,
for given degenerate strings P* and T* of total length n containing k non-solid
symbols in total, the occurrences of P* in T* in O(nk) time.
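For orientation, degenerate matching can be stated as a naive check. This is a brute-force baseline, not the O(nk) algorithm of the paper. It assumes the usual convention that two degenerate symbols match when their letter sets intersect; the helper names are hypothetical.

```python
def degenerate(*symbols):
    """Build a degenerate string; each argument lists the letters of one symbol."""
    return [frozenset(s) for s in symbols]


def occurrences(pattern, text):
    """Start positions where every pattern symbol intersects the text symbol."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(pattern[j] & text[i + j] for j in range(m))
    ]
```

Here `degenerate("a", "bc", "a")` denotes a solid `a`, the non-solid symbol {b, c}, and another solid `a`.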
Faster algorithms for computing maximal multirepeats in multiple sequences
A repeat in a string is a substring that occurs more than once. A repeat is extendible if every occurrence of the repeat has an identical letter either on the left or on the right; otherwise, it is maximal. A multirepeat is a repeat that occurs at least m_min times (m_min ≥ 2) in each of at least q ≥ 1 strings in a given set of strings. In this paper, we describe a family of efficient algorithms based on suffix arrays to compute maximal multirepeats under various constraints. Our algorithms are faster, more flexible and much more space-efficient than algorithms recently proposed for this problem. The results extend recent work by two of the authors computing all maximal repeats in a single string.