An output-sensitive algorithm for the minimization of 2-dimensional String Covers
String covers are a powerful tool for analyzing the quasi-periodicity of
1-dimensional data and find applications in automata theory, computational
biology, coding and the analysis of transactional data. A \emph{cover} of a
string T is a string C for which every letter of T lies within some
occurrence of C. String covers have been generalized in many ways, leading to
\emph{k-covers}, \emph{$\lambda$-covers} and \emph{approximate covers}, and have
been studied in different contexts such as \emph{indeterminate strings}.
In this paper we generalize string covers to the context of 2-dimensional
data, such as images. We show how they can be used for the extraction of
textures from images and identification of primitive cells in lattice data.
This has interesting applications in image compression, procedural terrain
generation and crystallography.
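The 1-dimensional cover definition in this abstract can be illustrated with a minimal sketch. The function names `is_cover` and `shortest_cover` are hypothetical, and the brute-force search is only an illustration of the definition, not the output-sensitive 2-dimensional algorithm of the paper.

```python
def is_cover(c: str, t: str) -> bool:
    """Return True if every letter of t lies within some occurrence of c."""
    if not c or len(c) > len(t):
        return False
    covered_up_to = 0  # first position of t not yet known to be covered
    start = t.find(c)
    while start != -1:
        if start > covered_up_to:
            return False  # a gap between consecutive occurrences
        covered_up_to = start + len(c)
        start = t.find(c, start + 1)  # allow overlapping occurrences
    return covered_up_to == len(t)


def shortest_cover(t: str) -> str:
    """Brute force: a cover must be a prefix of t, so try prefixes by length."""
    for k in range(1, len(t) + 1):
        if is_cover(t[:k], t):
            return t[:k]
    return t  # only reached for the empty string
```

For instance, `shortest_cover("ababaaba")` returns `"aba"`, whose occurrences at positions 0, 2 and 5 together cover the whole string.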
Identifying all abelian periods of a string in quadratic time and relevant problems
Abelian periodicity of strings has been studied extensively in recent
years. In 2006 Constantinescu and Ilie defined the abelian period of a string,
and several algorithms for the computation of all abelian periods of a string
have since been given. In contrast to the classical period of a word, its abelian
version is more flexible: factors of the word are considered the same under any
internal permutation of their letters. We show two O(|y|^2) algorithms for the
computation of all abelian periods of a string y. The first one maps each
letter to a suitable number such that each factor of the string can be
identified by the unique sum of the numbers corresponding to its letters and
hence abelian periods can be identified easily. The other one maps each letter
to a prime number such that each factor of the string can be identified by the
unique product of the numbers corresponding to its letters and so abelian
periods can be identified easily. We also define weak abelian periods on
strings and give an O(|y| log |y|) algorithm for their computation, together
with some other algorithms for more basic problems.
Comment: Accepted in the "International Journal of Foundations of Computer
Science"
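The prime-product idea described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, and `block_abelian_periods` handles only the restricted case with an empty head and tail (the string splits into consecutive blocks of length p that are permutations of each other). It is not the authors' O(|y|^2) algorithm.

```python
def first_primes(k: int) -> list[int]:
    """Return the first k primes by trial division (fine for small alphabets)."""
    primes: list[int] = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def prime_prefix_products(y: str) -> list[int]:
    """prefix[i] is the product of the primes assigned to the letters of y[:i]."""
    alphabet = sorted(set(y))
    code = dict(zip(alphabet, first_primes(len(alphabet))))
    prefix = [1]
    for ch in y:
        prefix.append(prefix[-1] * code[ch])
    return prefix


def factor_fingerprint(prefix: list[int], i: int, j: int) -> int:
    """Product for the factor y[i:j]; equal for factors that are permutations."""
    return prefix[j] // prefix[i]


def block_abelian_periods(y: str) -> list[int]:
    """Block lengths p dividing |y| whose consecutive blocks are all
    abelian-equivalent (restricted case: no head, no tail)."""
    n = len(y)
    prefix = prime_prefix_products(y)
    periods = []
    for p in range(1, n + 1):
        if n % p:
            continue
        first = factor_fingerprint(prefix, 0, p)
        if all(factor_fingerprint(prefix, i, i + p) == first
               for i in range(p, n, p)):
            periods.append(p)
    return periods
```

By unique factorization, two factors have the same product exactly when they contain the same multiset of letters. Python integers are arbitrary precision, so the products stay exact, although their size grows with the length of the string.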
Efficient Seeds Computation Revisited
The notion of the cover is a generalization of a period of a string, and
there are linear time algorithms for finding the shortest cover. The seed is a
more complicated generalization of periodicity: it is a cover of a superstring
of a given string, and the shortest seed problem is of much higher algorithmic
difficulty. The problem is not well understood, and no linear-time algorithm is
known. In this paper we give linear-time algorithms for some of its versions:
computing shortest left-seed array, longest left-seed array and checking for
seeds of a given length. The algorithm for the last problem is used to compute
the seed array of a string (i.e., the shortest seeds for all the prefixes of
the string) in O(n^2) time. We also describe a simpler alternative algorithm
that efficiently computes the shortest seeds. As a by-product we obtain an
O(n log(n/m)) time algorithm checking if the shortest seed has length at
least m and finding the corresponding seed. We also correct some important
details missing in the previously known shortest-seed algorithm (Iliopoulos et
al., 1996).
Comment: 14 pages, accepted to CPM 201
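To make the seed definition in this abstract concrete, here is a brute-force sketch. The names `is_seed` and `shortest_seed` are hypothetical, candidates are assumed to be factors of the string, and the running time is far from linear. It is a definition check, not one of the algorithms from the paper.

```python
def is_seed(s: str, t: str) -> bool:
    """Return True if s covers some superstring of t (assumes len(s) <= len(t)).

    Positions of t may be covered by full occurrences of s inside t, by a
    proper suffix of s overhanging the left end, or by a proper prefix of s
    overhanging the right end.
    """
    m, n = len(s), len(t)
    if m == 0 or m > n:
        return False
    covered = [False] * n
    # Full occurrences of s inside t (overlaps allowed).
    i = t.find(s)
    while i != -1:
        for k in range(i, i + m):
            covered[k] = True
        i = t.find(s, i + 1)
    # Longest proper suffix of s that is a prefix of t (left overhang).
    for j in range(m - 1, 0, -1):
        if t.startswith(s[m - j:]):
            for k in range(j):
                covered[k] = True
            break
    # Longest proper prefix of s that is a suffix of t (right overhang).
    for j in range(m - 1, 0, -1):
        if t.endswith(s[:j]):
            for k in range(n - j, n):
                covered[k] = True
            break
    return all(covered)


def shortest_seed(t: str) -> str:
    """Try all factors of t in order of increasing length."""
    n = len(t)
    for length in range(1, n + 1):
        for start in range(n - length + 1):
            if is_seed(t[start:start + length], t):
                return t[start:start + length]
    return t  # only reached for the empty string
```

For example, `"abc"` is a seed of `"abcabcab"` but not a cover of it: the final `"ab"` is covered only by an occurrence that overhangs the right end.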
Quasi-Periodicity Under Mismatch Errors
Tracing regularities plays a key role in data analysis for various areas of science, including coding and automata theory, formal language theory, combinatorics, molecular biology and many others. Part of the scientific process is understanding and explaining these regularities. A common notion to describe regularity in a string T is a cover or quasi-period, which is a string C for which every letter of T lies within some occurrence of C. In many applications finding exact repetitions is not sufficient, due to the presence of errors. In this paper we initiate the study of quasi-periodicity persistence under mismatch errors; our goal is to characterize situations where a given quasi-periodic string remains quasi-periodic even after substitution errors have been introduced into the string. Our study results in proving necessary conditions as well as a theorem stating sufficient conditions for quasi-periodicity persistence. As an application, we are able to close the gap in understanding the complexity of the Approximate Cover Problem (ACP) relaxations studied by [Amir 2017a, Amir 2017b] and solve an open question.