Repetition Detection in a Dynamic String
A string UU for a non-empty string U is called a square. Squares have been well-studied both from a combinatorial and an algorithmic perspective. In this paper, we are the first to consider the problem of maintaining a representation of the squares in a dynamic string S of length at most n. We present an algorithm that updates this representation in n^o(1) time. This representation allows us to report a longest square-substring of S in O(1) time and all square-substrings of S in O(output) time. We achieve this by introducing a novel tool: maintaining prefix-suffix matches of two dynamic strings.
We extend the above result to address the problem of maintaining a representation of all runs (maximal repetitions) of the string. Runs are known to capture the periodic structure of a string, and, as an application, we show that our representation of runs allows us to efficiently answer periodicity queries for substrings of a dynamic string. These queries have proven useful in static pattern matching problems and our techniques have the potential of offering solutions to these problems in a dynamic text setting.
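To make the definition of a square concrete, here is a minimal brute-force sketch for a static string. It is not the dynamic data structure described in the abstract; the function names `square_substrings` and `longest_square` are hypothetical, and the running time is far from the bounds stated above.

```python
def square_substrings(s):
    """Return all (start, half_length) pairs where
    s[start:start + 2*half_length] has the form UU with U non-empty."""
    n = len(s)
    found = []
    for half in range(1, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            # Compare the two halves of the candidate square.
            if s[start:start + half] == s[start + half:start + 2 * half]:
                found.append((start, half))
    return found


def longest_square(s):
    """Return one longest square-substring of s, or None if s is square-free."""
    occurrences = square_substrings(s)
    if not occurrences:
        return None
    start, half = max(occurrences, key=lambda pair: pair[1])
    return s[start:start + 2 * half]


print(longest_square("abaabaa"))  # abaaba
```

A dynamic structure would have to avoid recomputing this from scratch after every edit, which is where the prefix-suffix matching tool mentioned in the abstract comes in.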
How many double squares can a string contain?
Counting the types of squares rather than their occurrences, we consider the
problem of bounding the number of distinct squares in a string. Fraenkel and
Simpson showed in 1998 that a string of length n contains at most 2n distinct
squares. Ilie presented in 2007 an asymptotic upper bound of 2n - Theta(log n).
We show that a string of length n contains at most 5n/3 distinct squares. This
new upper bound is obtained by investigating the combinatorial structure of
double squares and showing that a string of length n contains at most 2n/3
double squares. In addition, the established structural properties provide a
novel proof of Fraenkel and Simpson's result. Comment: 29 pages, 20 figures
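The difference between counting occurrences and counting distinct squares can be illustrated with a small brute-force sketch. The helper name `distinct_squares` is hypothetical, and the exhaustive loop only checks the 5n/3 bound quoted above on short binary strings; it says nothing about the proof technique.

```python
from itertools import product


def distinct_squares(s):
    """Return the set of distinct substrings of s that have the form UU."""
    n = len(s)
    squares = set()
    for half in range(1, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            if s[start:start + half] == s[start + half:start + 2 * half]:
                squares.add(s[start:start + 2 * half])
    return squares


# "aa" occurs twice in "abaabaa" but is counted once.
print(sorted(distinct_squares("abaabaa")))  # ['aa', 'abaaba', 'baabaa']

# Sanity check of the 5n/3 bound on all binary strings of length <= 10.
for n in range(1, 11):
    for letters in product("ab", repeat=n):
        assert 3 * len(distinct_squares("".join(letters))) <= 5 * n
```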
On the Parikh-de-Bruijn grid
We introduce the Parikh-de-Bruijn grid, a graph whose vertices are
fixed-order Parikh vectors, and whose edges are given by a simple shift
operation. This graph gives structural insight into the nature of sets of
Parikh vectors as well as that of the Parikh set of a given string. We show its
utility by proving some results on Parikh-de-Bruijn strings, the abelian analog
of de-Bruijn sequences. Comment: 18 pages, 3 figures, 1 table
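As a rough illustration of Parikh vectors, the sketch below computes the Parikh vectors of all length-k substrings of a string with a sliding window. It assumes that the "order" of a Parikh vector is the sum of its entries and that the shift operation corresponds to dropping the first letter of the window and appending a new one; both readings are assumptions, and the names `parikh_vector` and `parikh_set` are hypothetical.

```python
from collections import Counter


def parikh_vector(word, alphabet):
    """Count the occurrences of each alphabet letter in word."""
    counts = Counter(word)
    return tuple(counts[c] for c in alphabet)


def parikh_set(s, k, alphabet):
    """Parikh vectors of all length-k substrings of s (sliding window)."""
    if k <= 0 or k > len(s):
        return set()
    index = {c: i for i, c in enumerate(alphabet)}
    vec = [0] * len(alphabet)
    for c in s[:k]:
        vec[index[c]] += 1
    result = {tuple(vec)}
    for i in range(k, len(s)):
        # Shift: remove the letter leaving the window, add the one entering.
        vec[index[s[i - k]]] -= 1
        vec[index[s[i]]] += 1
        result.add(tuple(vec))
    return result


print(parikh_vector("abba", "ab"))          # (2, 2)
print(sorted(parikh_set("aabab", 2, "ab")))  # [(1, 1), (2, 0)]
```

Each shift changes at most two coordinates by one, so consecutive vectors are neighbours in a graph on the order-k Parikh vectors.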
The Number of Repetitions in 2D-Strings
The notions of periodicity and repetitions in strings, and hence those of
runs and squares, naturally extend to two-dimensional strings. We consider two
types of repetitions in 2D-strings: 2D-runs and quartics (quartics are a
2D-version of squares in standard strings). Amir et al. introduced 2D-runs,
showed that there are of them in an 2D-string and
presented a simple construction giving a lower bound of for their
number (TCS 2020). We make a significant step towards closing the gap between
these bounds by showing that the number of 2D-runs in an 2D-string
is . In particular, our bound implies that the run-time of the algorithm of Amir et al. for computing
2D-runs is also . We expect this result to allow for
exploiting 2D-runs algorithmically in the area of 2D pattern matching.
A quartic is a 2D-string composed of identical blocks
(2D-strings) that was introduced by Apostolico and Brimkov (TCS 2000), where by
quartics they meant only primitively rooted quartics, i.e. built of a primitive
block. Here our notion of quartics is more general and analogous to that of
squares in 1D-strings. Apostolico and Brimkov showed that there are occurrences of primitively rooted quartics in an
2D-string and that this bound is attainable. Consequently the number of
distinct primitively rooted quartics is . Here, we prove that
the number of distinct general quartics is also . This extends
the rich combinatorial study of the number of distinct squares in a 1D-string,
that was initiated by Fraenkel and Simpson (J. Comb. Theory A 1998), to two
dimensions.
Finally, we show some algorithmic applications of 2D-runs. (Abstract
shortened due to arXiv requirements.) Comment: To appear in the ESA 2020 proceedings
Fast Algorithm for Partial Covers in Words
A factor u of a word w is a cover of w if every position in w lies
within some occurrence of u in w. A word w covered by u thus
generalizes the idea of a repetition, that is, a word composed of exact
concatenations of u. In this article we introduce a new notion of
α-partial cover, which can be viewed as a relaxed variant of cover, that
is, a factor covering at least α positions in w. We develop a data
structure of size (where ) that can be constructed in time which we apply to compute all shortest α-partial covers for a
given α. We also employ it for an -time algorithm computing
a shortest α-partial cover for each
- …
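The notions of cover and partial cover can be sketched with a naive computation. Here a factor u of a word w covers a position if that position lies inside some occurrence of u in w, and the threshold is written `alpha`. This is a brute-force illustration only, not the data structure from the abstract, and the function names are hypothetical.

```python
def covered_positions(w, u):
    """Number of positions of w lying inside some occurrence of u."""
    covered = [False] * len(w)
    start = w.find(u)
    while start != -1:
        for i in range(start, start + len(u)):
            covered[i] = True
        # Search from start + 1 so that overlapping occurrences are found.
        start = w.find(u, start + 1)
    return sum(covered)


def is_cover(w, u):
    """True if every position of w lies within an occurrence of u."""
    return len(u) > 0 and covered_positions(w, u) == len(w)


def shortest_partial_covers(w, alpha):
    """All shortest factors of w covering at least alpha positions."""
    n = len(w)
    for length in range(1, n + 1):
        factors = {w[i:i + length] for i in range(n - length + 1)}
        hits = sorted(u for u in factors if covered_positions(w, u) >= alpha)
        if hits:
            return hits
    return []


print(is_cover("ababa", "aba"))             # True
print(shortest_partial_covers("ababa", 4))  # ['ab', 'ba']
```

With `alpha` equal to the length of w, the second function returns the shortest ordinary covers, which shows how the partial notion relaxes the classical one.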