14,254 research outputs found

    Repetition Detection in a Dynamic String

    Get PDF
    A string UU for a non-empty string U is called a square. Squares have been well-studied both from a combinatorial and an algorithmic perspective. In this paper, we are the first to consider the problem of maintaining a representation of the squares in a dynamic string S of length at most n. We present an algorithm that updates this representation in n^o(1) time. This representation allows us to report a longest square-substring of S in O(1) time and all square-substrings of S in O(output) time. We achieve this by introducing a novel tool - maintaining prefix-suffix matches of two dynamic strings. We extend the above result to address the problem of maintaining a representation of all runs (maximal repetitions) of the string. Runs are known to capture the periodic structure of a string, and, as an application, we show that our representation of runs allows us to efficiently answer periodicity queries for substrings of a dynamic string. These queries have proven useful in static pattern matching problems and our techniques have the potential of offering solutions to these problems in a dynamic text setting

    How many double squares can a string contain?

    Full text link
    Counting the types of squares rather than their occurrences, we consider the problem of bounding the number of distinct squares in a string. Fraenkel and Simpson showed in 1998 that a string of length n contains at most 2n distinct squares. Ilie presented in 2007 an asymptotic upper bound of 2n - Theta(log n). We show that a string of length n contains at most 5n/3 distinct squares. This new upper bound is obtained by investigating the combinatorial structure of double squares and showing that a string of length n contains at most 2n/3 double squares. In addition, the established structural properties provide a novel proof of Fraenkel and Simpson's result.Comment: 29 pages, 20 figure

    On the Parikh-de-Bruijn grid

    Full text link
    We introduce the Parikh-de-Bruijn grid, a graph whose vertices are fixed-order Parikh vectors, and whose edges are given by a simple shift operation. This graph gives structural insight into the nature of sets of Parikh vectors as well as that of the Parikh set of a given string. We show its utility by proving some results on Parikh-de-Bruijn strings, the abelian analog of de-Bruijn sequences.Comment: 18 pages, 3 figures, 1 tabl

    The Number of Repetitions in 2D-Strings

    Get PDF
    The notions of periodicity and repetitions in strings, and hence these of runs and squares, naturally extend to two-dimensional strings. We consider two types of repetitions in 2D-strings: 2D-runs and quartics (quartics are a 2D-version of squares in standard strings). Amir et al. introduced 2D-runs, showed that there are O(n3)O(n^3) of them in an n×nn \times n 2D-string and presented a simple construction giving a lower bound of Ω(n2)\Omega(n^2) for their number (TCS 2020). We make a significant step towards closing the gap between these bounds by showing that the number of 2D-runs in an n×nn \times n 2D-string is O(n2log2n)O(n^2 \log^2 n). In particular, our bound implies that the O(n2logn+output)O(n^2\log n + \textsf{output}) run-time of the algorithm of Amir et al. for computing 2D-runs is also O(n2log2n)O(n^2 \log^2 n). We expect this result to allow for exploiting 2D-runs algorithmically in the area of 2D pattern matching. A quartic is a 2D-string composed of 2×22 \times 2 identical blocks (2D-strings) that was introduced by Apostolico and Brimkov (TCS 2000), where by quartics they meant only primitively rooted quartics, i.e. built of a primitive block. Here our notion of quartics is more general and analogous to that of squares in 1D-strings. Apostolico and Brimkov showed that there are O(n2log2n)O(n^2 \log^2 n) occurrences of primitively rooted quartics in an n×nn \times n 2D-string and that this bound is attainable. Consequently the number of distinct primitively rooted quartics is O(n2log2n)O(n^2 \log^2 n). Here, we prove that the number of distinct general quartics is also O(n2log2n)O(n^2 \log^2 n). This extends the rich combinatorial study of the number of distinct squares in a 1D-string, that was initiated by Fraenkel and Simpson (J. Comb. Theory A 1998), to two dimensions. Finally, we show some algorithmic applications of 2D-runs. (Abstract shortened due to arXiv requirements.)Comment: To appear in the ESA 2020 proceeding

    Fast Algorithm for Partial Covers in Words

    Get PDF
    A factor uu of a word ww is a cover of ww if every position in ww lies within some occurrence of uu in ww. A word ww covered by uu thus generalizes the idea of a repetition, that is, a word composed of exact concatenations of uu. In this article we introduce a new notion of α\alpha-partial cover, which can be viewed as a relaxed variant of cover, that is, a factor covering at least α\alpha positions in ww. We develop a data structure of O(n)O(n) size (where n=wn=|w|) that can be constructed in O(nlogn)O(n\log n) time which we apply to compute all shortest α\alpha-partial covers for a given α\alpha. We also employ it for an O(nlogn)O(n\log n)-time algorithm computing a shortest α\alpha-partial cover for each α=1,2,,n\alpha=1,2,\ldots,n
    corecore