40 research outputs found

    Comparison of LZ77-type Parsings

    Full text link
    We investigate the relations between different variants of the LZ77 parsing existing in the literature. All of them are defined as greedily constructed parsings encoding each phrase by reference to a string occurring earlier in the input. They differ by the phrase encodings: encoded by pairs (length + position of an earlier occurrence) or by triples (length + position of an earlier occurrence + the letter following the earlier occurring part); and they differ by allowing or not allowing overlaps between the phrase and its earlier occurrence. For a given string of length nn over an alphabet of size σ\sigma, denote the numbers of phrases in the parsings allowing (resp., not allowing) overlaps by zz (resp., z^\hat{z}) for "pairs", and by z3z_3 (resp., z^3\hat{z}_3) for "triples". We prove the following bounds and provide series of examples showing that these bounds are tight: \bullet zz^zO(lognzlogσz)z \le \hat{z} \le z \cdot O(\log\frac{n}{z\log_\sigma z}) and z3z^3z3O(lognz3logσz3)z_3 \le \hat{z}_3 \le z_3 \cdot O(\log\frac{n}{z_3\log_\sigma z_3}); \bullet 12z^<z^3z^\frac{1}2\hat{z} < \hat{z}_3 \le \hat{z} and 12z<z3z\frac{1}2 z < z_3 \le z.Comment: 6 page

    EERTREE: An Efficient Data Structure for Processing Palindromes in Strings

    Full text link
    We propose a new linear-size data structure which provides a fast access to all palindromic substrings of a string or a set of strings. This structure inherits some ideas from the construction of both the suffix trie and suffix tree. Using this structure, we present simple and efficient solutions for a number of problems involving palindromes.Comment: 21 pages, 2 figures. Accepted to IWOCA 201

    On the Combinatorics of Palindromes and Antipalindromes

    Full text link
    We prove a number of results on the structure and enumeration of palindromes and antipalindromes. In particular, we study conjugates of palindromes, palindromic pairs, rich words, and the counterparts of these notions for antipalindromes.Comment: 13 pages/ submitted to DLT 201

    Subword complexity and power avoidance

    Get PDF
    We begin a systematic study of the relations between subword complexity of infinite words and their power avoidance. Among other things, we show that -- the Thue-Morse word has the minimum possible subword complexity over all overlap-free binary words and all (73)(\frac 73)-power-free binary words, but not over all (73)+(\frac 73)^+-power-free binary words; -- the twisted Thue-Morse word has the maximum possible subword complexity over all overlap-free binary words, but no word has the maximum subword complexity over all (73)(\frac 73)-power-free binary words; -- if some word attains the minimum possible subword complexity over all square-free ternary words, then one such word is the ternary Thue word; -- the recently constructed 1-2-bonacci word has the minimum possible subword complexity over all \textit{symmetric} square-free ternary words.Comment: 29 pages. Submitted to TC

    Searching Long Repeats in Streams

    Get PDF
    We consider two well-known related problems: Longest Repeated Substring (LRS) and Longest Repeated Reversed Substring (LRRS). Their streaming versions cannot be solved exactly; we show that only approximate solutions by Monte Carlo algorithms are possible, and prove a lower bound on consumed memory. For both problems, we present purely linear-time Monte Carlo algorithms working in O(E + n/E) space, where E is the additive approximation error. Within the same space bounds, we then present nearly real-time solutions, which require O(log n) time per symbol and O(n + n/E log n) time overall. The working space exactly matches the lower bound whenever E=O(n^{0.5}) and the size of the alphabet is Omega(n^{0.01})

    Palindromic k-Factorization in Pure Linear Time

    Get PDF
    Given a string s of length n over a general alphabet and an integer k, the problem is to decide whether s is a concatenation of k nonempty palindromes. Two previously known solutions for this problem work in time O(kn) and O(nlog n) respectively. Here we settle the complexity of this problem in the word-RAM model, presenting an O(n)-time online deciding algorithm. The algorithm simultaneously finds the minimum odd number of factors and the minimum even number of factors in a factorization of a string into nonempty palindromes. We also demonstrate how to get an explicit factorization of s into k palindromes with an O(n)-time offline postprocessing

    Binary Patterns in Binary Cube-Free Words: Avoidability and Growth

    Get PDF
    The avoidability of binary patterns by binary cube-free words is investigated and the exact bound between unavoidable and avoidable patterns is found. All avoidable patterns are shown to be D0L-avoidable. For avoidable patterns, the growth rates of the avoiding languages are studied. All such languages, except for the overlap-free language, are proved to have exponential growth. The exact growth rates of languages avoiding minimal avoidable patterns are approximated through computer-assisted upper bounds. Finally, a new example of a pattern-avoiding language of polynomial growth is given.Comment: 18 pages, 2 tables; submitted to RAIRO TIA (Special issue of Mons Days 2012
    corecore