164 research outputs found

    On the Impact of Morphisms on BWT-Runs

    Morphisms are widely studied combinatorial objects that can be used for generating infinite families of words. In the context of information theory, injective morphisms are called (variable-length) codes. In data compression, morphisms, combined with parsing techniques, have recently been used to define new mechanisms to generate repetitive words. Here, we show that the repetitiveness induced by applying a morphism to a word can be captured by a compression scheme based on the Burrows-Wheeler Transform (BWT). In fact, we prove that, unlike other compression-based repetitiveness measures, the measure r_bwt (which counts the number of equal-letter runs produced by applying the BWT to a word) strongly depends on the applied morphism. In more detail, we characterize the binary morphisms that preserve the value of r_bwt(w) when applied to any binary word w containing both letters. They are precisely the Sturmian morphisms, which are well-known objects in combinatorics on words. Moreover, we prove that it is always possible to find a binary morphism that, when applied to any binary word containing both letters, increases the number of BWT equal-letter runs by a given (even) number. In addition, we derive a method for constructing arbitrarily large families of binary words on which the BWT produces a given (even) number of new equal-letter runs. Such results are obtained by using a new class of morphisms that we call Thue-Morse-like. Finally, we show that there exist binary morphisms μ for which it is possible to find words w such that the difference r_bwt(μ(w))-r_bwt(w) is arbitrarily large.
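
As a rough illustration of the measure r_bwt described in this abstract, the sketch below computes the BWT by sorting cyclic rotations (assuming the variant without an end-of-string marker) and counts the equal-letter runs. The function names `bwt`, `r_bwt`, and `apply_morphism` are hypothetical, and the two morphisms are standard textbook examples (the Fibonacci morphism, which is Sturmian, and the classical Thue-Morse morphism, which is not), not necessarily the ones used in the paper.

```python
def bwt(word: str) -> str:
    """BWT via sorted cyclic rotations (assumption: no end marker is appended)."""
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


def r_bwt(word: str) -> int:
    """Number of maximal equal-letter runs in the BWT of the word."""
    transformed = bwt(word)
    return sum(
        1
        for i, letter in enumerate(transformed)
        if i == 0 or letter != transformed[i - 1]
    )


def apply_morphism(morphism: dict, word: str) -> str:
    """Replace every letter of the word by its image under the morphism."""
    return "".join(morphism[letter] for letter in word)


# Illustrative morphisms (not taken from the paper).
fibonacci = {"a": "ab", "b": "a"}      # a Sturmian morphism
thue_morse = {"a": "ab", "b": "ba"}    # not a Sturmian morphism

w = "aab"
print(r_bwt(w))                              # 2
print(r_bwt(apply_morphism(fibonacci, w)))   # 2, the run count is preserved
print(r_bwt(apply_morphism(thue_morse, w)))  # 4, the run count grows by 2
```

The quadratic-space rotation sort is chosen only for readability; it is not meant for long inputs.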

    A Characterization of Infinite LSP Words

    G. Fici proved that a finite word has a minimal suffix automaton if and only if all its left special factors occur as prefixes. He called LSP all finite and infinite words having this latter property. We characterize here infinite LSP words in terms of $S$-adicity. More precisely we provide a finite set of morphisms $S$ and an automaton $\mathcal{A}$ such that an infinite word is LSP if and only if it is $S$-adic and all its directive words are recognizable by $\mathcal{A}$.

    Palindromic Length of Words with Many Periodic Palindromes

    The palindromic length $\text{PL}(v)$ of a finite word $v$ is the minimal number of palindromes whose concatenation is equal to $v$. In 2013, Frid, Puzynina, and Zamboni conjectured that: If $w$ is an infinite word and $k$ is an integer such that $\text{PL}(u)\leq k$ for every factor $u$ of $w$ then $w$ is ultimately periodic. Suppose that $w$ is an infinite word and $k$ is an integer such that $\text{PL}(u)\leq k$ for every factor $u$ of $w$. Let $\Omega(w,k)$ be the set of all factors $u$ of $w$ that have more than $\sqrt[k]{k^{-1}\vert u\vert}$ palindromic prefixes. We show that $\Omega(w,k)$ is an infinite set and we show that for each positive integer $j$ there are palindromes $a,b$ and a word $u\in \Omega(w,k)$ such that $(ab)^j$ is a factor of $u$ and $b$ is nonempty. Note that $(ab)^j$ is a periodic word and $(ab)^ia$ is a palindrome for each $i\leq j$. These results justify the following question: What is the palindromic length of a concatenation of a suffix of $b$ and a periodic word $(ab)^j$ with "many" periodic palindromes? It is known that $\lvert\text{PL}(uv)-\text{PL}(u)\rvert\leq \text{PL}(v)$, where $u$ and $v$ are nonempty words. The main result of our article shows that if $a,b$ are palindromes, $b$ is nonempty, $u$ is a nonempty suffix of $b$, $\vert ab\vert$ is the minimal period of $aba$, and $j$ is a positive integer with $j\geq 3\text{PL}(u)$ then $\text{PL}(u(ab)^j)-\text{PL}(u)\geq 0$.
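
The definition of palindromic length in this abstract can be made concrete with a small dynamic program. This is only a minimal sketch using a straightforward quadratic-time approach, not an algorithm from the article; the function name `palindromic_length` is hypothetical.

```python
def palindromic_length(word: str) -> int:
    """Minimal number of palindromes whose concatenation equals the word."""
    n = len(word)
    # is_pal[i][j] is True when word[i..j] (inclusive) is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if word[i] == word[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
    # best[k] is the palindromic length of the prefix word[:k].
    best = [0] + [n] * n
    for end in range(1, n + 1):
        best[end] = min(
            best[start] + 1 for start in range(end) if is_pal[start][end - 1]
        )
    return best[n]


print(palindromic_length("abaab"))  # 2, for example "a" + "baab"

# A tiny instance of the setting in the abstract: a = "a", b = "b", u = "b", j = 3.
u, period, j = "b", "ab", 3
print(palindromic_length(u + period * j) - palindromic_length(u))  # 0
```

In the second example, u(ab)^3 = "bababab" is itself a palindrome, so the difference is 0, which is consistent with the stated inequality.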

    Minimal Forbidden Factors of Circular Words

    Minimal forbidden factors are a useful tool for investigating properties of words and languages. Two factorial languages are distinct if and only if they have different (antifactorial) sets of minimal forbidden factors. There exist algorithms for computing the minimal forbidden factors of a word, as well as of a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an algorithm that, given the trie recognizing a finite antifactorial language $M$, computes a DFA recognizing the language whose set of minimal forbidden factors is $M$. In the same paper, they showed that the obtained DFA is minimal if the input trie recognizes the minimal forbidden factors of a single word. We generalize this result to the case of a circular word. We discuss several combinatorial properties of the minimal forbidden factors of a circular word. As a byproduct, we obtain a formal definition of the factor automaton of a circular word. Finally, we investigate the case of minimal forbidden factors of the circular Fibonacci words. Comment: To appear in Theoretical Computer Science

    A Characterization of Bispecial Sturmian Words

    A finite Sturmian word w over the alphabet {a,b} is left special (resp. right special) if aw and bw (resp. wa and wb) are both Sturmian words. A bispecial Sturmian word is a Sturmian word that is both left and right special. Our main result is that bispecial Sturmian words are exactly the maximal internal factors of Christoffel words, which are words coding the digital approximations of segments in the Euclidean plane. This result is an extension of the known relation between central words and primitive Christoffel words. Our characterization allows us to give an enumerative formula for bispecial Sturmian words. We also investigate the minimal forbidden words for the set of Sturmian words. Comment: Accepted to MFCS 201

    Timing of Millisecond Pulsars in NGC 6752: Evidence for a High Mass-to-Light Ratio in the Cluster Core

    Using pulse timing observations we have obtained precise parameters, including positions with about 20 mas accuracy, of five millisecond pulsars in NGC 6752. Three of them, located relatively close to the cluster center, have line-of-sight accelerations larger than the maximum value predicted by the central mass density derived from optical observations, providing dynamical evidence for a central mass-to-light ratio ≳ 10, much higher than for any other globular cluster. It is likely that the other two millisecond pulsars have been ejected out of the core to their present locations at 1.4 and 3.3 half-mass radii, respectively, suggesting unusual non-thermal dynamics in the cluster core. Comment: Accepted by ApJ Letter. 5 pages, 2 figures, 1 table

    Palindromic Decompositions with Gaps and Errors

    Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with the HIV virus. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing gaps in decompositions and errors in palindromes, and also by imposing a lower bound on the length of acceptable palindromes. We first present an algorithm for obtaining a palindromic decomposition of a string of length n with the minimal total gap length in time O(n log n * g) and space O(n g), where g is the number of allowed gaps in the decomposition. We then consider a decomposition of the string into maximal δ-palindromes (i.e. palindromes with δ errors under the edit or Hamming distance) with g allowed gaps. We present an algorithm to obtain such a decomposition with the minimal total gap length in time O(n (g + δ)) and space O(n g). Comment: accepted to CSR 201

    Words with the Maximum Number of Abelian Squares

    An abelian square is the concatenation of two words that are anagrams of one another. A word of length $n$ can contain $\Theta(n^2)$ distinct factors that are abelian squares. We study infinite words such that the number of abelian square factors of length $n$ grows quadratically with $n$. Comment: To appear in the proceedings of WORDS 201
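
To make the notion of an abelian square in this abstract concrete, here is a brute-force sketch that enumerates the distinct abelian square factors of a finite word. The helper names `is_abelian_square` and `distinct_abelian_squares` are hypothetical, and the enumeration is purely illustrative, not a method from the paper.

```python
from collections import Counter


def is_abelian_square(word: str) -> bool:
    """True if the word is nonempty, has even length, and its halves are anagrams."""
    half, remainder = divmod(len(word), 2)
    return (
        remainder == 0
        and half > 0
        and Counter(word[:half]) == Counter(word[half:])
    )


def distinct_abelian_squares(word: str) -> set:
    """Set of distinct factors of the word that are abelian squares (brute force)."""
    n = len(word)
    return {
        word[i:j]
        for i in range(n)
        for j in range(i + 2, n + 1, 2)  # only even-length factors
        if is_abelian_square(word[i:j])
    }


print(sorted(distinct_abelian_squares("abba")))  # ['abba', 'bb']
```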