On the Impact of Morphisms on BWT-Runs
Morphisms are widely studied combinatorial objects that can be used for generating infinite families of words. In information theory, injective morphisms are called (variable-length) codes. In data compression, morphisms, combined with parsing techniques, have recently been used to define new mechanisms for generating repetitive words. Here, we show that the repetitiveness induced by applying a morphism to a word can be captured by a compression scheme based on the Burrows-Wheeler Transform (BWT). In fact, we prove that, unlike other compression-based repetitiveness measures, the measure r_bwt (which counts the number of equal-letter runs produced by applying the BWT to a word) strongly depends on the applied morphism. In more detail, we characterize the binary morphisms that preserve the value of r_bwt(w) when applied to any binary word w containing both letters. They are precisely the Sturmian morphisms, which are well-known objects in combinatorics on words. Moreover, we prove that it is always possible to find a binary morphism that, when applied to any binary word containing both letters, increases the number of equal-letter BWT runs by a given (even) number. In addition, we derive a method for constructing arbitrarily large families of binary words on which the BWT produces a given (even) number of new equal-letter runs. These results are obtained by using a new class of morphisms that we call Thue-Morse-like. Finally, we show that there exist binary morphisms μ for which it is possible to find words w such that the difference r_bwt(μ(w)) - r_bwt(w) is arbitrarily large.
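The measure r_bwt described in this abstract can be explored with a small brute-force sketch. It assumes the BWT is taken as the last column of the sorted cyclic rotations, with no end-of-string marker; the abstract does not state which variant the paper uses. The function names and the example morphism (the Fibonacci morphism a -> ab, b -> a, which is Sturmian) are illustrative choices and do not come from the paper.

```python
def bwt(word: str) -> str:
    """Last column of the sorted cyclic rotations of `word` (assumed variant, no end marker)."""
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


def r_bwt(word: str) -> int:
    """Number of maximal equal-letter runs in bwt(word)."""
    transformed = bwt(word)
    return sum(
        1
        for i, letter in enumerate(transformed)
        if i == 0 or letter != transformed[i - 1]
    )


def apply_morphism(morphism: dict, word: str) -> str:
    """Replace every letter of `word` by its image under `morphism`."""
    return "".join(morphism[letter] for letter in word)


# Illustrative assumption: the Fibonacci morphism, a standard Sturmian morphism.
FIBONACCI = {"a": "ab", "b": "a"}

if __name__ == "__main__":
    for w in ["ab", "aab", "abaab"]:
        image = apply_morphism(FIBONACCI, w)
        print(w, r_bwt(w), image, r_bwt(image))
```

Sorting all rotations is quadratic or worse, so this is only suitable for short words when experimenting with how a given morphism changes the run count.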
A Characterization of Infinite LSP Words
G. Fici proved that a finite word has a minimal suffix automaton if and only
if all its left special factors occur as prefixes. He called LSP all finite and
infinite words having this latter property. We characterize here infinite LSP
words in terms of S-adicity. More precisely, we provide a finite set S of
morphisms and an automaton such that an infinite word is LSP if
and only if it is S-adic and all its directive words are recognizable by
this automaton.
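For finite words, the LSP property mentioned in this abstract can be checked by brute force. The sketch below assumes the usual definition of a left special factor (a factor u such that au and bu are both factors for two distinct letters a and b); the function names are illustrative and not taken from the paper.

```python
def factors(word: str) -> set:
    """All factors (contiguous substrings) of `word`, including the empty word."""
    return {word[i:j] for i in range(len(word) + 1) for j in range(i, len(word) + 1)}


def left_special_factors(word: str) -> set:
    """Factors that can be extended to the left by at least two distinct letters."""
    facs = factors(word)
    alphabet = set(word)
    return {u for u in facs if sum((a + u) in facs for a in alphabet) >= 2}


def is_lsp(word: str) -> bool:
    """True when every left special factor of `word` is also a prefix of `word`."""
    return all(word.startswith(u) for u in left_special_factors(word))


if __name__ == "__main__":
    print(sorted(left_special_factors("abaababa")), is_lsp("abaababa"))
    print(sorted(left_special_factors("abba")), is_lsp("abba"))
```

The enumeration of all factors is quadratic in the word length, which is fine for small experiments but is not how one would test the property on long inputs.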
Palindromic Length of Words with Many Periodic Palindromes
The palindromic length PL(v) of a finite word v is the minimal
number of palindromes whose concatenation is equal to v. In 2013, Frid,
Puzynina, and Zamboni conjectured that if w is an infinite word and k is
an integer such that PL(u) ≤ k for every factor u of w, then w
is ultimately periodic.
Suppose that w is an infinite word and k is an integer such
that PL(u) ≤ k for every factor u of w. Let be the set
of all factors of that have more than
palindromic prefixes. We show that is an infinite set and we show
that for each positive integer there are palindromes and a word such that is a factor of and is nonempty. Note
that is a periodic word and is a palindrome for each . These results justify the following question: What is the palindromic
length of a concatenation of a suffix of and a periodic word with
"many" periodic palindromes?
It is known that |PL(uv) − PL(u)| ≤ PL(v),
where PL denotes the palindromic length and u and v are nonempty words. The main result of our article shows that
if are palindromes, is nonempty, is a nonempty suffix of ,
is the minimal period of , and is a positive integer
with then
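The palindromic length defined at the start of this abstract can be computed for short words with a simple dynamic program. This is a minimal quadratic sketch, not the method of the paper; the function name is an illustrative assumption.

```python
def palindromic_length(word: str) -> int:
    """Minimal number of palindromes whose concatenation equals `word`."""
    n = len(word)
    # is_pal[i][j] is True when word[i:j+1] is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if word[i] == word[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
    # best[end] is the palindromic length of the prefix word[:end].
    best = [0] * (n + 1)
    for end in range(1, n + 1):
        best[end] = min(
            best[start] + 1 for start in range(end) if is_pal[start][end - 1]
        )
    return best[n]


if __name__ == "__main__":
    for w in ["aba", "ab", "abaab", "abc"]:
        print(w, palindromic_length(w))
```

Every single letter is a palindrome, so the inner minimum is always taken over a nonempty set and the value never exceeds the length of the word.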
Minimal Forbidden Factors of Circular Words
Minimal forbidden factors are a useful tool for investigating properties of
words and languages. Two factorial languages are distinct if and only if they
have different (antifactorial) sets of minimal forbidden factors. There exist
algorithms for computing the minimal forbidden factors of a word, as well as of
a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an
algorithm that, given the trie recognizing a finite antifactorial language,
computes a DFA recognizing the language whose set of minimal forbidden factors
is that antifactorial language. In the same paper, they showed that the obtained DFA is minimal if the
input trie recognizes the minimal forbidden factors of a single word. We
generalize this result to the case of a circular word. We discuss several
combinatorial properties of the minimal forbidden factors of a circular word.
As a byproduct, we obtain a formal definition of the factor automaton of a
circular word. Finally, we investigate the case of minimal forbidden factors of
the circular Fibonacci words.
Comment: To appear in Theoretical Computer Science
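As a small illustration of the central notion in this abstract, the sketch below lists the minimal forbidden factors of an ordinary (linear) finite word by brute force. It does not handle the circular case studied in the paper, and it is unrelated to the trie-to-DFA construction of Crochemore et al. It assumes the standard definition: a word is minimal forbidden if it is not a factor but all of its proper factors are. The function name and the optional `alphabet` parameter are illustrative.

```python
def minimal_forbidden_factors(word: str, alphabet=None) -> set:
    """Minimal forbidden factors of a linear word over `alphabet` (default: its own letters)."""
    letters = sorted(set(word) if alphabet is None else set(alphabet))
    facs = {word[i:j] for i in range(len(word) + 1) for j in range(i, len(word) + 1)}
    # A letter of the alphabet that never occurs is forbidden and trivially minimal.
    result = {a for a in letters if a not in facs}
    for x in facs:
        if not x:
            continue
        for b in letters:
            candidate = x + b
            # x (longest proper prefix) is a factor by construction; the longest
            # proper suffix must also be a factor while the candidate is not.
            if candidate not in facs and candidate[1:] in facs:
                result.add(candidate)
    return result


if __name__ == "__main__":
    print(sorted(minimal_forbidden_factors("aab")))
```

Checking only the longest proper prefix and suffix is enough, because the set of factors of a word is closed under taking factors.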
A Characterization of Bispecial Sturmian Words
A finite Sturmian word w over the alphabet {a,b} is left special (resp. right
special) if aw and bw (resp. wa and wb) are both Sturmian words. A bispecial
Sturmian word is a Sturmian word that is both left and right special. We show
as a main result that bispecial Sturmian words are exactly the maximal internal
factors of Christoffel words, which are words coding the digital approximations
of segments in the Euclidean plane. This result is an extension of the known
relation between central words and primitive Christoffel words. Our
characterization allows us to give an enumerative formula for bispecial
Sturmian words. We also investigate the minimal forbidden words for the set of
Sturmian words.
Comment: Accepted to MFCS 201
Timing of Millisecond Pulsars in NGC 6752: Evidence for a High Mass-to-Light Ratio in the Cluster Core
Using pulse timing observations we have obtained precise parameters,
including positions with about 20 mas accuracy, of five millisecond pulsars in
NGC 6752. Three of them, located relatively close to the cluster center, have
line-of-sight accelerations larger than the maximum value predicted by the
central mass density derived from optical observation, providing dynamical
evidence for a central mass-to-light ratio >~ 10, much higher than for any
other globular cluster. It is likely that the other two millisecond pulsars
have been ejected out of the core to their present locations at 1.4 and 3.3
half-mass radii, respectively, suggesting unusual non-thermal dynamics in the
cluster core.
Comment: Accepted by ApJ Letters. 5 pages, 2 figures, 1 table
Palindromic Decompositions with Gaps and Errors
Identifying palindromes in sequences has been an interesting line of research
in combinatorics on words and also in computational biology, after the
discovery of the relation of palindromes in the DNA sequence with the HIV
virus. Efficient algorithms for the factorization of sequences into palindromes
and maximal palindromes have been devised in recent years. We extend these
studies by allowing gaps in decompositions and errors in palindromes, and by
imposing a lower bound on the length of acceptable palindromes.
We first present an algorithm for obtaining a palindromic decomposition of a
string of length n with the minimal total gap length in time O(n log n * g) and
space O(n g), where g is the number of allowed gaps in the decomposition. We
then consider a decomposition of the string in maximal \delta-palindromes (i.e.
palindromes with \delta errors under the edit or Hamming distance) and g
allowed gaps. We present an algorithm to obtain such a decomposition with the
minimal total gap length in time O(n (g + \delta)) and space O(n g).
Comment: accepted to CSR 201
Words with the Maximum Number of Abelian Squares
An abelian square is the concatenation of two words that are anagrams of one
another. A word of length n can contain Θ(n^2) distinct factors that
are abelian squares. We study infinite words such that the number of abelian
square factors of length n grows quadratically with n.
Comment: To appear in the proceedings of WORDS 201
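The definition of an abelian square given in this abstract translates directly into a brute-force enumeration of the distinct abelian square factors of a finite word. This is a minimal sketch for small experiments; the function name is an illustrative assumption.

```python
def abelian_square_factors(word: str) -> set:
    """Distinct nonempty factors of `word` of the form uv with v an anagram of u."""
    found = set()
    n = len(word)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            left = word[start:start + half]
            right = word[start + half:start + 2 * half]
            # Two words are anagrams exactly when their sorted letters agree.
            if sorted(left) == sorted(right):
                found.add(left + right)
    return found


if __name__ == "__main__":
    for w in ["abab", "abba", "aabbaabb"]:
        print(w, len(abelian_square_factors(w)))
```

Sorting each half keeps the code short; for longer words, maintaining letter counts incrementally would avoid the repeated sorting.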