Formal inverses of the generalized Thue-Morse sequences and variations of the Rudin-Shapiro sequence
A formal inverse of a given automatic sequence (the sequence of coefficients
of the composition inverse of its associated formal power series) is also
automatic. The comparison of properties of the original sequence and its formal
inverse is an interesting problem. Such an analysis has been done before for
the Thue-Morse sequence. In this paper, we describe arithmetic properties of
formal inverses of the generalized Thue-Morse sequences and formal inverses of
two modifications of the Rudin-Shapiro sequence. In each case, we give the
recurrence relations and the automaton, then we analyze the lengths of strings
of consecutive identical letters as well as the frequencies of letters. We also
compare the obtained results with the original sequences. Comment: 20 pages
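The two statistics named in this abstract (lengths of runs of identical letters, and letter frequencies) can be illustrated on the original Thue-Morse sequence. The following is only a minimal sketch: it assumes the standard definition of t_n as the parity of the binary digit sum of n, it does not compute any formal inverse, and the function names are hypothetical.

```python
from itertools import groupby


def thue_morse(n_terms):
    """First n_terms of the Thue-Morse sequence (assumed definition:
    t_n is the parity of the number of 1s in the binary expansion of n)."""
    return [bin(n).count("1") % 2 for n in range(n_terms)]


def run_lengths(seq):
    """Lengths of maximal blocks of consecutive identical letters."""
    return [len(list(group)) for _, group in groupby(seq)]


def letter_frequencies(seq):
    """Relative frequency of each letter in a finite prefix."""
    return {a: seq.count(a) / len(seq) for a in sorted(set(seq))}


t = thue_morse(1024)
runs = run_lengths(t)
freqs = letter_frequencies(t)

print(t[:16])
print("longest run:", max(runs))
print("frequencies:", freqs)
```

On this prefix the longest run has length 2 and both letters occur with frequency 1/2; the abstract's comparison concerns how such quantities behave for the formal inverses.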
Multidimensional Generalized Automatic Sequences and Shape-symmetric Morphic Words
An infinite word is S-automatic if, for all n>=0, its (n + 1)st letter is the
output of a deterministic automaton fed with the representation of n in the
considered numeration system S. In this extended abstract, we consider an
analogous definition in a multidimensional setting and present the connection
to the shape-symmetric infinite words introduced by Arnaud Maes. More
precisely, for d>=2, we state that a multidimensional infinite word x : N^d \to
\Sigma over a finite alphabet \Sigma is S-automatic for some abstract
numeration system S built on a regular language containing the empty word if
and only if x is the image by a coding of a shape-symmetric infinite word.
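The one-dimensional definition at the start of this abstract can be sketched as follows. This is an illustration under stated assumptions, not the construction from the paper: the numeration system is taken to be ordinary base 2 (the classical special case, in which 0 is represented by the empty word), the automaton shown is the usual two-state one for the Thue-Morse word, and all names are hypothetical.

```python
def to_base(n, base):
    """Most-significant-digit-first representation of n; the empty list for n = 0."""
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    return digits[::-1]


def automatic_letter(n, transitions, output, start=0, base=2):
    """Feed the representation of n to a deterministic automaton with output
    and return the letter attached to the state that is reached."""
    state = start
    for digit in to_base(n, base):
        state = transitions[state][digit]
    return output[state]


# Two-state automaton: reading a 1 flips the state, reading a 0 keeps it.
THUE_MORSE_TRANSITIONS = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
THUE_MORSE_OUTPUT = {0: "a", 1: "b"}

word = "".join(
    automatic_letter(n, THUE_MORSE_TRANSITIONS, THUE_MORSE_OUTPUT)
    for n in range(16)
)
print(word)
```

The multidimensional setting of the abstract would feed the automaton a tuple of representations padded to a common length; that step is not shown here.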
Quasicrystals, model sets, and automatic sequences
We survey mathematical properties of quasicrystals, first from the point of
view of harmonic analysis, then from the point of view of morphic and automatic
sequences.
Cognitive scale-free networks as a model for intermittency in human natural language
We model certain features of human language complexity by means of advanced
concepts borrowed from statistical mechanics. Using a time series approach, the
diffusion entropy method (DE), we compute the complexity of an Italian corpus
of newspapers and magazines. We find that the anomalous scaling index is
compatible with a simple dynamical model, a random walk on a complex scale-free
network, which is linguistically related to Saussure's paradigms. The model
yields the famous Zipf's law in terms of the generalized central limit theorem. Comment: Conference FRACTAL 200
On Hilberg's Law and Its Links with Guiraud's Law
Hilberg (1990) supposed that finite-order excess entropy of a random human
text is proportional to the square root of the text length. Assuming that
Hilberg's hypothesis is true, we derive Guiraud's law, which states that the
number of word types in a text is greater than proportional to the square root
of the text length. Our derivation is based on some mathematical conjecture in
coding theory and on several experiments suggesting that words can be defined
approximately as the nonterminals of the shortest context-free grammar for the
text. Such operational definition of words can be applied even to texts
deprived of spaces, which do not allow for Mandelbrot's ``intermittent
silence'' explanation of Zipf's and Guiraud's laws. In contrast to
Mandelbrot's, our model assumes some probabilistic long-memory effects in human
narration and might be capable of explaining Menzerath's law. Comment: To appear in Journal of Quantitative Linguistics
Can simple models explain Zipf's law for all exponents?
H. Simon proposed a simple stochastic process for explaining Zipf's law for word frequencies. Here we introduce two similar generalizations of Simon's model that cover the same range of exponents as the standard Simon model. The mathematical approach followed minimizes the amount of mathematical background needed for deriving the exponent, compared to previous approaches to the standard Simon's model. Reviewing what is known from other simple explanations of Zipf's law, we conclude there is no single radically simple explanation covering the whole range of variation of the exponent of Zipf's law in humans. The meaningfulness of Zipf's law for word frequencies remains an open question. Peer Reviewed. Postprint (published version)
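For orientation, the standard Simon process mentioned in this abstract can be simulated in a few lines. This is a minimal sketch under the usual textbook formulation (with probability alpha the next token is a brand-new word, otherwise it copies a uniformly chosen earlier token); it is not either of the two generalizations introduced in the paper, and the function name, the seed, and the parameter values are assumptions made for illustration.

```python
import random
from collections import Counter


def simon_model(n_tokens, alpha, seed=0):
    """Generate a token sequence with the standard Simon process.

    With probability alpha a new word type is appended; otherwise a token is
    copied from a uniformly random earlier position, so a word is reused with
    probability proportional to its current frequency.
    """
    rng = random.Random(seed)
    tokens = [0]          # start with a single word type
    next_word = 1
    while len(tokens) < n_tokens:
        if rng.random() < alpha:
            tokens.append(next_word)
            next_word += 1
        else:
            tokens.append(rng.choice(tokens))
    return tokens


tokens = simon_model(20000, alpha=0.1)
counts = Counter(tokens)
ranked = sorted(counts.values(), reverse=True)

print("word types:", len(counts))
print("top frequencies:", ranked[:5])
```

Plotting `ranked` against rank on log-log axes gives the rank-frequency curve whose slope is the exponent discussed in the abstract.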
Automated Detection of Usage Errors in non-native English Writing
In an investigation of the use of a novelty detection algorithm for identifying
inappropriate word combinations in a raw English corpus, we employ an
unsupervised detection algorithm based on one-class support vector machines
(OC-SVMs) and extract sentences containing word sequences whose frequency of
appearance is significantly low in native English writing. Combined with n-gram
language models and document categorization techniques, the OC-SVM classifier
assigns given sentences to two groups: sentences containing errors and
sentences without errors. Accuracies are 79.30% with the bigram model, 86.63%
with the trigram model, and 34.34% with the four-gram model.
On winning shifts of marked uniform substitutions
The second author introduced with I. Törmä a two-player word-building
game [Playing with Subshifts, Fund. Inform. 132 (2014), 131--152]. The game has
a predetermined (possibly finite) choice sequence $\alpha_1$, $\alpha_2$, $\ldots$
of integers such that on round $n$ the player $A$ chooses a subset $S_n$
of size $\alpha_n$ of some fixed finite alphabet and the player $B$ picks
a letter from the set $S_n$. The outcome is determined by whether the word
obtained by concatenating the letters $B$ picked lies in a prescribed target
set $X$ (a win for player $A$) or not (a win for player $B$). Typically, we
consider $X$ to be a subshift. The winning shift $W(X)$ of a subshift $X$ is
defined as the set of choice sequences for which $A$ has a winning strategy
when the target set is the language of $X$. The winning shift $W(X)$ mirrors
some properties of $X$. For instance, $W(X)$ and $X$ have the same entropy.
Virtually nothing is known about the structure of the winning shifts of
subshifts common in combinatorics on words. In this paper, we study the winning
shifts of subshifts generated by marked uniform substitutions, and show that
these winning shifts, viewed as subshifts, also have a substitutive structure.
Particularly, we give an explicit description of the winning shift for the
generalized Thue-Morse substitutions. It is known that $W(X)$ and $X$ have the
same factor complexity. As an example application, we exploit this connection
to give a simple derivation of the first difference and factor complexity
functions of subshifts generated by marked substitutions. We describe these
functions in particular detail for the generalized Thue-Morse substitutions. Comment: Extended version of a paper presented at RuFiDiM I
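The factor complexity and first difference functions named in this abstract can be explored numerically. The sketch below is a brute-force count on a finite prefix, not the derivation via winning shifts described in the paper. It assumes the common definition of the generalized Thue-Morse substitution with parameters b and m, sending a letter a to a, a+1, ..., a+b-1 (mod m); the function names are hypothetical.

```python
def generalized_thue_morse(b, m, iterations):
    """Prefix of the fixed point obtained by iterating the assumed substitution
    a -> a, a+1, ..., a+b-1 (mod m), starting from the letter 0."""
    word = [0]
    for _ in range(iterations):
        word = [(a + j) % m for a in word for j in range(b)]
    return word


def factor_complexity(word, n):
    """Number of distinct length-n factors occurring in the finite word.
    For a prefix of an infinite word this is a lower bound on the true value."""
    return len({tuple(word[i:i + n]) for i in range(len(word) - n + 1)})


# Classical Thue-Morse case: b = 2, m = 2, prefix of length 2**12.
w = generalized_thue_morse(b=2, m=2, iterations=12)
p = [factor_complexity(w, n) for n in range(1, 7)]
first_difference = [p[i + 1] - p[i] for i in range(len(p) - 1)]

print("p(1..6):", p)
print("first difference:", first_difference)
```

For b = 2 and m = 2 this prints the complexity values 2, 4, 6, 10, 12, 16; other parameter pairs can be tried by changing the arguments, keeping in mind that longer factors need a longer prefix before the count stabilizes.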