    Formal inverses of the generalized Thue-Morse sequences and variations of the Rudin-Shapiro sequence

    A formal inverse of a given automatic sequence (the sequence of coefficients of the composition inverse of its associated formal power series) is also automatic. The comparison of properties of the original sequence and its formal inverse is an interesting problem. Such an analysis has been done before for the Thue-Morse sequence. In this paper, we describe arithmetic properties of formal inverses of the generalized Thue-Morse sequences and formal inverses of two modifications of the Rudin-Shapiro sequence. In each case, we give the recurrence relations and the automaton, then we analyze the lengths of strings of consecutive identical letters as well as the frequencies of letters. We also compare the obtained results with the original sequences. Comment: 20 pages

    Multidimensional Generalized Automatic Sequences and Shape-symmetric Morphic Words

    An infinite word is S-automatic if, for all n >= 0, its (n + 1)st letter is the output of a deterministic automaton fed with the representation of n in the considered numeration system S. In this extended abstract, we consider an analogous definition in a multidimensional setting and present the connection to the shape-symmetric infinite words introduced by Arnaud Maes. More precisely, for d >= 2, we state that a multidimensional infinite word x : N^d \to \Sigma over a finite alphabet \Sigma is S-automatic for some abstract numeration system S built on a regular language containing the empty word if and only if x is the image by a coding of a shape-symmetric infinite word.
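
As a minimal illustration of the one-dimensional definition above (not taken from the paper), the sketch below feeds the representation of n to a two-state automaton and reads off the Thue-Morse sequence. The function names, the dictionary encoding of the automaton, and the choice of the ordinary base-2 numeration system (most significant digit first, with 0 represented by the empty word) are assumptions made for the example.

```python
def automatic_letter(n, base, transitions, outputs, start=0):
    """Feed the base-`base` digits of n (most significant first) to a DFA
    and return the output attached to the state reached."""
    digits = []
    while n > 0:
        digits.append(n % base)
        n //= base
    state = start
    for d in reversed(digits):  # n = 0 yields the empty word: no transitions
        state = transitions[state][d]
    return outputs[state]


# Hypothetical encoding of the Thue-Morse automaton:
# the state records the parity of the number of 1 digits read so far.
TM_TRANSITIONS = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
TM_OUTPUTS = {0: 0, 1: 1}


def thue_morse_prefix(length):
    """Return the first `length` letters of the Thue-Morse sequence."""
    return [automatic_letter(n, 2, TM_TRANSITIONS, TM_OUTPUTS)
            for n in range(length)]


print(thue_morse_prefix(8))  # [0, 1, 1, 0, 1, 0, 0, 1]
```

Swapping in a different transition table, output map, or digit-extraction routine gives other automatic sequences; an abstract numeration system would replace the base-2 digit loop with the representation map of the chosen regular language.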

    Quasicrystals, model sets, and automatic sequences

    We survey mathematical properties of quasicrystals, first from the point of view of harmonic analysis, then from the point of view of morphic and automatic sequences.

    Cognitive scale-free networks as a model for intermittency in human natural language

    We model certain features of human language complexity by means of advanced concepts borrowed from statistical mechanics. Using a time series approach, the diffusion entropy method (DE), we compute the complexity of an Italian corpus of newspapers and magazines. We find that the anomalous scaling index is compatible with a simple dynamical model, a random walk on a complex scale-free network, which is linguistically related to Saussure's paradigms. The model yields the famous Zipf's law in terms of the generalized central limit theorem. Comment: Conference FRACTAL 200

    On Hilberg's Law and Its Links with Guiraud's Law

    Hilberg (1990) supposed that finite-order excess entropy of a random human text is proportional to the square root of the text length. Assuming that Hilberg's hypothesis is true, we derive Guiraud's law, which states that the number of word types in a text is greater than proportional to the square root of the text length. Our derivation is based on a mathematical conjecture in coding theory and on several experiments suggesting that words can be defined approximately as the nonterminals of the shortest context-free grammar for the text. Such an operational definition of words can be applied even to texts deprived of spaces, which do not allow for Mandelbrot's "intermittent silence" explanation of Zipf's and Guiraud's laws. In contrast to Mandelbrot's, our model assumes some probabilistic long-memory effects in human narration and might be capable of explaining Menzerath's law. Comment: To appear in Journal of Quantitative Linguistics

    Can simple models explain Zipf’s law for all exponents?

    H. Simon proposed a simple stochastic process for explaining Zipf’s law for word frequencies. Here we introduce two similar generalizations of Simon’s model that cover the same range of exponents as the standard Simon model. The mathematical approach followed minimizes the amount of mathematical background needed for deriving the exponent, compared to previous approaches to the standard Simon’s model. Reviewing what is known from other simple explanations of Zipf’s law, we conclude there is no single radically simple explanation covering the whole range of variation of the exponent of Zipf’s law in humans. The meaningfulness of Zipf’s law for word frequencies remains an open question. Peer Reviewed. Postprint (published version)
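
The abstract does not spell out the process, so the following is only a sketch of the standard Simon model as it is usually stated, not of the paper's generalizations. It assumes that at each step a brand-new word is appended with probability `alpha`, and that otherwise a token chosen uniformly from the text so far is copied, so a word is reused with probability proportional to its current frequency. The names `simon_model`, `rank_frequency`, `alpha`, and `seed` are hypothetical.

```python
import random
from collections import Counter


def simon_model(steps, alpha, seed=0):
    """Generate `steps` tokens under the assumed standard Simon process.
    Words are represented by integers in order of first appearance."""
    rng = random.Random(seed)
    text = [0]          # the text starts with a single word
    next_word = 1
    for _ in range(steps - 1):
        if rng.random() < alpha:
            # Innovation: introduce a word never seen before.
            text.append(next_word)
            next_word += 1
        else:
            # Reuse: copying a uniformly chosen earlier token picks each
            # word with probability proportional to its frequency.
            text.append(rng.choice(text))
    return text


def rank_frequency(text):
    """Return word frequencies sorted from most to least frequent."""
    return sorted(Counter(text).values(), reverse=True)


freqs = rank_frequency(simon_model(10_000, alpha=0.1))
print(freqs[:10])  # frequencies of the ten most frequent words
```

Plotting `freqs` against rank on log-log axes is the usual way to inspect how closely such a simulation follows a power law.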

    Automated Detection of Usage Errors in non-native English Writing

    In an investigation of the use of a novelty detection algorithm for identifying inappropriate word combinations in a raw English corpus, we employ an unsupervised detection algorithm based on one-class support vector machines (OC-SVMs) and extract sentences containing word sequences whose frequency of appearance is significantly low in native English writing. Combined with n-gram language models and document categorization techniques, the OC-SVM classifier assigns given sentences to two groups: sentences containing errors and those without errors. Accuracies are 79.30% with the bigram model, 86.63% with the trigram model, and 34.34% with the four-gram model.

    On winning shifts of marked uniform substitutions

    The second author introduced with I. Törmä a two-player word-building game [Playing with Subshifts, Fund. Inform. 132 (2014), 131–152]. The game has a predetermined (possibly finite) choice sequence $\alpha_1$, $\alpha_2$, $\ldots$ of integers such that on round $n$ the player $A$ chooses a subset $S_n$ of size $\alpha_n$ of some fixed finite alphabet and the player $B$ picks a letter from the set $S_n$. The outcome is determined by whether the word obtained by concatenating the letters $B$ picked lies in a prescribed target set $X$ (a win for player $A$) or not (a win for player $B$). Typically, we consider $X$ to be a subshift. The winning shift $W(X)$ of a subshift $X$ is defined as the set of choice sequences for which $A$ has a winning strategy when the target set is the language of $X$. The winning shift $W(X)$ mirrors some properties of $X$. For instance, $W(X)$ and $X$ have the same entropy. Virtually nothing is known about the structure of the winning shifts of subshifts common in combinatorics on words. In this paper, we study the winning shifts of subshifts generated by marked uniform substitutions, and show that these winning shifts, viewed as subshifts, also have a substitutive structure. Particularly, we give an explicit description of the winning shift for the generalized Thue-Morse substitutions. It is known that $W(X)$ and $X$ have the same factor complexity. As an example application, we exploit this connection to give a simple derivation of the first difference and factor complexity functions of subshifts generated by marked substitutions. We describe these functions in particular detail for the generalized Thue-Morse substitutions. Comment: Extended version of a paper presented at RuFiDiM I
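
The game described in this abstract can be played out by brute force on a finite target set. The sketch below is an illustration under stated assumptions, not the paper's method: it assumes finite choice sequences with entries between 1 and the alphabet size, takes the six length-3 factors of the Thue-Morse word as the target set, and uses the hypothetical names `a_wins` and `winning_choice_words`.

```python
from itertools import product


def a_wins(choices, language, alphabet, prefix=""):
    """Return True if player A has a winning strategy from `prefix`
    for the remaining subset sizes in `choices`."""
    if not choices:
        return prefix in language
    # Letters after which A can still win the remaining rounds.
    good = [a for a in alphabet
            if a_wins(choices[1:], language, alphabet, prefix + a)]
    # A must offer choices[0] letters, and B may pick any of them,
    # so A needs at least that many good letters.
    return len(good) >= choices[0]


def winning_choice_words(length, language, alphabet):
    """List all choice sequences of the given length that A can win."""
    sizes = range(1, len(alphabet) + 1)
    return [c for c in product(sizes, repeat=length)
            if a_wins(c, language, alphabet)]


# Length-3 factors of the Thue-Morse word (000 and 111 never occur).
TM_FACTORS_3 = {"001", "010", "011", "100", "101", "110"}

print(a_wins((2, 2, 1), TM_FACTORS_3, "01"))  # True
print(a_wins((2, 2, 2), TM_FACTORS_3, "01"))  # False
print(len(winning_choice_words(3, TM_FACTORS_3, "01")))  # 6
```

In this small case the number of winning choice sequences of length 3 equals the number of length-3 factors, in line with the equal factor complexity mentioned in the abstract. The recursion is exponential in the sequence length and is meant only for experimenting with short words.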