114 research outputs found

    Factor frequencies in generalized Thue-Morse words

    We describe factor frequencies of the generalized Thue-Morse word t_{b,m}, defined for integers b > 1 and m > 0 as the fixed point starting in 0 of the morphism \phi_{b,m} given by \phi_{b,m}(k) = k(k+1)...(k+b-1), where k = 0, 1, ..., m-1 and the letters are expressed modulo m. We use the result of A. Frid, On the frequency of factors in a D0L word, Journal of Automata, Languages and Combinatorics 3 (1998), 29-41, and the study of generalized Thue-Morse words by S. Starosta, Generalized Thue-Morse words and palindromic richness, arXiv:1104.2476v2 [math.CO]. Comment: 11 pages
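
    The morphism in this abstract can be iterated directly. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the frequencies it reports are empirical counts over a finite prefix, which only approximate the factor frequencies of the infinite word.

```python
from collections import Counter


def phi(k, b, m):
    """Image of the letter k under phi_{b,m}: k, k+1, ..., k+b-1 reduced modulo m."""
    return [(k + i) % m for i in range(b)]


def thue_morse_prefix(b, m, length):
    """Prefix of t_{b,m}, obtained by iterating phi_{b,m} from the letter 0."""
    if b < 2 or m < 1:
        raise ValueError("expected b > 1 and m > 0")
    word = [0]
    while len(word) < length:
        # Apply the morphism to every letter and concatenate the images.
        word = [x for k in word for x in phi(k, b, m)]
    return word[:length]


def factor_frequencies(word, n):
    """Empirical frequencies of the length-n factors occurring in a finite word."""
    total = len(word) - n + 1
    counts = Counter(tuple(word[i:i + n]) for i in range(total))
    return {factor: count / total for factor, count in counts.items()}
```

    With b = 2 and m = 2 this reproduces the classical Thue-Morse word, so `thue_morse_prefix(2, 2, 8)` gives `[0, 1, 1, 0, 1, 0, 0, 1]`.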

    Morphic words and equidistributed sequences

    The problem we consider is the following: given an infinite word $w$ on an ordered alphabet, construct the sequence $\nu_w=(\nu[n])_n$, equidistributed on $[0,1]$ and such that $\nu[m]<\nu[n]$ if and only if $\sigma^m(w)<\sigma^n(w)$, where $\sigma$ is the shift operation, erasing the first symbol of $w$. The sequence $\nu_w$ exists and is unique for every word with well-defined positive uniform frequencies of every factor, or, in dynamical terms, for every element of a uniquely ergodic subshift. In this paper we describe the construction of $\nu_w$ for the case when the subshift of $w$ is generated by a morphism of a special kind; then we overcome some technical difficulties to extend the result to all binary morphisms. The sequence $\nu_w$ in this case is also constructed with a morphism. Finally, we introduce a software tool which, given a binary morphism $\varphi$, computes the morphism on extended intervals and the first elements of the equidistributed sequences associated with fixed points of $\varphi$.

    A Note on Symmetries in the Rauzy Graph and Factor Frequencies

    We focus on infinite words with languages closed under reversal. If frequencies of all factors are well defined, we show that the number of different frequencies of factors of length n+1 does not exceed 2C(n+1)-2C(n)+1. Comment: 7 pages
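
    To make the bound concrete, the sketch below evaluates 2C(n+1)-2C(n)+1 on a prefix of the Fibonacci word, whose language is closed under reversal. It assumes that C(n) denotes the factor complexity (the number of distinct factors of length n), which the abstract does not state explicitly. The function names are hypothetical, and counting factors in a finite prefix only matches C(n) when the prefix is long enough to contain every factor of that length.

```python
def fibonacci_prefix(length):
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def factor_complexity(word, n):
    """Number of distinct length-n factors seen in a finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


def frequency_count_bound(word, n):
    """The quantity 2C(n+1) - 2C(n) + 1, with C measured on the given prefix."""
    return 2 * factor_complexity(word, n + 1) - 2 * factor_complexity(word, n) + 1
```

    For the Fibonacci word C(n) = n + 1, so the bound evaluates to 3 for every n.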

    Inferring Different Types of Lindenmayer Systems Using Artificial Intelligence

    Lindenmayer systems (L-systems) are formal grammar systems, each consisting of a set of rewriting rules. Each rewriting rule is comprised of a symbol to replace (predecessor), a replacement string (successor), and an optional condition that is necessary for replacement. Starting with an initial string, every symbol in the string is replaced in parallel in accordance with the conditions on the rewriting rules, to produce a new string. The replacement process iterates as needed to produce a sequence of strings. There are different types of L-systems, which allow for different types of conditions, and methods of selecting the rules to apply. Some symbols of the alphabet can be interpreted as instructions for simulation software towards process modelling, where each string describes another step of the simulated process. Typically, creating an L-system for a specific process is done by experts by making meticulous measurements and using a priori knowledge about the process. It would be desirable to have a method to automatically learn the L-systems (the simulation program) from data, such as from a temporal sequence of images. This thesis presents a suite of tools, collectively called the Plant Model Inference Tools or PMIT (despite the name, the tools are domain agnostic), for inferring different types of L-systems using only a sequence of strings describing the process over some initial time period. Variants of PMIT are created for deterministic context-free L-systems, stochastic L-systems, and parametric L-systems. They are each evaluated using existing known deterministic and parametric L-systems from the literature, and procedurally generated stochastic L-systems. Accuracy can be assessed in various ways, such as checking whether the inferred L-system is equal to the original one. PMIT is able to correctly infer deterministic L-systems with up to 31 symbols in the alphabet compared to the previous state-of-the-art algorithm's limit of 2 symbols.
Stochastic L-systems allow symbols in the alphabet to have multiple rewriting rules each with an associated probability of being selected. Evaluating stochastic L-system inference with 960 procedurally generated L-systems with multiple sequences of strings as input found the following: 1) when 3 input sequences are used, the inferred successors always matched the original successors for systems with up to 9 rewriting rules, 2) when 6 sequences of strings are used, the difference between the associated probabilities of the inferred and the original L-system is approximately 1%. Parametric L-systems allow symbols to have multiple rewriting rules with parameters that get passed during rewriting. Rule selection is based on an associated Boolean condition over the parameters that gets evaluated to choose the rule to be applied. Inference is done in two steps. In the first step, the successors are inferred, and in the second step, appropriate Boolean conditions are found. Parametric L-system inference was evaluated on 20 known parametric L-systems. For 18 of the 20 L-systems where all successors were non-empty, the successors were correctly identified, but the time taken was up to 26 days on a single-core CPU for the largest L-system. The second step, inferring the Boolean conditions, was successful for all 20 systems in the test set. No previous algorithm from the literature had implemented stochastic or parametric L-system inference. Inferring L-systems of greater complexity algorithmically can save considerable time and effort versus constructing them manually; however, perhaps more importantly, rather than relying on existing knowledge, inferring a simulation of a process from data can help reveal the underlying scientific principles of the process.
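
    The parallel rewriting process described in this abstract can be sketched for the simplest case, a deterministic context-free L-system. This is an illustration only and is not part of PMIT: the function name is hypothetical, and leaving symbols without a rule unchanged is an assumed convention. A sequence of strings like the one it returns is the kind of input from which the thesis infers an L-system.

```python
def rewrite(axiom, rules, steps):
    """Return the strings produced by `steps` rounds of parallel rewriting.

    `rules` maps each predecessor symbol to its successor string. Symbols
    without a rule are copied unchanged (an assumed convention).
    """
    strings = [axiom]
    for _ in range(steps):
        current = strings[-1]
        # Every symbol is replaced at once, using the same generation as input.
        strings.append("".join(rules.get(symbol, symbol) for symbol in current))
    return strings
```

    For example, `rewrite("A", {"A": "AB", "B": "A"}, 3)` yields `["A", "AB", "ABA", "ABAAB"]`.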

    Ten Conferences WORDS: Open Problems and Conjectures

    In connection with the development of the field of Combinatorics on Words, we present a list of open problems and conjectures that were stated during the last ten WORDS meetings. We wish to continually update the present document by adding information concerning advances in problem solving.

    Conferences WORDS, years 1997-2017: Open Problems and Conjectures

    In connection with the development of the field of Combinatorics on Words, we present a list of open problems and conjectures which were stated in the context of the eleven international meetings WORDS, which were held from 1997 to 2017.

    Fixed points avoiding Abelian $k$-powers

    We show that the problem of whether the fixed point of a morphism avoids Abelian $k$-powers is decidable under rather general conditions
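
    For reference, an Abelian $k$-power is a concatenation of $k$ consecutive blocks that are permutations of one another. The sketch below is a brute-force test on a finite word, with a hypothetical function name. It is not the decision procedure of the paper, which concerns the infinite fixed point; checking prefixes can only exhibit an Abelian $k$-power, never prove that none exists.

```python
from collections import Counter


def has_abelian_power(word, k):
    """True if the finite word contains k consecutive blocks of equal length
    that are permutations of each other."""
    if k < 2:
        raise ValueError("expected k >= 2")
    n = len(word)
    for block in range(1, n // k + 1):
        for start in range(n - k * block + 1):
            # Letter counts (Parikh vector) of the first block.
            first = Counter(word[start:start + block])
            if all(
                Counter(word[start + j * block:start + (j + 1) * block]) == first
                for j in range(1, k)
            ):
                return True
    return False
```

    For instance, `has_abelian_power("abba", 2)` is true because `ab` and `ba` have the same letter counts, while `has_abelian_power("abcab", 2)` is false.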

    Factor Complexity of S-adic sequences generated by the Arnoux-Rauzy-Poincaré Algorithm

    The Arnoux-Rauzy-Poincaré multidimensional continued fraction algorithm is obtained by combining the Arnoux-Rauzy and Poincaré algorithms. It is a generalized Euclidean algorithm. Its three-dimensional linear version consists in subtracting the sum of the two smallest entries from the largest if possible (the Arnoux-Rauzy step), and otherwise in subtracting the smallest entry from the median and the median from the largest (the Poincaré step), performing Arnoux-Rauzy steps in priority whenever possible. After renormalization it provides a piecewise fractional map of the standard $2$-simplex. We study here the factor complexity of its associated symbolic dynamical system, defined as an $S$-adic system. It is made of infinite words generated by the composition of sequences of finitely many substitutions, together with some restrictions concerning the allowed sequences of substitutions expressed in terms of a regular language. Here, the substitutions are provided by the matrices of the linear version of the algorithm. We give an upper bound for the linear growth of the factor complexity. We then deduce the convergence of the associated algorithm by unique ergodicity. Comment: 36 pages, 16 figures
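
    One step of the three-dimensional linear version, as described in this abstract, might be sketched as follows. The function name is hypothetical. Reading "if possible" as "the largest entry strictly exceeds the sum of the other two" is an assumption about the boundary case, as is the tie-breaking between equal entries. The renormalization onto the simplex is omitted.

```python
def arp_step(v):
    """Apply one Arnoux-Rauzy-Poincaré step to a triple of non-negative numbers.

    Returns the new triple and the kind of step taken ("AR" or "P").
    """
    # Indices of the smallest, median and largest entries (ties broken by position).
    smallest, median, largest = sorted(range(3), key=lambda i: v[i])
    out = list(v)
    if v[largest] > v[smallest] + v[median]:
        # Arnoux-Rauzy step: subtract the two smallest entries from the largest.
        out[largest] = v[largest] - v[smallest] - v[median]
        return tuple(out), "AR"
    # Poincaré step: smallest from median, and (original) median from largest.
    out[median] = v[median] - v[smallest]
    out[largest] = v[largest] - v[median]
    return tuple(out), "P"
```

    For example, `arp_step((1, 2, 5))` takes an Arnoux-Rauzy step and returns `((1, 2, 2), "AR")`, whereas `arp_step((3, 4, 5))` takes a Poincaré step and returns `((3, 1, 1), "P")`.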

    Critical Exponents and Stabilizers of Infinite Words

    This thesis concerns infinite words over finite alphabets. It contributes to two topics in this area: critical exponents and stabilizers. Let w be a right-infinite word defined over a finite alphabet. The critical exponent of w is the supremum of the set of exponents r such that w contains an r-power as a subword. Most of the thesis (Chapters 3 through 7) is devoted to critical exponents. Chapter 3 is a survey of previous research on critical exponents and repetitions in morphic words. In Chapter 4 we prove that every real number greater than 1 is the critical exponent of some right-infinite word over some finite alphabet. Our proof is constructive. In Chapter 5 we characterize critical exponents of pure morphic words generated by uniform binary morphisms. We also give an explicit formula to compute these critical exponents, based on a well-defined prefix of the infinite word. In Chapter 6 we generalize our results to pure morphic words generated by non-erasing morphisms over any finite alphabet. We prove that critical exponents of such words are algebraic, of a degree bounded by the alphabet size. Under certain conditions, our proof implies an algorithm for computing the critical exponent. We demonstrate our method by computing the critical exponent of some families of infinite words. In particular, in Chapter 7 we compute the critical exponent of the Arshon word of order n for n ≥ 3. The stabilizer of an infinite word w defined over a finite alphabet Σ is the set of morphisms f: Σ*→Σ* that fix w. In Chapter 8 we study various problems related to stabilizers and their generators. We show that over a binary alphabet, there exist stabilizers with at least n generators for all n. Over a ternary alphabet, the monoid of morphisms generating a given infinite word by iteration can be infinitely generated, even when the word is generated by iterating an invertible primitive morphism. 
Stabilizers of strict epistandard words are cyclic when non-trivial, while stabilizers of ultimately strict epistandard words are always non-trivial. For this latter family of words, we give a characterization of stabilizer elements. We conclude with a list of open problems, including a new problem that has not been addressed yet: the D0L repetition threshold.
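
    The definition of the critical exponent can be illustrated on a finite word. The sketch below uses a hypothetical function name and is not one of the thesis's algorithms. It takes a factor of length n with period p to be an (n/p)-power and returns the largest such exponent exactly, as a fraction. Applied to a prefix of an infinite word, this gives only a lower bound on the critical exponent, which is a supremum over all factors.

```python
from fractions import Fraction


def max_exponent(word):
    """Largest exponent |u| / p over all factors u of a finite word,
    where p ranges over the periods of u."""
    n = len(word)
    # A non-empty word is a 1-power of itself.
    best = Fraction(1) if n else Fraction(0)
    for p in range(1, n):
        run = 0  # consecutive positions j ending here with word[j] == word[j + p]
        for j in range(n - p):
            if word[j] == word[j + p]:
                run += 1
                # The factor covering this run has length run + p and period p.
                best = max(best, Fraction(run + p, p))
            else:
                run = 0
    return best
```

    For example, `max_exponent("ababa")` is `Fraction(5, 2)`, and on the Thue-Morse prefix `"01101001"` the result is 2, consistent with that word containing squares but no overlaps.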