24 research outputs found
Digital search trees and chaos game representation
In this paper, we consider a possible representation of a DNA sequence in a
quaternary tree, in which one can visualize repetitions of subwords. The
CGR-tree turns a sequence of letters into a digital search tree (DST), obtained
from the suffixes of the reversed sequence. Several results are known
concerning the height and the insertion depth for DSTs built from i.i.d.
successive sequences. Here, the successively inserted words are strongly
dependent. We give the asymptotic behaviour of the insertion depth and of the
length of branches for the CGR-tree obtained from the suffixes of a reversed
i.i.d. or Markovian sequence. To first order, this behaviour turns out to be
the same as in the case of independent words. As a by-product, asymptotic
results on the length of the longest runs in a Markovian sequence are obtained.
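The construction described in this abstract can be sketched in a few lines. The following is only an illustration under stated assumptions, not the authors' implementation: the function name `cgr_tree_depths` is hypothetical, nodes are plain dictionaries, and the suffixes of the reversed sequence are assumed to be inserted in order of increasing length (that is, as the reversed prefixes of the original sequence).

```python
def cgr_tree_depths(sequence):
    """Insert the reversed prefixes of `sequence` into a digital search tree.

    Returns the insertion depth of each word (the root has depth 0).
    Assumption: words are inserted in order of increasing length.
    """
    root = {}  # each node is a dict mapping a letter to a child node
    depths = []
    for k in range(1, len(sequence) + 1):
        # k-th reversed prefix = suffix of length k of the reversed sequence
        word = sequence[:k][::-1]
        node = root
        depth = 0
        for letter in word:
            depth += 1
            if letter not in node:
                # first free slot along the path spelled by the word
                node[letter] = {}
                break
            node = node[letter]
        # Before step k the tree holds k - 1 nodes, so its depth is at most
        # k - 1 and a word of length k always reaches a free slot.
        depths.append(depth)
    return depths


if __name__ == "__main__":
    print(cgr_tree_depths("AAAA"))  # a run of one letter yields a single long branch
    print(cgr_tree_depths("ACGT"))  # distinct letters all land at depth 1
```

In this sketch a run of a single letter produces one branch whose length grows with the run, which gives some intuition for why branch lengths and longest runs are related.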
Variable length Markov chains and dynamical sources
Infinite random sequences of letters can be viewed as stochastic chains or as
strings produced by a source, in the sense of information theory. The
relationship between Variable Length Markov Chains (VLMC) and probabilistic
dynamical sources is studied. We establish a probabilistic frame for context
trees and VLMC and we prove that any VLMC is a dynamical source for which we
explicitly build the mapping. On two examples, the "comb" and the "bamboo
blossom", we find a necessary and sufficient condition for the existence and
the uniqueness of a stationary probability measure for the VLMC. These two
examples are detailed in order to provide the associated Dirichlet series as
well as the generating functions of word occurrences. Comment: 45 pages, 15 figures
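To make the notion of a VLMC concrete, here is a minimal simulation sketch. It is not taken from the paper: the names `context_of`, `simulate_vlmc` and `EXAMPLE_CONTEXTS` are hypothetical, and the three-context tree below is an invented finite example, much simpler than the infinite "comb" or "bamboo blossom". Contexts are written from the most recent letter backwards, and each context carries its own transition distribution.

```python
import random

# Hypothetical finite context tree on the alphabet {0, 1}.
# A context is read from the most recent letter backwards in time.
EXAMPLE_CONTEXTS = {
    "1": {"0": 0.5, "1": 0.5},   # last letter was 1
    "01": {"0": 0.2, "1": 0.8},  # last letter 0, preceded by 1
    "00": {"0": 0.9, "1": 0.1},  # last two letters were 0, 0
}


def context_of(past, contexts):
    """Return the context that is a prefix of the reversed past."""
    max_len = max(len(c) for c in contexts)
    recent = past[-max_len:][::-1]  # most recent letter first
    for c in contexts:
        if recent.startswith(c):
            return c
    raise ValueError("past too short, or context set not complete")


def simulate_vlmc(contexts, initial, n, seed=0):
    """Append n letters to `initial`, each drawn from the law of its context."""
    rng = random.Random(seed)
    seq = initial
    for _ in range(n):
        law = contexts[context_of(seq, contexts)]
        letters = list(law.keys())
        weights = list(law.values())
        seq += rng.choices(letters, weights=weights)[0]
    return seq


if __name__ == "__main__":
    print(simulate_vlmc(EXAMPLE_CONTEXTS, "00", 30))
```

The amount of past needed to pick the next letter varies with the past itself (one letter after a 1, two letters after a 0), which is the "variable length" feature; the initial word must be long enough to determine a context.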
Digital search trees and chaos game representation
Preliminary version (2006) of a work published in final form (2009). In this paper, we consider a possible representation of a DNA sequence in a quaternary tree, in which one can visualize repetitions of subwords. The CGR-tree turns a sequence of letters into a digital search tree (DST), obtained from the suffixes of the reversed sequence. Several results are known concerning the height and the insertion depth for DSTs built from i.i.d. successive sequences. Here, the successively inserted words are strongly dependent. We give the asymptotic behaviour of the insertion depth and of the length of branches for the CGR-tree obtained from the suffixes of a reversed i.i.d. or Markovian sequence. To first order, this behaviour turns out to be the same as in the case of independent words. As a by-product, asymptotic results on the length of the longest runs in a Markovian sequence are obtained.
Characterization of stationary probability measures for Variable Length Markov Chains
By introducing a key combinatorial structure for words produced by a Variable Length Markov Chain (VLMC), the longest internal suffix, precise characterizations of existence and uniqueness of a stationary probability measure for a VLMC are given. These characterizations turn into necessary and sufficient conditions for VLMCs associated with a subclass of probabilised context trees: the shift-stable context trees. As a by-product, we prove that a VLMC whose stabilized context tree is again a context tree has at most one stationary probability measure. MSC 2010: 60J05, 60C05, 60G10
Characterization of stationary probability measures for Variable Length Markov Chains
32 pages. By introducing a key combinatorial structure for words produced by a Variable Length Markov Chain (VLMC), the longest internal suffix, precise characterizations of existence and uniqueness of a stationary probability measure for a VLMC are given. These characterizations turn into necessary and sufficient conditions for VLMCs associated with a subclass of probabilised context trees: the shift-stable context trees. As a by-product, we prove that a VLMC whose stabilized context tree is again a context tree has at most one stationary probability measure.
Context trees, variable length Markov chains and dynamical sources.
Infinite random sequences of letters can be viewed as stochastic chains or as strings produced by a source, in the sense of information theory. The relationship between Variable Length Markov Chains (VLMC) and probabilistic dynamical sources is studied. We establish a probabilistic frame for context trees and VLMC and we prove that any VLMC is a dynamical source for which we explicitly build the mapping. On two examples, the "comb" and the "bamboo blossom", we find a necessary and sufficient condition for the existence and the uniqueness of a stationary probability measure for the VLMC. These two examples are detailed in order to provide the associated Dirichlet series as well as the generating functions of word occurrences.
Uncommon Suffix Tries
Common assumptions on the source producing the words inserted in a suffix trie with n leaves lead to a log n height and saturation level. We provide an example of a suffix trie whose height increases faster than a power of n and another one whose saturation level is negligible with respect to log n. Both are built from VLMC (Variable Length Markov Chain) probabilistic sources; they are easily extended to families of sources having the same properties. The first example corresponds to a "logarithmic infinite comb" and enjoys a non-uniform polynomial mixing. The second one corresponds to a "factorial infinite comb" for which mixing is uniform and exponential.
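The two quantities contrasted in this abstract can be explored on finite strings with simple proxies. The sketch below rests on assumptions and does not reproduce the paper's definitions: `longest_repeat` (length of the longest factor occurring at least twice) stands in for the height, `saturation_level` (largest k such that every word of length k over the alphabet occurs) stands in for the saturation level, and all function names are hypothetical.

```python
def factor_counts(text, k):
    """Count the occurrences of every factor (substring) of length k."""
    counts = {}
    for i in range(len(text) - k + 1):
        w = text[i:i + k]
        counts[w] = counts.get(w, 0) + 1
    return counts


def longest_repeat(text):
    """Length of the longest factor occurring at least twice (overlaps allowed)."""
    best = 0
    for k in range(1, len(text)):
        if any(c >= 2 for c in factor_counts(text, k).values()):
            best = k
        else:
            break  # no repeat of length k implies none of length k + 1
    return best


def saturation_level(text, alphabet):
    """Largest k such that every word of length k over `alphabet` occurs.

    Assumes `text` only uses letters from `alphabet`.
    """
    k = 0
    # A text of length m has m - k factors of length k + 1.
    while len(alphabet) ** (k + 1) <= len(text) - k:
        if len(factor_counts(text, k + 1)) < len(alphabet) ** (k + 1):
            break
        k += 1
    return k


if __name__ == "__main__":
    for text in ("00110", "0000"):
        print(text, longest_repeat(text), saturation_level(text, "01"))
```

On these toy inputs a long run of one letter inflates the repeat length while leaving the saturation proxy at zero, which hints at how a source can pull the two quantities apart.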