Dynamical Systems in the Analysis of Biological Sequences

Abstract

The Chaos Game Representation (CGR) maps a sequence of letters taken from a finite alphabet onto the unit square in $\mathbb{R}^2$. Although it is a popular tool, few mathematical results about it have been proved to date. In this report, we show that the CGR gives rise to a limit measure, assuming only that the input sequence is stationary and ergodic. More precise properties are given in the i.i.d. and Markov cases. A new family of statistical tests to characterize the randomness of the inputs is proposed and analyzed. Finally, some basic properties of the CGR are used to generalize the notion of genomic signature.
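
For readers unfamiliar with the construction, the following is a minimal sketch of the usual CGR iteration for a four-letter DNA alphabet. It is an illustration, not the report's own code. The corner assignment in `CORNERS`, the starting point at the center of the square, and the function name `cgr_points` are all assumptions made here. Each letter moves the current point halfway toward the corner associated with that letter.

```python
# Assumed corner assignment for the DNA alphabet; other conventions exist.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the list of CGR points visited while reading `sequence`.

    The starting point (center of the unit square) is an assumption.
    """
    x, y = start
    points = []
    for letter in sequence:
        cx, cy = CORNERS[letter]
        # Move halfway from the current point toward the letter's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("GATTACA"):
        print(point)
```

Because every step is a contraction of ratio 1/2, the last point encodes the most recent letters of the sequence: sequences sharing a suffix of length k end in the same sub-square of side 2^-k.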
