We perform a statistical study of the distances between successive
occurrences of a given dinucleotide in the DNA sequence for a number of
organisms of different complexity. Our analysis highlights peculiar features of
the CG dinucleotide distribution in mammalian DNA, pointing towards a
connection with the role of this dinucleotide in DNA methylation. While the CG
distributions of mammals exhibit exponential tails with comparable parameters,
the picture for the other organisms studied (e.g., fish, insects, bacteria and
viruses) is more heterogeneous, possibly because in these organisms DNA
methylation has different functional roles. Our analysis suggests that the
distribution of the distances between CG dinucleotides provides useful insights
for characterizing and classifying organisms in terms of methylation
functionalities.

Comment: 13 pages, 5 figures. To be published in the Philosophical
Transactions A theme issue "DNA as information
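
As a rough illustration of the quantity studied here, the following minimal sketch collects the distances between successive occurrences of a dinucleotide in a sequence string and turns them into an empirical distribution. The function names `dinucleotide_distances` and `distance_histogram`, the toy sequence, and the convention of measuring distance between start positions are assumptions made for illustration, not the procedure used in the paper.

```python
from collections import Counter


def dinucleotide_distances(sequence, dinucleotide="CG"):
    """Return distances between successive occurrences of a dinucleotide.

    Assumption: the distance is the difference between the start
    positions of two consecutive occurrences.
    """
    seq = sequence.upper()
    target = dinucleotide.upper()
    positions = []
    start = seq.find(target)
    while start != -1:
        positions.append(start)
        # Advance by one so that self-overlapping targets (e.g. "AA")
        # are also counted.
        start = seq.find(target, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_histogram(distances):
    """Return the empirical frequency of each observed distance."""
    total = len(distances)
    if total == 0:
        return {}
    counts = Counter(distances)
    return {d: c / total for d, c in sorted(counts.items())}


if __name__ == "__main__":
    toy_sequence = "ACGTTCGCGAACG"  # hypothetical toy input
    distances = dinucleotide_distances(toy_sequence)
    print(distances)
    print(distance_histogram(distances))
```

On a real genome, the tail of such a histogram could then be inspected on a semi-logarithmic scale to check for exponential decay.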