
Characteristics of oligonucleotide frequencies across genomes: Conservation versus variation, strand symmetry, and evolutionary implications

Abstract

One objective of evolutionary genomics is to reveal the genetic information contained in the primordial genome by searching for primitive traits or relics that remain in modern genomes. In this paper, that information is called the primary genetic information, and the primordial genome is defined as the most primitive nucleic acid genome of Earth’s life. The shorter a sequence is, the less likely it is to be modified during genome evolution. For that reason, some characteristics of very short nucleotide sequences would have a considerable chance of persisting through billions of years of evolution. Consequently, certain genomic features of mononucleotides, dinucleotides, and higher-order oligonucleotides may be conserved across various genomes; some, if not all, of these features would be relics of the primary genetic information. Based on this assumption, we analyzed the frequency patterns of mononucleotides, dinucleotides, and higher-order oligonucleotides in the whole-genome sequences of 458 species (including archaea, bacteria, and eukaryotes). We also studied the phenomenon of strand symmetry in these genomes. The results show that the frequencies of some dinucleotides and higher-order oligonucleotides are indeed conserved across genomes, and that strand symmetry is a ubiquitous and explicit phenomenon that may contribute to frequency conservation. We propose a new hypothesis for the origin of strand symmetry and frequency conservation, as well as for the constitution of early genomes. We conclude that strand symmetry and the pattern of frequency conservation would be original features of the primary genetic information.
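
The two quantities the abstract refers to, oligonucleotide frequencies and strand symmetry, can be illustrated with a minimal Python sketch. The abstract does not describe the authors' actual procedure, so everything here is an assumption made for illustration: the function names `kmer_frequencies` and `strand_symmetry_gap` are hypothetical, windows containing characters other than A, C, G, T are simply skipped, and strand symmetry is scored as the total absolute difference between the frequency of each k-mer and that of its reverse complement on the same strand (0 means perfect symmetry).

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
VALID_BASES = set("ACGT")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequencies of overlapping k-mers on one strand.

    Windows containing anything other than A, C, G, T (for example N)
    are skipped; this is an illustrative choice, not the paper's rule.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= VALID_BASES
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def strand_symmetry_gap(seq: str, k: int) -> float:
    """Sum of |f(w) - f(rc(w))| over unordered pairs {w, rc(w)}.

    Returns 0.0 when every k-mer is exactly as frequent as its
    reverse complement on the same strand.
    """
    freqs = kmer_frequencies(seq, k)
    kmers = set(freqs) | {reverse_complement(m) for m in freqs}
    # Each pair is visited twice (once from each member), hence the / 2.
    return sum(
        abs(freqs.get(m, 0.0) - freqs.get(reverse_complement(m), 0.0))
        for m in kmers
    ) / 2


if __name__ == "__main__":
    toy = "ATGCGCATTAGC"  # toy sequence, not real genome data
    print(kmer_frequencies(toy, 2))
    print(strand_symmetry_gap(toy, 2))
```

On real data, the same functions would be applied to each whole-genome sequence and the resulting frequency profiles compared across species; how the paper itself makes that comparison is not stated in the abstract.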
