Complete archaeal genomes were probed for the presence of long
(≥ 25 bp) oligonucleotide repeats (words). We detected the
presence of many words distributed in tandem with narrow ranges of
periodicity (i.e., spacer length between repeats). Similar words were
not identified in genomes of non-archaeal species, namely
Escherichia coli, Bacillus subtilis,
Haemophilus influenzae, Mycoplasma
genitalium and Mycoplasma pneumoniae. BLAST
similarity searches against the GenBank nucleotide sequence database
revealed that these words were specific to individual archaeal species,
indicating that they have a signature character. Sequence analysis and genome
viewing tools showed these repeats to be restricted to non-coding
regions. Thus, archaea appear to possess a non-coding genomic
signature that is absent from the bacterial species examined. The identification of a
species-specific genomic signature would be of great value to archaeal
genome mapping, evolutionary studies and analyses of genome
complexity.
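
The kind of search summarized above (long words that recur with a narrow range of spacer lengths) can be illustrated with a minimal Python sketch. This is not the tool used in the study. The function names, the minimum copy number (min_copies) and the tolerated spread in spacer length (max_spread) are assumptions made for illustration only; the abstract specifies only the 25 bp minimum word length. Scanning fixed 25-mers is a simplification: a longer repeat shows up as several overlapping 25 bp words, each with a correspondingly longer spacer.

```python
from collections import defaultdict


def find_repeated_words(sequence, word_len=25, min_copies=3):
    """Return {word: [start positions]} for words seen at least min_copies times."""
    positions = defaultdict(list)
    for i in range(len(sequence) - word_len + 1):
        positions[sequence[i:i + word_len]].append(i)
    return {w: p for w, p in positions.items() if len(p) >= min_copies}


def spacer_lengths(starts, word_len):
    """Gap (in bp) between the end of one copy and the start of the next."""
    return [b - (a + word_len) for a, b in zip(starts, starts[1:])]


def periodic_words(sequence, word_len=25, min_copies=3, max_spread=5):
    """Keep repeated words whose spacer lengths fall within a narrow range.

    max_spread is a hypothetical tolerance: the largest allowed difference
    between the longest and shortest spacer of a given word.
    """
    result = {}
    repeated = find_repeated_words(sequence, word_len, min_copies)
    for word, starts in repeated.items():
        spacers = spacer_lengths(starts, word_len)
        # Negative spacers mean overlapping copies (e.g. homopolymer runs),
        # which are not tandem repeats separated by a spacer.
        if min(spacers) >= 0 and max(spacers) - min(spacers) <= max_spread:
            result[word] = spacers
    return result
```

Calling periodic_words on a genome sequence held as a string returns each qualifying word together with its list of spacer lengths. The dictionary of all 25-mers is held in memory, which is workable for genomes of a few megabases but is not tuned for speed or memory use.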