This study presents the first global, 1 Mbp level analysis of patterns of
nucleotide substitutions along the human lineage. The study is based on the
analysis of a large number of repetitive elements deposited into the human
genome since the mammalian radiation, yielding a number of results that would
have been difficult to obtain using the more conventional comparative method of
analysis. This analysis revealed substantial and consistent variability in
substitution rates, which differ by up to 2-fold among regions. The rates of
substitution of C or G nucleotides by A or T
nucleotides vary much more sharply than the reverse rates, suggesting that much
of that variation is due to differences in mutation rates rather than in the
probabilities of fixation of C/G vs. A/T nucleotides across the genome. For all
types of substitution we observe substantially more hotspots than coldspots,
with hotspots showing substantial clustering over tens of Mbp. Our analysis
revealed that the GC content of surrounding sequences is the best predictor of the
rates of substitution. The pattern of substitution appears very different near
telomeres compared to the rest of the genome and cannot be explained by the
genome-wide correlations of the substitution rates with GC content or exon
density. The telomere pattern of substitution is consistent with natural
selection or biased gene conversion acting to increase the GC content of
sequences within 10-15 Mbp of the telomere.

Comment: 35 pages, 6 figures