Nucleosomes Shape DNA Polymorphism and Divergence
<div><p>An estimated 80% of genomic DNA in eukaryotes is packaged as nucleosomes, which, together with the remaining interstitial linker regions, generate higher order chromatin structures <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1004457#pgen.1004457-Lee1" target="_blank">[1]</a>. Nucleosome sequences isolated from diverse organisms exhibit ∼10 bp periodic variations in AA, TT and GC dinucleotide frequencies. These sequence elements generate intrinsically curved DNA and help establish the histone-DNA interface. We investigated an important unanswered question concerning the interplay between chromatin organization and genome evolution: do the DNA sequence preferences inherent to the highly conserved histone core exert detectable natural selection on genomic divergence and polymorphism? To address this hypothesis, we isolated nucleosomal DNA sequences from <i>Drosophila melanogaster</i> embryos and examined the underlying genomic variation within and between species. We found that divergence along the <i>D. melanogaster</i> lineage is periodic across nucleosome regions with base changes following preferred nucleotides, providing new evidence for systematic evolutionary forces in the generation and maintenance of nucleosome-associated dinucleotide periodicities. Further, Single Nucleotide Polymorphism (SNP) frequency spectra show striking periodicities across nucleosomal regions, paralleling divergence patterns. Preferred alleles occur at higher frequencies in natural populations, consistent with a central role for natural selection. These patterns are stronger for nucleosomes in introns than in intergenic regions, suggesting selection is stronger in transcribed regions where nucleosomes undergo more displacement, remodeling and functional modification. 
In addition, we observe a large-scale (∼180 bp) periodic enrichment of AA/TT dinucleotides associated with nucleosome occupancy, while GC dinucleotide frequency peaks in linker regions. Divergence and polymorphism data also support a role for natural selection in the generation and maintenance of these super-nucleosomal patterns. Our results demonstrate that nucleosome-associated sequence periodicities are under selective pressure, implying that structural interactions between nucleosomes and DNA sequence shape sequence evolution, particularly in introns.</p></div>
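The dinucleotide periodicity described above can be illustrated with a minimal sketch. It assumes a set of nucleosome sequences already aligned to a common reference point (for example the dyad); the function name `dinucleotide_profile` and the toy sequences are hypothetical and are not taken from the study.

```python
def dinucleotide_profile(seqs, dinucs=("AA", "TT")):
    """Fraction of aligned sequences carrying one of `dinucs` at each offset.

    Assumes all sequences share the same alignment start; only the first
    min(len) positions are scored so every sequence contributes everywhere.
    """
    length = min(len(s) for s in seqs)
    profile = []
    for i in range(length - 1):
        # Count sequences whose 2-mer starting at offset i is in the target set.
        hits = sum(s[i:i + 2].upper() in dinucs for s in seqs)
        profile.append(hits / len(seqs))
    return profile


if __name__ == "__main__":
    toy = ["AATTGC", "AAGCTT", "GCAATT"]  # illustrative data only
    print(dinucleotide_profile(toy))
```

With real data, a ∼10 bp periodicity would appear as regularly spaced peaks in the returned profile; detecting it formally (for example with a Fourier transform) is outside this sketch.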
Gene ontology enrichment analysis based on outlier windows for high mean <i>F<sub>ST</sub></i> for African population comparisons.
<p>Listed are GO categories with <i>P</i><0.05 and outlier genes >1. Full results are given in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003080#pgen.1003080.s024" target="_blank">Table S16</a>.</p>
Locations of population samples from which the analyzed genomes were derived.
<p>Each population sample is indicated by a two letter abbreviation followed by the number of primary core genomes sequenced. For populations with secondary core genomes, that number follows a comma. Additional data and sample characteristics are described in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003080#pgen.1003080.s009" target="_blank">Table S1</a>.</p>
Mean sequencing depth.
<p>Mean sequencing depth is correlated with genetic distance (A) and genomic coverage (B). African core genomes with data from all major chromosome arms are depicted. The effect of depth on genetic distance applies whether genomes are compared to the published reference genome (blue) or the Zambia ZI population sample (red). Subsequent analyses focused largely on “primary core” genomes with >25X depth.</p>
The ratio of nucleotide diversity between non-African (France, FR) and African (Rwanda, RG) genomes.
<p>Each window contains 5000 RG non-singleton SNPs. Chromosome arms are labeled and indicated by color. Dashed series for the three arms with segregating inversions in the FR sample reflect diversity ratios for standard chromosomes only, indicating that inversions add significant diversity at the scale of whole chromosome arms.</p>
Gene ontology enrichment analysis based on outlier windows for high <i>F<sub>ST</sub></i> between Rwanda and France population samples.
<p>Listed are GO categories with <i>P</i><0.05 and outlier genes >1. Full results are given in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003080#pgen.1003080.s026" target="_blank">Table S18</a>.</p>
Nucleotide diversity and genetic differentiation are shown, averaged across the non-centromeric, non-telomeric regions of each chromosome arm.
<p>Values above the diagonal represent <i>D<sub>xy</sub></i> (in percent), while those below reflect <i>F<sub>ST</sub></i>. Bold values on the diagonal are <i>π</i> (%). The ratio of each population's genetic distance to the ZI sample versus diversity within the ZI sample is also given (bottom row). Ratios were corrected based on the (minor) predicted effects of sequencing depth for each population (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1003080#s4" target="_blank">Materials and Methods</a>). Ratios significantly greater than one (bootstrapping <i>P</i><0.001) are noted (*). Admixture-filtered data from genomes with less than 15% estimated admixture were analyzed for each population that had two or more such genomes.</p>
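As a rough illustration of the three statistics in this table, the sketch below computes π, D_xy and F_ST from aligned haplotype strings. The caption does not say which F_ST estimator was used; the Hudson-style form (one minus mean within-population diversity over D_xy) is an assumption made here for illustration, and all function names and toy data are hypothetical. Values are returned as proportions, not percentages.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(pop):
    """pi: mean per-site pairwise difference within one population."""
    pairs = list(combinations(pop, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def dxy(pop1, pop2):
    """D_xy: mean per-site pairwise difference between two populations."""
    pairs = list(product(pop1, pop2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst_hudson(pop1, pop2):
    """Hudson-style F_ST (assumed estimator, not confirmed by the source)."""
    within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    return 1 - within / dxy(pop1, pop2)


if __name__ == "__main__":
    pop_a = ["AAAA", "AAAT"]  # illustrative haplotypes only
    pop_b = ["TAAA", "TAAT"]
    print(nucleotide_diversity(pop_a), dxy(pop_a, pop_b), fst_hudson(pop_a, pop_b))
```

The bottom-row ratio in the table corresponds to `dxy(pop, zi) / nucleotide_diversity(zi)` in this notation, before the depth correction described in the caption.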
Cosmopolitan admixture levels are depicted across the genome.
<p>For each genomic window, the number of African primary core genomes (across all populations) with >50% admixture probability is plotted. Chromosome arms are labeled and indicated by color. Each window contains 1000 RG non-singleton SNPs (approximately 50 kb on average).</p>
Allele frequencies for the RG sample (using a sample size of 18) at short intron sites.
<p>(A) The folded frequency spectrum for each chromosome arm. (B) Comparison of the proportion of SNPs with a minor allele count of 1 in regions of lower versus higher recombination.</p>
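A folded frequency spectrum of the kind shown in panel A, and the singleton proportion compared in panel B, can be sketched as follows. The input format (one alternate-allele count per SNP, all sites down-sampled to the same sample size) and the function names are assumptions for illustration; the sample size of 18 comes from the caption.

```python
from collections import Counter


def folded_sfs(alt_counts, n):
    """Counts of SNPs with minor allele count 1..n//2 in a sample of size n.

    Monomorphic sites (count 0 or n) are ignored.
    """
    minor = [min(c, n - c) for c in alt_counts if 0 < c < n]
    tally = Counter(minor)
    return [tally.get(k, 0) for k in range(1, n // 2 + 1)]


def singleton_proportion(alt_counts, n):
    """Share of segregating sites whose minor allele count is exactly 1."""
    sfs = folded_sfs(alt_counts, n)
    total = sum(sfs)
    return sfs[0] / total if total else float("nan")


if __name__ == "__main__":
    counts = [1, 17, 2, 9, 16, 0, 18, 5]  # illustrative data only
    print(folded_sfs(counts, 18))
    print(singleton_proportion(counts, 18))
```

Folding by `min(c, n - c)` avoids the need to know which allele is ancestral, which is why a folded spectrum is reported when polarization is uncertain.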
Linkage disequilibrium (LD), excluding singleton polymorphisms.
<p>Series refer to the observed LD for each major chromosome arm, and the expected LD from neutral equilibrium simulations for X-linked and autosomal loci, as given in panel A. (A) Average <i>r<sup>2</sup></i> for a series of SNP pair distance bins. (B) Average <i>r<sub>ω</sub></i> for SNP pairs with positive LD. (C) Average <i>r<sub>ω</sub></i> for SNP pairs with negative LD.</p>
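For orientation, the standard r² statistic in panel A can be computed for one pair of biallelic SNPs as sketched below. The sketch assumes phased haplotypes coded 0/1 at each site; the function name is hypothetical. The r_ω statistic in panels B and C is not defined in this caption, so it is not implemented here. The sign of D separates positive from negative LD, but it depends on how alleles are labeled, which is an assumption the caller must fix (for example, coding the minor allele as 1).

```python
def ld_stats(site_a, site_b):
    """Return (D, r2) for two biallelic sites across the same haplotypes.

    site_a, site_b: equal-length sequences of 0/1 allele codes.
    """
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom


if __name__ == "__main__":
    print(ld_stats([1, 1, 0, 0], [1, 1, 0, 0]))  # perfectly coupled sites
    print(ld_stats([1, 1, 0, 0], [1, 0, 1, 0]))  # independent sites
```

Averaging r² over SNP pairs grouped by physical distance would give a decay curve of the kind plotted in panel A.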