Dominance of Deleterious Alleles Controls the Response to a Population Bottleneck
<div><p>Population bottlenecks followed by re-expansions have been common throughout the history of many populations. The response of alleles under selection to such demographic perturbations has been a subject of great interest in population genetics. On the basis of theoretical analysis and computer simulations, we suggest that this response qualitatively depends on dominance. The number of dominant or additive deleterious alleles per haploid genome is expected to be slightly increased following the bottleneck and re-expansion. In contrast, the number of completely or partially recessive alleles should be sharply reduced. Changes of population size expose differences between recessive and additive selection, potentially providing insight into the prevalence of dominance in natural populations. Specifically, we use a simple statistic, <i>B</i><sub><i>R</i></sub> ≡ ∑<i>x</i><sub><i>i</i></sub><sup><i>pop</i>1</sup>/∑<i>x</i><sub><i>j</i></sub><sup><i>pop</i>2</sup>, where <i>x</i><sub><i>i</i></sub> represents the derived allele frequency, to compare the number of mutations in different populations, and detail its functional dependence on the strength of selection and the intensity of the population bottleneck. We also provide empirical evidence showing that gene sets associated with autosomal recessive disease in humans may have a <i>B</i><sub><i>R</i></sub> indicative of recessive selection. Together, these theoretical predictions and empirical observations show that complex demographic history may facilitate rather than impede inference of parameters of natural selection.</p></div>
Comparisons of analytic and simulation results.
<p>Maximum response values of the burden ratio <i>B</i><sub><i>R</i></sub>(<i>t</i><sub><i>min</i></sub>) are plotted for recessive selection as a function of bottleneck intensity. A wide range of parameter sets is plotted with all combinations of 2<i>N</i><sub><i>B</i></sub> = {2000,1000,400,200,100}, <i>s</i> = {0.1,0.02,0.01,0.001}, <i>T</i><sub><i>B</i></sub> = {200,100,50,20,10}, each simulated for 10<sup>8</sup> nucleotide sites. For relatively low intensity bottlenecks we note excellent agreement over the parameter ranges plotted. Intensities with <i>I</i><sub><i>B</i></sub> = <i>T</i><sub><i>B</i></sub>/2<i>N</i><sub><i>B</i></sub> > 0.1 are excluded, as the single-generation bottleneck scaling breaks down in favor of a long bottleneck scaling. As expected, the approximation weakens for simulations that represent longer bottlenecks, and only for strong selection coefficients. This quantifies the limitations of the single-generation bottleneck approximation, as we observe substantial deviation only around <i>I</i><sub><i>B</i></sub> = 0.1 and with selection strength <i>s</i> = 0.1.</p>
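The parameter grid and the intensity cutoff described in this caption can be sketched in a few lines. This is only an illustration of the stated definition I_B = T_B/2N_B and of the exclusion rule I_B > 0.1. The function name `retained_parameter_sets` is hypothetical, and nothing here reproduces the authors' simulation code.

```python
from itertools import product

# Parameter values quoted in the caption.
TWO_N_B = [2000, 1000, 400, 200, 100]  # bottleneck size in chromosomes (2N_B)
S_VALUES = [0.1, 0.02, 0.01, 0.001]    # selection coefficients s
T_B = [200, 100, 50, 20, 10]           # bottleneck duration in generations


def bottleneck_intensity(t_b, two_n_b):
    """Bottleneck intensity I_B = T_B / 2N_B, as defined in the caption."""
    return t_b / two_n_b


def retained_parameter_sets():
    """Return (2N_B, s, T_B) tuples with I_B <= 0.1.

    The test 10 * T_B <= 2N_B is the integer form of I_B <= 0.1, so the
    boundary case I_B = 0.1 is kept without floating-point ambiguity.
    """
    return [
        (two_n_b, s, t_b)
        for two_n_b, s, t_b in product(TWO_N_B, S_VALUES, T_B)
        if 10 * t_b <= two_n_b
    ]


kept = retained_parameter_sets()
total = len(TWO_N_B) * len(S_VALUES) * len(T_B)
print(f"{len(kept)} of {total} parameter combinations have I_B <= 0.1")
```

Under this reading of the cutoff, combinations sitting exactly at I_B = 0.1 are retained, which matches the caption's remark that deviations appear around that value.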
Empirical results for autosomal recessive disease gene sets.
<p>Empirical results for autosomal recessive disease gene sets.</p>
Time dependence of the <i>B</i><sub><i>R</i></sub> statistic after re-expansion.
<p>The time dependence of <i>B</i><sub><i>R</i></sub>(<i>t</i>) after a population bottleneck is shown for alleles under recessive selection (<i>h</i> = 0) for various selection strengths. Peak <i>B</i><sub><i>R</i></sub> values vary in both magnitude and time as a function of <i>s</i>. The founded population was simulated with 2<i>N</i><sub>0</sub> = 20000, 2<i>N</i><sub><i>B</i></sub> = 2000, and <i>T</i><sub><i>B</i></sub> = 200 and plotted for 5000 generations after re-expansion.</p>
The <i>B</i><sub><i>R</i></sub> statistic at the time of observation.
<p><b>ABOVE:</b> At the time of observation <i>t</i><sub><i>obs</i></sub>, the value of <i>B</i><sub><i>R</i></sub>(<i>t</i><sub><i>obs</i></sub>) is plotted as a function of the average strength of selection <i>s</i> and dominance coefficient <i>h</i>. Dominance coefficients appear as solid lines with fully recessive selection (<i>h</i> = 0) at the top and purely additive selection (<i>h</i> = 1/2) at the bottom. For strong selection <i>B</i><sub><i>R</i></sub> → 1 due to the rapid transient response. For weak selection <i>B</i><sub><i>R</i></sub> → 1 due to the nearly neutral insensitivity to the bottleneck. For some intermediate dominance coefficient <i>h</i><sub><i>c</i></sub>, a critical value occurs (<i>h</i><sub><i>c</i></sub> ∼ 0.25 in the example shown, but explored more generally in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005436#pgen.1005436.s001" target="_blank">S1 Text</a>) where additive and recessive effects cancel, yielding <i>B</i><sub><i>R</i></sub>(<i>h</i><sub><i>c</i></sub>) ∼ 1. A low intensity bottleneck (<i>I</i><sub><i>B</i></sub> = 0.05) is shown, with parameters 2<i>N</i><sub>0</sub> = 20000, 2<i>N</i><sub><i>B</i></sub> = 2000, <i>T</i><sub><i>B</i></sub> = 100, and <i>t</i><sub><i>obs</i></sub> = 1000. <b>BELOW:</b> The same range of parameters is plotted for a realistic demographic model of the Out of Africa event comparing Africans and Europeans [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1005436#pgen.1005436.ref048" target="_blank">48</a>], where <i>B</i><sub><i>R</i></sub> = 〈<i>x</i>〉<sub><i>African</i></sub>/〈<i>x</i>〉<sub><i>European</i></sub>. 
The European bottleneck has estimated intensity <i>I</i><sub><i>B</i></sub> ∼ 0.5, an order of magnitude stronger than the simple bottleneck above, allowing for potentially observable deviations from <i>B</i><sub><i>R</i></sub> ∼ 1 if a large fraction of analyzed variants act recessively with <i>h</i> < <i>h</i><sub><i>c</i></sub> ∼ 0.25.</p>
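The ratio B_R = 〈x〉_African/〈x〉_European given in this caption can be illustrated with a short sketch. It assumes that both populations are scored at the same set of sites, so that sites absent from one population contribute a frequency of zero there. The function name `burden_ratio` and the toy frequencies are hypothetical and are not data from the study.

```python
def burden_ratio(freqs_equilibrium, freqs_founded):
    """Ratio of mean derived allele frequencies between two populations.

    Both sequences must list frequencies for the same sites, in the same
    order (an assumption of this sketch). The equilibrium population is
    placed in the numerator, following <x>_African / <x>_European.
    """
    if len(freqs_equilibrium) != len(freqs_founded):
        raise ValueError("both populations must be scored at the same sites")
    if not freqs_equilibrium:
        raise ValueError("at least one site is required")
    mean_equilibrium = sum(freqs_equilibrium) / len(freqs_equilibrium)
    mean_founded = sum(freqs_founded) / len(freqs_founded)
    if mean_founded == 0:
        raise ZeroDivisionError("founded population carries no derived alleles")
    return mean_equilibrium / mean_founded


# Toy derived allele frequencies at four sites (hypothetical values).
african = [0.10, 0.02, 0.00, 0.05]
european = [0.05, 0.00, 0.01, 0.04]
b_r = burden_ratio(african, european)
print(f"B_R = {b_r:.2f}")
```

Because both means run over the same number of sites, the ratio of means equals the ratio of summed frequencies, so the choice between the two forms does not change the value.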
Response of the <i>B</i><sub><i>R</i></sub> statistic for additive and recessive variation.
<p>A schematic representation of two populations is presented above (<b>A</b>). The two populations derive from a single population prior to the bottleneck event, then split and follow distinct demographic profiles. The equilibrium population maintains a constant size for easy comparison to the founded population. The latter drastically reduces its population size to <i>N</i><sub><i>B</i></sub> for a short time <i>T</i><sub><i>B</i></sub> during the founder event. Our statistical comparison between populations is represented here for cases of purely additive (<b>B</b>) and purely recessive (<b>C</b>) variation. The statistic <i>B</i><sub><i>R</i></sub> > 1 for recessive variation (dominance coefficient <i>h</i> = 0) and <i>B</i><sub><i>R</i></sub> < 1 for additive variation (<i>h</i> = 1/2), providing a simple indicator for the primary mode of selection of polymorphic alleles in the populations.</p>
Cartoon presentation of the NC statistic.
<p>The NC statistic aims to capture the length of the haplotype carrying a variant. For a given variant (called the index variant, shown in the middle of the figure), the value of the NC statistic is the base-10 logarithm of the sum of physical distances measured up-stream (5′ direction) and down-stream (3′ direction) from the index variant to the closest variant that is either beyond a recombination spot (example shown on the left) or is linked to the index variant but is rarer than the index variant (example shown on the right). The red arrow in the figure illustrates the sum of the two distances.</p>
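The definition in this caption can be sketched as follows. The sketch assumes that each neighbouring variant has already been annotated with whether it lies beyond a recombination spot and whether it is linked to the index variant. How those annotations are derived is not described in the caption and is left out. The `Neighbor` record and the function name `nc_statistic` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Neighbor:
    position: int               # physical position in base pairs
    beyond_recombination: bool  # lies beyond a recombination spot
    linked: bool                # linked to the index variant
    allele_count: int           # allele count of this neighbouring variant


def nc_statistic(index_position, index_allele_count, neighbors):
    """log10 of the summed upstream and downstream distances to the closest
    variant that is beyond a recombination spot, or linked but rarer."""

    def terminates(n):
        return n.beyond_recombination or (
            n.linked and n.allele_count < index_allele_count
        )

    # Positions below the index are treated as upstream (an assumption).
    upstream = [index_position - n.position for n in neighbors
                if n.position < index_position and terminates(n)]
    downstream = [n.position - index_position for n in neighbors
                  if n.position > index_position and terminates(n)]
    if not upstream or not downstream:
        raise ValueError("no terminating variant found on one side")
    return math.log10(min(upstream) + min(downstream))


# Toy example: index variant at position 1000 with allele count 3.
neighbors = [
    Neighbor(900, False, True, 5),    # linked but more common: skipped
    Neighbor(700, False, True, 2),    # linked and rarer: upstream boundary
    Neighbor(1200, False, False, 4),  # neither condition holds: skipped
    Neighbor(1700, True, False, 8),   # beyond recombination: downstream boundary
]
nc = nc_statistic(1000, 3, neighbors)
print(f"NC = {nc:.2f}")
```

In this toy case the two distances are 300 bp and 700 bp, so the statistic is the base-10 logarithm of 1000.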
Discrimination of derived missense alleles by the NC statistic.
<p>Missense alleles are sub-classified into categories based on <i>PolyPhen-2</i> predictions. Effect sizes were calculated as standard deviations from the mean of the NC statistic for synonymous variants at the same minor allele count (MAC). Within each MAC class, P-values were calculated by 1-sided Mann-Whitney test. Combined P-values for MAC 2–6 were computed by meta-analysis (Methods).</p>
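One plausible reading of the effect size described here is the difference between the mean NC of a missense category and the mean NC of synonymous variants at the same MAC, expressed in units of the synonymous standard deviation. The sketch below follows that reading. The use of the sample standard deviation, the function name `nc_effect_size`, and the toy values are all assumptions. The Mann-Whitney test and the meta-analysis are not reproduced.

```python
from statistics import mean, stdev


def nc_effect_size(nc_category, nc_synonymous):
    """Shift of a category's mean NC from the synonymous mean, measured in
    synonymous standard deviations (sample SD, an assumption of this sketch).

    Both inputs are expected to hold NC values for variants at the same
    minor allele count.
    """
    if len(nc_synonymous) < 2:
        raise ValueError("need at least two synonymous variants for an SD")
    if not nc_category:
        raise ValueError("need at least one variant in the category")
    return (mean(nc_category) - mean(nc_synonymous)) / stdev(nc_synonymous)


# Toy NC values at a single MAC (hypothetical, not from the study).
synonymous = [3.0, 3.5, 4.0, 4.5, 5.0]
probably_damaging = [4.5, 5.0, 5.5]
effect = nc_effect_size(probably_damaging, synonymous)
print(f"effect size = {effect:.3f} SD")
```

A positive value indicates that the category has larger NC values than the synonymous baseline at that MAC.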
Empirical Cumulative Distribution Function of the NC statistic for alleles at minor allele count 3 in GoNL data.
<p>Synonymous derived variants serve as the baseline distribution. The distribution of NC for probably damaging derived missense variants is notably shifted towards higher values, consistent with their younger age. The NC-statistic distribution for ancestral alleles at minor allele count 3 is strongly shifted towards lower values, consistent with the much older age of those alleles.</p>
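An empirical cumulative distribution function of the kind compared in this figure can be computed with the standard library alone. The sketch below is generic. The toy NC values are hypothetical and only illustrate how a shift towards higher values appears as a lower curve at a fixed NC value.

```python
from bisect import bisect_right


def ecdf(values):
    """Return F(x) = fraction of observations less than or equal to x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one observation is required")

    def cdf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Toy NC values at one minor allele count (hypothetical numbers).
f_synonymous = ecdf([3.0, 3.5, 4.0, 4.5])
f_damaging = ecdf([4.0, 4.5, 5.0, 5.5])

# A distribution shifted towards higher values has a lower ECDF at a given x.
print(f_synonymous(4.0), f_damaging(4.0))
```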
Simulation and theoretical results for allelic age and sojourn times.
<p>a. Example trajectories for a neutral and a deleterious allele, each with a current population frequency of 3% (indicated by the arrow). The shaded areas indicate sojourn times at frequencies above 5%. b. Mean ages for neutral and deleterious alleles at a given population frequency (lines show theoretical predictions, dots show simulation results with standard error bars). Simulation results are averages of alleles in a frequency range, while theoretical predictions are for alleles at a fixed frequency. The graph shows that deleterious alleles at a given frequency are younger than neutral alleles, and that the effect is greater for more strongly selected alleles. c. Mean sojourn times for neutral and deleterious alleles. The vertical line denotes the current population frequency of the variant (3%). Mean sojourn times have been computed in bins of 1%. The line connects theoretical predictions for each frequency bin. Dots show simulation results. The graph illustrates that deleterious alleles spend much less time than neutral alleles at higher population frequencies in the past even if they have the same current frequency.</p>
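The binning of sojourn times into 1% frequency classes can be sketched for a single allele trajectory. The sketch assumes that a trajectory is recorded as one allele count per generation in a population of 2N chromosomes. The bin convention [k%, (k+1)%), the function name `sojourn_times_by_percent`, and the toy trajectory are assumptions, and no simulation of selection or drift is included.

```python
def sojourn_times_by_percent(allele_counts, two_n):
    """Count generations spent in each 1% frequency bin.

    Bin k covers frequencies in [k%, (k+1)%). Generations in which the
    allele is lost or fixed are not counted. Integer arithmetic is used so
    that bin edges such as exactly 3% are assigned without rounding error.
    """
    times = [0] * 100
    for count in allele_counts:
        if 0 < count < two_n:
            times[(count * 100) // two_n] += 1
    return times


# Toy trajectory: allele counts per generation with 2N = 1000 (hypothetical).
trajectory = [1, 5, 12, 30, 30, 25, 60, 30]
times = sojourn_times_by_percent(trajectory, 1000)

# Generations spent at frequencies of 5% or more, as in the shaded areas.
time_above_5_percent = sum(times[5:])
print(times[:7], time_above_5_percent)
```

Averaging such per-trajectory counts over many simulated alleles that end at the same current frequency would give mean sojourn times of the kind plotted in panel c.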