    Testing the Fit of the Second Eigenvalue

    <p>We generated genotype data in which the leading eigenvalue is overwhelmingly significant (<i>F<sub>ST</sub></i> = 0.01, <i>m</i> = 100, <i>n</i> = 5,000) with two equal-sized subpopulations. We show P–P plots for the TW statistic computed from the <i>second</i> eigenvalue. The fit at the high end is excellent.</p>

    Three East Asian Populations

    <p>Plots of the first two eigenvectors for a population from Thailand and Chinese and Japanese populations from the International Haplotype Map [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020190#pgen-0020190-b032" target="_blank">32</a>]. The Japanese population is clearly distinguished (though not by either eigenvector separately). The large dispersal of the Thai population, along a line where the Chinese are at an extreme, suggests some gene flow of a Chinese-related population into Thailand. Note the similarity to the simulated data of <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020190#pgen-0020190-g008" target="_blank">Figure 8</a>.</p>

    LD Correction with Strong LD

    <div><p>(A) Shows P–P plots of the TW statistic (<i>m</i> = 100, <i>n</i> = 5,000) with large blocks of complete LD. Uncorrected, the TW statistic is hopelessly poor, but after correction the fit is again good. Here, we show 1,000 runs with the same data size parameters as in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020190#pgen-0020190-g002" target="_blank">Figure 2</a>A, <i>m</i> = 500, <i>n</i> = 5,000, varying <i>k</i>, the number of columns used to “correct” for LD. The fit is adequate for any nonzero value of <i>k</i>.</p><p>(B) Shows a similar analysis with <i>m</i> = 200, <i>n</i> = 50,000.</p></div>

    The BBP Phase Change

    <p>We ran a series of simulations, varying the sample size <i>m</i> and number of markers <i>n</i> but keeping the product at <i>mn</i> = 2<sup>20</sup>. Thus the predicted phase change threshold is <i>F<sub>ST</sub></i> = 2<sup>−10</sup>. We vary <i>F<sub>ST</sub></i> and plot the log <i>p</i>-value of the Tracy–Widom statistic. (We clipped −log<sub>10</sub> <i>p</i> at 20.) Note that below the threshold there is no statistical significance, while above the threshold we tend to get enormous significance.</p>
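The threshold quoted in this caption is consistent with the relation F_ST = 1/sqrt(mn). That relation is treated here as an assumption, since the caption gives only the resulting value. A minimal sketch of the arithmetic follows; the function name `bbp_threshold` and the example (m, n) pairs are hypothetical and chosen only so that mn = 2^20.

```python
import math


def bbp_threshold(m: int, n: int) -> float:
    """Assumed phase-change threshold F_ST = 1 / sqrt(m * n).

    m is the sample size and n the number of markers, as in the caption.
    The formula itself is an assumption inferred from the quoted values.
    """
    return 1.0 / math.sqrt(m * n)


# Hypothetical (m, n) pairs, each with product m * n = 2**20.
pairs = [(2**4, 2**16), (2**8, 2**12), (2**10, 2**10)]

for m, n in pairs:
    # Every pair shares the same product, so the threshold is identical.
    print(f"m={m:5d}  n={n:6d}  threshold={bbp_threshold(m, n):.10f}")
```

Because the threshold depends only on the product mn, trading samples for markers at a fixed product leaves it unchanged, which is why the simulations in the caption share one predicted threshold.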

    Three African Populations

    <p>Plots of the first two eigenvectors for some African populations in the CEPH–HGDP dataset [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020190#pgen-0020190-b030" target="_blank">30</a>]. Yoruba and Bantu-speaking populations are genetically quite close and were grouped together. The Mandenka are a West African group speaking a language in the Mande family [15, p. 182]. The eigenanalysis fails to find structure in the Bantu populations, but separation between the Bantu and Mandenka with the second eigenvector is apparent.</p>

    LD Correction with no LD Present

    <p>P–P plots of the TW statistic, when no LD is present and after varying levels (<i>k</i>) of our LD correction. We first show this (A) for <i>m</i> = 500, <i>n</i> = 5,000, and then (B) for <i>m</i> = 200, <i>n</i> = 50,000. In both cases the LD correction makes little difference to the fit.</p>

    The Tracy–Widom Density

    <p>Conventional percentile points are: <i>P</i> = 0.05, <i>x</i> = 0.9794; <i>P</i> = 0.01, <i>x</i> = 2.0236; <i>P</i> = 0.001, <i>x</i> = 3.2730.</p>
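One way to use these percentile points is as a small lookup table of critical values. The sketch below assumes they are upper-tail critical values, so that a statistic at or above x is significant at level P. The names `TW_CRITICAL` and `smallest_tabulated_level` are hypothetical and not taken from the source.

```python
from typing import Optional

# (P, x) pairs copied from the caption, ordered from strictest to loosest.
TW_CRITICAL = [(0.001, 3.2730), (0.01, 2.0236), (0.05, 0.9794)]


def smallest_tabulated_level(x: float) -> Optional[float]:
    """Return the smallest tabulated P whose critical value x meets or exceeds.

    Assumes an upper-tail test. Returns None when x falls below the
    P = 0.05 critical value, i.e. not significant at any tabulated level.
    """
    for p, critical in TW_CRITICAL:
        if x >= critical:
            return p
    return None


# Example: a statistic of 2.5 clears the 0.01 point but not the 0.001 point.
print(smallest_tabulated_level(2.5))
```

This gives only a coarse bracket. An exact p-value would require evaluating the Tracy–Widom distribution itself, which the sketch does not attempt.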

    A Plot of a Simulation Involving Admixture (See Main Text for Details)

    <p>We plot the first two principal components. Population C is a recent admixture of two populations, B and a population not sampled. Note the large dispersion of population C along a line joining the two parental populations. Note the similarity of the simulated data to the real data of <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020190#pgen-0020190-g005" target="_blank">Figure 5</a>.</p>

    Classes of demographic models relating Africans (Y), Europeans (E), and Neandertals (N).

    <p>a) Recent gene flow but no ancient structure. RGF I has no bottleneck in E. RGF II has a bottleneck in E after gene flow while RGF VI has a bottleneck in E before the gene flow. RGF IV and V have constant population sizes of N<sub>e</sub> = 5000 and N<sub>e</sub> = 50000 respectively. b) Ancient structure but no recent gene flow. AS I has a constant population size while AS II has a recent bottleneck in E. c) Neither ancient structure nor recent gene flow. NGF I has a constant population size while NGF II has a recent bottleneck in E. d), e) Ancient structure and recent gene flow. HM IV consists of continuous migration in the Y-E ancestor and the Y-E-N ancestor while HM I consists of continuous migration only in the Y-E ancestor. HM II consists of a single admixture event in the ancestor of E while HM III also models a small population size in one of the admixing populations.</p>

    Estimates of the time of gene flow for different demographic models and mutation rates as well as different ascertainments.

    <p>The table presents estimates of the time of gene flow for different demographic models and mutation rates as well as different ascertainments. The main classes of models are a) NGF: No gene flow in a randomly mating population; b) AS: Ancient structure; c) RGF: Recent (2,000 generations ago) gene flow from Neandertals (N) into European ancestors (E); d) HM: Hybrid models with ancient structure and recent gene flow; and e) Mutation rates that are set to 1×10<sup>−8</sup>/bp/generation and 5×10<sup>−8</sup>/bp/generation. The parameters of the models were chosen to match observed F<sub>ST</sub> between Africans (Y) and Europeans (E) and to match the observed D-statistics of Africans and Europeans relative to Neandertal D(Y,E;N). In all models that involve recent gene flow, the time of gene flow was set to 2,000 generations. Our estimator provides accurate estimates of the time of gene flow for a wide range of demographic and mutational parameters. More details on the models and the ascertainments are in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002947#pgen-1002947-g002" target="_blank">Figure 2</a> and in Figures S2 and S5 in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002947#pgen.1002947.s001" target="_blank">Text S1</a>.</p>