
    A Wicked Forest

    <div><p>(a) The two long internal branches have length 2, and the two short internal branches have length 0.1. For this species tree the probabilities that a random gene tree has topology <i>ψ</i><sub>i</sub> are 0.085 and 0.103 for <i>i</i> = 1 and <i>i</i> = 2, respectively. Hence <i>ψ</i><sub>2</sub> is anomalous for <i>σ</i><sub>1</sub>.</p><p>(b) The one long internal branch has length 4, the shortest internal branch has length 0.1, and the other two internal branches have length 0.3. For this species tree, the gene tree probabilities are 0.066 and 0.060 for topologies <i>ψ</i><sub>1</sub> and <i>ψ</i><sub>2</sub>, respectively. Note that the two topologies disagree only on the placement of taxon D and that neither is 6-maximally probable.</p></div>

    The Production of Anomalies for <i>n</i>-Maximally Probable Species Tree Topologies with <i>n</i> = 5,6,7,8 (See Table 1)

    <p>The branch lengths <i>x</i>, <i>y</i>, and <i>λ</i> apply to each tree: in (a) and (b), <i>x</i> + <i>y</i> denotes the length of the red internal branch, and in (c) and (d), <i>x</i> and <i>y</i> are the lengths of the deeper and shallower red internal branches, respectively; the length <i>λ</i> denotes the branch length between the root of the species tree and the MRCA of species A and B. For each tree, the color of a branch represents the probability that coalescences occur on the branch. On an external branch, because there is only one gene lineage, coalescences cannot occur. Prior to the root, the probability is 1 that all lineages coalesce. During the time between the root of the species tree and the divergence of A and B, and of C and D in (b–d), the probability that any coalescences occur can be made arbitrarily close to 0 by making the internal branches sufficiently short. Similarly, by choosing <i>x</i> and <i>y</i> to be sufficiently large, the probability that all available lineages coalesce on the red branches can be made arbitrarily close to 1. In (a), the species tree can be represented as (((AB)C)Z), where Z is (DE). By making the internal branch ancestral to D and E long, the subtree Z is similar to a single taxon, and the five-taxon tree behaves like the four-taxon asymmetric tree (((AB)C)Z), which produces the anomaly ((AB)(CZ)). Thus, in (a), the AGT is ((AB)(C(DE))). Similarly, the species tree topologies in (b), (c), and (d) have the form (((AB)(CD))Z) and produce anomalies (((AB)C)(DZ)); in (b), (c), and (d) Z is (EF), (E(FG)), and ((EF)(GH)), respectively. The anomalies occur by letting internal branches in subtrees ((AB)(CD)) and Z be sufficiently short and long, respectively.</p>
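
    The dependence of coalescence probability on branch length described in the caption above can be illustrated with a minimal sketch. It assumes the standard coalescent result that, with k lineages present, the waiting time to the first coalescence is exponential with rate k(k-1)/2 in coalescent time units. The function names are hypothetical and are not taken from the listed work; the branch lengths are the ones quoted in the captions in this listing.

```python
import math


def prob_no_coalescence(k: int, t: float) -> float:
    """Probability that none of k lineages coalesce on a branch of length t.

    Assumes the standard coalescent: with k lineages, the time to the
    first coalescence is exponential with rate k*(k-1)/2 (an assumption
    of this sketch, not a formula stated in the caption).
    """
    pairs = k * (k - 1) / 2
    return math.exp(-pairs * t)


def prob_any_coalescence(k: int, t: float) -> float:
    """Probability that at least one coalescence occurs on the branch."""
    return 1.0 - prob_no_coalescence(k, t)


if __name__ == "__main__":
    # Short branches make coalescence unlikely; long branches make it
    # nearly certain. A single lineage (k = 1) can never coalesce.
    for t in (0.1, 0.3, 2.0, 4.0):
        print(f"t = {t}: P(any coalescence, k = 2) = {prob_any_coalescence(2, t):.4f}")
```

    With a single lineage the rate is zero, which matches the caption's remark that coalescences cannot occur on an external branch.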

    The Anomaly Zone for the Four-Taxon Asymmetric Species Tree Topology

    <p>Branch lengths <i>x</i> and <i>y</i> (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020068#pgen-0020068-g001" target="_blank">Figure 1</a>) are measured in coalescent time units.</p>

    Inferred Population Structure Based on Two Different Sets of 100 Individuals, Using 993 Markers and the Correlated Allele Frequencies Model

    <p>The two sets of 100 individuals represent extremes of the distribution of <i>A<sub>n</sub></i>: the plots on the left are based on a more geographically random sample, and those on the right are based on a less random sample. Each plot is based on the higher-likelihood run among the two runs performed with the given combination of loci and individuals. In all plots, individuals and populations are in the same order as in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0010070#pgen-0010070-g002" target="_blank">Figure 2</a>. Black vertical lines at the bottom of the figure separate populations from the different geographic regions described in [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0010070#pgen-0010070-b003" target="_blank">3</a>], with the asterisk representing Oceania.</p>

    Inferred Population Structure Based on 1,048 Individuals and 993 Markers, Assuming Correlations among Allele Frequencies across Clusters

    <p>Each individual is represented by a thin line partitioned into <i>K</i> colored segments that represent the individual's estimated membership fractions in <i>K</i> clusters. Each plot, produced with DISTRUCT [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0010070#pgen-0010070-b023" target="_blank">23</a>], is based on the highest-likelihood run of ten runs: the two runs that were used in further analysis, and the eight runs described under “Cluster Analysis using STRUCTURE.” As in [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0010070#pgen-0010070-b003" target="_blank">3</a>], four of ten runs with <i>K</i> = 3 separated a cluster corresponding to East Asia instead of one corresponding to Europe, the Middle East, and Central/South Asia. Two of ten runs with <i>K</i> = 5 separated Surui instead of Oceania. The highest-likelihood run of the ten runs with <i>K</i> = 6, shown in the figure, had a different pattern from the other nine runs (not shown). These other runs, instead of subdividing native Americans into two clusters, subdivided a cluster roughly similar to the Kalash cluster seen in [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0010070#pgen-0010070-b003" target="_blank">3</a>], except with a less pronounced separation of the Kalash population. The clusteredness scores for the plots shown with <i>K</i> = 2, 3, 4, 5, and 6 are 0.50, 0.76, 0.84, 0.86, and 0.87, respectively.</p>
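
    The caption above reports clusteredness scores but does not define the statistic. The sketch below assumes a definition in which each individual's membership fractions are compared with the uniform value 1/K and the result is scaled to lie between 0 (membership spread evenly over all clusters) and 1 (membership entirely in one cluster), then averaged over individuals. That definition is an assumption of this example and should be checked against the original article; the function name and the toy membership values are hypothetical.

```python
import math


def clusteredness(memberships):
    """Mean scaled distance of membership vectors from the uniform vector.

    `memberships` is a list of per-individual lists of K membership
    fractions that sum to 1. ASSUMED definition (not stated in the
    caption): for each individual, sqrt(K/(K-1) * sum_k (q_k - 1/K)^2),
    averaged over individuals.
    """
    total = 0.0
    for q in memberships:
        k = len(q)
        spread = sum((q_k - 1.0 / k) ** 2 for q_k in q)
        total += math.sqrt(k / (k - 1) * spread)
    return total / len(memberships)


if __name__ == "__main__":
    # Toy data: one fully assigned individual, one evenly admixed individual.
    toy = [[1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]]
    print(clusteredness(toy))
```

    Under this assumed definition, higher values mean that individuals are placed more completely into single clusters.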

    Mean Clusteredness versus Number of Loci

    <p>Each point shows the mean clusteredness of 2,000 runs with the specified sample size and allele frequency correlation model: two replicates for each of ten sets of loci for each of 100 sets of individuals (for 1,048 individuals, it is the mean of 20 runs, as only one set of individuals was used; for 1,048 individuals and 993 loci, it is the mean of two runs, as only one set of loci was used). Error bars denote standard deviations. The <i>x</i>-axis is plotted on a logarithmic scale.</p>

    Genetic and Geographic Distance for Pairs of Populations

    <p>Red circles indicate comparisons between pairs of populations with majority representation in the same cluster in the <i>K</i> = 5 plot of <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0010070#pgen-0010070-g002" target="_blank">Figure 2</a>; blue triangles indicate pairs with one population from Eurasia and one from East Asia; brown squares indicate pairs with one population from Africa and the other from Eurasia; and green diamonds indicate pairs with one population from East Asia and the other from either Oceania or America. Comparisons involving one of Hazara, Kalash, and Uygur and other populations from Eurasia or East Asia are marked 1, 2, and 3, respectively. No comparisons are shown between any of these three groups and any African population.</p>

    Mean Clusteredness versus Geographic Dispersion as Measured by <i>A<sub>n</sub></i>

    <p>Each point shows the mean clusteredness of 20 runs with the specified number of loci and allele frequency correlation model: two replicates for each of ten sets of loci (for 993 loci, it is the mean of two runs, as only one set of loci was used). From left to right, the three groups of points in each plot respectively represent sets of 100, 250, and 500 individuals.</p>

    Sample Sizes and Geographic Origins of Samples

    <p>The latitudes and longitudes used for the various groups are given in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.0020215#pgen-0020215-st002" target="_blank">Table S2</a>.</p>