20 research outputs found

    Illustration of the painting process to create the coancestry matrix.

    <p>We show the process by which a haplotype (haplotype 1, black) is painted using the others. A) True underlying genealogies for eight simulated sequences at three locations along a genomic segment, produced using the program ‘ms’ <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453-Hudson1" target="_blank">[52]</a> and showing coalescence times between haplotypes at each position. B) The Time to the Most Recent Common Ancestor (TMRCA) between haplotype 1 and each other haplotype, as a function of sequence position. Note multiple haplotypes can share the same TMRCA and changes in TMRCA correspond to historical recombination sites. C) True distribution of the ‘nearest neighbour’ haplotype. D) Sample ‘paintings’ of the Li & Stephens algorithm. E) Expectation of the painting process, estimating the nearest neighbour distribution. F) Resulting row of the coancestry matrix, based on the expectation of the painting.</p>
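
The step from panel B to panel C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the TMRCA values are made up, ties are split uniformly, and the final row is a simple per-position average, which stands in for (and is not the same as) the chunk-count expectation that the caption describes for panel F.

```python
def nearest_neighbour_distribution(tmrca_by_position):
    """For each position, spread probability uniformly over the donor
    haplotypes that share the smallest TMRCA with the focal haplotype."""
    distribution = []
    for tmrca in tmrca_by_position:
        smallest = min(tmrca)
        nearest = [t == smallest for t in tmrca]
        share = 1.0 / sum(nearest)
        distribution.append([share if is_nearest else 0.0 for is_nearest in nearest])
    return distribution


def coancestry_row(distribution):
    """Average the per-position probabilities into one value per donor.
    Assumption: a length-based average, not the paper's chunk counts."""
    n_positions = len(distribution)
    n_donors = len(distribution[0])
    return [sum(row[d] for row in distribution) / n_positions for d in range(n_donors)]


# Hypothetical TMRCAs between a focal haplotype and three donors at four positions.
tmrca_by_position = [
    [0.2, 0.2, 1.0],
    [0.2, 0.5, 1.0],
    [0.7, 0.5, 1.0],
    [0.7, 0.5, 0.1],
]
distribution = nearest_neighbour_distribution(tmrca_by_position)
row = coancestry_row(distribution)
```

In the first position two donors tie for the smallest TMRCA, so each receives half of the probability, matching the caption's note that multiple haplotypes can share the same TMRCA.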

    World HGDP results summary.

    <p>A) Relationship between populations for the whole world data. Each tip corresponds to a population; labels include the number of individuals and are coloured red if all individuals within that label are found in a single clade. See text for an interpretation of the values on the edges; the cut defines the ‘sub-continents’ discussed in the text. B) Transposed coancestry matrix for the Hazara and Burusho (in full: <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453.s014" target="_blank">Figure S14</a>), showing CentralSouthAsia and EastAsia donors, which are each normalised to have a mean donation rate of 1. The box shows the ‘diagonal’ drift component.</p>

    Half-matching using correlations for HGDP data.

    <p>For each continent, we show the proportion of times in which two sets of chromosomes of a particular individual are matched correctly based on similarity of their coancestry profile. Coancestry profiles are calculated using a training set as described in the text. Results for coancestry matrices are calculated using correlation between individuals based on the linked and unlinked models. Also shown is the expected success in clustering if individuals within the same label or same inferred (linked results) fineSTRUCTURE population each had the same ancestry profile.</p>
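
A minimal sketch of the half-matching idea, assuming each individual contributes two coancestry profiles (one per set of chromosomes) and that a match is scored as correct when an individual's first profile correlates best with their own second profile. The function names and the toy profiles are hypothetical; how the profiles are built from a training set is described in the paper's text and is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def half_matching_rate(first_halves, second_halves):
    """Fraction of individuals whose first-half profile correlates best
    with their own second-half profile."""
    correct = 0
    for i, profile in enumerate(first_halves):
        scores = [pearson(profile, other) for other in second_halves]
        best = max(range(len(scores)), key=scores.__getitem__)
        correct += best == i
    return correct / len(first_halves)


# Hypothetical profiles for three individuals (values are illustrative only).
first_halves = [[5, 1, 1], [1, 5, 1], [1, 1, 5]]
second_halves = [[4, 1, 2], [2, 4, 1], [3, 1, 4]]
rate = half_matching_rate(first_halves, second_halves)
```

Running the same function on profiles from the linked and the unlinked model would give the two proportions that the figure compares.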

    Simulated data scenario and painting results.

    <p>A) Effective population size and B) population splits used for creating the simulated data. C) Coancestry heatmaps for the linked and unlinked models with regions and 20 individuals per population, showing (bottom left) the unlinked model and (top right) the linked model; note that the linked heatmap is slightly asymmetric. D) PCA applied to the dataset using Eigenstrat on the raw SNP data. E) PCA on the coancestry matrix assuming markers are unlinked and F) linked (see text for details).</p>

    PCA for East Asia HGDP data.

    <p>The first 2 PCA components of the East Asian ‘continent’ as defined in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453.s041" target="_blank">Table S1</a> are shown for A) the linked model and B) the unlinked model. Only the named labels are displayed for clarity; <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453.s037" target="_blank">Figure S37</a> shows the full set. Further structure will be present in other principal components (not shown).</p>

    Coancestry heat map for the Europe sub-continent.

    <p>A) (bottom left) population averages, (top right) the raw data matrix, and (left) chunks from other sub-continents. To symmetrise the matrices we show the average of the donor/recipient chunk counts; read the row <i>and</i> column for an individual to see their full profile. The tree has the same interpretation as <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen-1002453-g004" target="_blank">Figure 4</a>, and the heatmap between individuals in Europe has the same interpretation as <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen-1002453-g002" target="_blank">Figure 2C</a>, with extremely high (black) and low (white) values capped. Each continent has its own scale (top), with the lowest value in yellow and the highest in blue. B) ADMIXTURE barplot for the same dataset.</p>
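
The symmetrising and capping steps mentioned in the caption can be illustrated as follows. This is a sketch, not the authors' plotting code: the row/column orientation of the chunk-count matrix, the toy counts, and the cap thresholds are all assumptions made for the example.

```python
def symmetrise(chunk_counts):
    """Average each donor/recipient pair of chunk counts, i.e. (C + C^T) / 2."""
    n = len(chunk_counts)
    return [
        [(chunk_counts[i][j] + chunk_counts[j][i]) / 2 for j in range(n)]
        for i in range(n)
    ]


def cap(matrix, low, high):
    """Clamp extreme values so they do not dominate the colour scale."""
    return [[min(max(value, low), high) for value in row] for row in matrix]


# Hypothetical chunk counts for three individuals (one orientation assumed).
chunk_counts = [
    [0, 12, 3],
    [8, 0, 5],
    [1, 7, 0],
]
symmetric = symmetrise(chunk_counts)
capped = cap(symmetric, low=2.5, high=8.0)  # thresholds are illustrative
```

Because averaging discards the direction of donation, the caption's advice to read both the row and the column of an individual applies to the unsymmetrised counts.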

    Principal components 2 and 3 of combined Irish and British coancestry matrix.

    <p>(<b>A</b>) fineSTRUCTURE clustering dendrogram for combined Irish and British data, with cluster groups defined as in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.g002" target="_blank">Fig 2</a>. Immediately following the principal inter-island split, Orkney and Wales branch in sequence, consistent with previous observations. (<b>B</b>) Principal component analysis (PCA) of haplotypic similarity based on the ChromoPainter coancestry matrix, coloured by cluster group with their median locations labelled. PC2 captures an Orkney split, while PC3 captures a Welsh split.</p>

    All-Ireland GLOBETROTTER admixture date estimates for European and British surrogate admixing populations.

    <p>A summary of the date estimates and 95% confidence intervals for inferred admixture events into Ireland from European and British admixing sources is shown in (<b>A</b>), with ancestry proportion estimates for each historical source population for the two events and example coancestry curves shown in (<b>B</b>). In the coancestry curves, <i>Relative joint probability</i> estimates the pairwise probability that two haplotype chunks separated by a given genetic distance come from the two modelled source populations respectively (i.e. FRA(8) and NOR-SG); if a single admixture event occurred, these curves are expected to decay exponentially at a rate corresponding to the number of generations since the event. The green fitted line describes this GLOBETROTTER fitted exponential decay for the coancestry curve. If the sources come from the same ancestral group the slope of this curve will be negative (as with FRA(8) vs FRA(8)), while a positive slope indicates that sources come from different admixing groups (as with FRA(8) vs NOR-SG). The adjacent bar plot shows the inferred genetic composition of the historical admixing sources modelled as a mixture of the sampled modern populations. A European admixture event was estimated by GLOBETROTTER corresponding to the historical record of the Viking age, with major contributions from sources similar to modern Scandinavians and northern Europeans and minor contributions from southern European-like sources. For admixture date estimates from British-like sources the influence of the Norman settlement and the Plantations could not be disentangled, with the point estimate date for admixture falling between these two eras and GLOBETROTTER unable to adequately resolve source and proportion details of the admixture event (fit quality FQ<sub>B</sub> &lt; 0.985). The relative noise of the coancestry curves reflects the uncertainty of the British event.
Cluster labels (for the European clustering dendrogram, see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.s004" target="_blank">S4 Fig</a>; for the PoBI clustering dendrogram, see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.s003" target="_blank">S3 Fig</a>): FRA(8), France cluster 8; NOR-SG, Norway, with significant minor representations from Sweden and Germany; SE_ENG, southeast England; N_SCOT(4), northern Scotland cluster 4.</p>
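
The link between decay rate and admixture date can be illustrated with a simple log-linear fit. This sketch is not GLOBETROTTER's fitting procedure: it assumes genetic distance is measured in Morgans (so that the decay rate reads directly as generations), it uses a noise-free synthetic curve, and it only handles curves with positive values, which corresponds to the negative-slope case in the caption.

```python
from math import exp, log


def fit_decay_rate(distances_morgans, curve_values):
    """Least-squares fit of log(value) = log(A) - rate * distance.
    Returns the rate, read here as generations since admixture."""
    logs = [log(v) for v in curve_values]
    n = len(distances_morgans)
    mean_d = sum(distances_morgans) / n
    mean_l = sum(logs) / n
    numerator = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances_morgans, logs))
    denominator = sum((d - mean_d) ** 2 for d in distances_morgans)
    return -numerator / denominator


# Synthetic curve for a hypothetical single event 40 generations ago.
distances = [0.01 * k for k in range(1, 21)]
curve = [0.3 * exp(-40 * d) for d in distances]
generations = fit_decay_rate(distances, curve)
```

With real, noisy curves the estimate comes with uncertainty, which is what the 95% confidence intervals in panel A summarise.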

    Genes mirror geography in the British Isles.

    <p>(A) fineSTRUCTURE clustering dendrogram for combined Irish and British data. Data principally split into Irish and British groups before subdividing into a total of 50 distinct clusters, which are combined into cluster groups for clusters that formed clades in the dendrogram, overlapped in principal component space (B) and were sampled from regions that are geographically contiguous. Names and labels follow the geographical provenance for the majority of data within the cluster group. Details for each cluster in the dendrogram are provided in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.s002" target="_blank">S2 Fig</a>. (B) Principal component analysis (PCA) of haplotypic similarity based on the ChromoPainter coancestry matrix, coloured by cluster group with their median locations labelled. We have chosen to present PC1 versus PC4 here as these components capture new information regarding correlation between haplotypic variation across Britain and Ireland and geography, while PC2 and PC3 (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.g004" target="_blank">Fig 4</a>) capture previously reported splitting for Orkney and Wales, respectively, from Britain [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.ref007" target="_blank">7</a>]. A map of Ireland and Britain is shown for comparison, coloured by sampling regions for cluster groups, the boundaries of which are defined based on the Nomenclature of Territorial Units for Statistics (NUTS 2010), with some regions combined.
Sampling regions are coloured by the cluster group with the majority presence in the sampling region; some sampling regions have significant minority cluster group representations as well. For example, the Northern Ireland sampling region (UKN0; NUTS 2010) is mostly explained by the NICS cluster group but also has significant representation from the NLU cluster group. The PCA plot has been rotated clockwise by 5 degrees to highlight its similarity with the geographical map of Ireland and Britain. NI, Northern Ireland; PC, principal component. Cluster groups that share names with groups from <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.g001" target="_blank">Fig 1</a> (NLU; SMN; CLN; CNN) have an average of 80% of their samples shared with the initial cluster groups. The map and administrative boundaries were produced using data from the database of Global Administrative Areas (GADM; <a href="https://gadm.org/" target="_blank">https://gadm.org</a>); note that some boundaries have been subsumed or modified to better reflect sampling regions.</p>
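
The 5-degree clockwise rotation applied to the PCA plot is a plain two-dimensional rotation of the plotted coordinates. The sketch below shows one way to apply it; the function name and the sample coordinates are hypothetical, and it assumes each point is a pair of plotted component scores.

```python
from math import cos, radians, sin


def rotate_clockwise(points, degrees):
    """Rotate 2-D points clockwise about the origin by the given angle."""
    theta = radians(degrees)
    return [
        (x * cos(theta) + y * sin(theta), -x * sin(theta) + y * cos(theta))
        for x, y in points
    ]


# Hypothetical (PC1, PC4) coordinates for three individuals.
pc_points = [(0.02, 0.01), (-0.03, 0.00), (0.00, -0.02)]
rotated = rotate_clockwise(pc_points, 5)
```

A rotation leaves the distances between individuals unchanged, so it affects only how the plot lines up visually with the map.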

    Inter-island exchange of haplotypes between the north of Ireland and northern Britain.

    <p>The boxplots show the distribution of individuals on principal component (PC) 1 for each island and for specific sampling regions (Scotland/Northern Ireland) and cluster groups (SSC and NICS; see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.g002" target="_blank">Fig 2</a>). A substantial proportion of Northern Irish individuals fall within the expected range for Scottish individuals in PC space and <i>vice versa</i>. This exchange is particularly pronounced for Northern Irish and Scottish individuals that fall within the NICS and SSC cluster groups (<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007152#pgen.1007152.g002" target="_blank">Fig 2</a>), respectively.</p>