21 research outputs found

    Illustration of the painting process to create the coancestry matrix.

    <p>We show the process by which a haplotype (haplotype 1, black) is painted using the others. A) True underlying genealogies for eight simulated sequences at three locations along a genomic segment, produced using the program ‘ms’ <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453-Hudson1" target="_blank">[52]</a> and showing coalescence times between haplotypes at each position. B) The Time to the Most Recent Common Ancestor (TMRCA) between haplotype 1 and each other haplotype, as a function of sequence position. Note multiple haplotypes can share the same TMRCA and changes in TMRCA correspond to historical recombination sites. C) True distribution of the ‘nearest neighbour’ haplotype. D) Sample ‘paintings’ of the Li & Stephens algorithm. E) Expectation of the painting process, estimating the nearest neighbour distribution. F) Resulting row of the coancestry matrix, based on the expectation of the painting.</p>

    Half-matching using correlations for HGDP data.

    <p>For each continent, we show the proportion of times in which two sets of chromosomes of a particular individual are matched correctly based on similarity of their coancestry profile. Coancestry profiles are calculated using a training set as described in the text. Results for coancestry matrices are calculated using correlation between individuals based on the linked and unlinked models. Also shown is the expected success in clustering if individuals within the same label or same inferred (linked results) fineSTRUCTURE population each had the same ancestry profile.</p>
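The half-matching idea in this caption can be illustrated with a minimal sketch. It assumes each individual has two coancestry profiles (one per set of chromosomes), represented here as plain lists of numbers, and that a match counts as correct when an individual's first profile is most strongly Pearson-correlated with that same individual's second profile. The function names `pearson` and `half_matching_rate` and the toy profiles are hypothetical; the training-set construction described in the source text is not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def half_matching_rate(half_a, half_b):
    """Fraction of individuals whose first profile correlates best with their own second profile.

    half_a[i] and half_b[i] are the two (hypothetical) coancestry profiles
    of individual i, each a list of values against a common set of donors.
    """
    correct = 0
    for i, profile in enumerate(half_a):
        # Index of the second-half profile most correlated with this first-half profile.
        best = max(range(len(half_b)), key=lambda j: pearson(profile, half_b[j]))
        if best == i:
            correct += 1
    return correct / len(half_a)


# Toy profiles: each individual copies mostly from a different donor.
half_a = [[5, 1, 1], [1, 5, 1], [1, 1, 5]]
half_b = [[4, 2, 1], [2, 4, 1], [1, 2, 4]]
rate = half_matching_rate(half_a, half_b)
```

With these toy profiles every individual is matched to itself; swapping two rows of `half_b` lowers the rate, which is the quantity the caption reports per continent.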

    World HGDP results summary.

    <p>A) Relationship between populations for the whole world data. Each tip corresponds to a population; labels include the number of individuals and are coloured red if all individuals within that label are found in a single clade. See text for an interpretation of the values on the edges; the cut defines the ‘sub-continents’ discussed in the text. B) Transposed coancestry matrix for the Hazara and Burusho (in full: <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453.s014" target="_blank">Figure S14</a>), showing CentralSouthAsia and EastAsia donors, which are each normalised to have mean donation rate of 1. The box shows the ‘diagonal’ drift component.</p>

    PCA for East Asia HGDP data.

    <p>The first 2 PCA components of the East Asian ‘continent’ as defined in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453.s041" target="_blank">Table S1</a> are shown for A) the linked model and B) the unlinked model. Only the named labels are displayed for clarity; <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen.1002453.s037" target="_blank">Figure S37</a> shows the full set. Further structure will be present in other principal components (not shown).</p>

    Simulated data scenario and painting results.

    <p>A) Effective population size and B) population splits used for creating the simulated data. C) Coancestry heatmaps for linked and unlinked model with regions and 20 individuals per population, showing for (bottom left) the unlinked model, and (top right) the linked model; note that the linked heatmap is slightly asymmetric. D) PCA applied to the dataset using Eigenstrat on the raw SNP data. E) PCA on the coancestry matrix assuming markers are unlinked and F) linked (see text for details).</p>

    Coancestry heat map for the Europe sub-continent.

    <p>A) (bottom left) population averages, (top right) the raw data matrix, and (left) chunks from other sub-continents. To symmetrise the matrices we show the average of the donor/recipient chunk counts; read the row <i>and</i> column for an individual to see their full profile. The tree has the same interpretation as <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen-1002453-g004" target="_blank">Figure 4</a>, and the heatmap between individuals in Europe has the same interpretation as <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002453#pgen-1002453-g002" target="_blank">Figure 2C</a>, with extremely high (black) and low (white) values capped. Each continent has its own scale (top), with the lowest value in yellow and the highest in blue. B) ADMIXTURE barplot for the same dataset.</p>

    Human recombination hot spots hidden within regions of strong marker association

    The fine-scale distribution of meiotic recombination events in the human genome can be inferred from patterns of haplotype diversity in human populations but only directly studied by high-resolution sperm typing. Both approaches indicate that crossovers are heavily clustered into narrow recombination hot spots. However, our direct understanding of hot-spot properties and distributions is largely limited to sperm typing in the major histocompatibility complex (MHC). We now describe the analysis of an unremarkable 206 kb region on human chromosome 1, revealing localised regions of linkage disequilibrium (LD) breakdown that mark the locations of sperm crossover hot spots. The distribution, intensity and morphology of these hot spots are strikingly similar to those in the MHC. However, we also accidentally detected additional hot spots within regions of strong association. Coalescent analysis of genotype data detected most of the hot spots, but revealed significant differences between sperm crossover frequencies and “historical” recombination rates. This raises the possibility that some hot spots, in particular those in regions of strong association, may have evolved very recently and not left their full imprint on haplotype diversity. These results suggest that hot spots could be very abundant and possibly fluid features of the human genome.

    Marginal Significance (−log<sub>10</sub> p-value as Determined by <i>t-</i>Test) of the Wavelet Coefficients from Four Annotations as Predictors of the Coefficients of the Decomposition of Human-Chimpanzee Divergence

    <div><p>Red boxes highlight significant positive linear relationships, and blue boxes, negative. The intensity of the colour is proportional to the degree of significance.</p><p>(A) Smoothed coefficients.</p><p>(B) Detail coefficients.</p></div>

    Quantile-Quantile Plots Showing the Difference in Allele Frequency Spectrum for AT→GC Mutations and GC→AT Mutations in Regions of Low and High Recombination

    <p>If the two types of mutation were to have the same allele frequency distribution, we would expect to see a straight line. In both cases, AT→GC mutations are typically at higher frequencies than GC→AT mutations; however, the effect is more pronounced in regions of high recombination [(A), low recombination; (B), high recombination]. A quantification of the difference can be found in the text and supporting material.</p>
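As a rough illustration of how a quantile-quantile comparison of two allele frequency samples works, the sketch below pairs the quantiles of two samples at a common grid of probabilities. Points on the diagonal indicate identical distributions, and points above it indicate that the second sample sits at higher frequencies. The helper names `quantile` and `qq_points`, the linear-interpolation rule, and the toy frequencies are all assumptions made for illustration and are not taken from the source.

```python
def quantile(sorted_vals, p):
    """Quantile of an already-sorted list at probability p, by linear interpolation."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def qq_points(sample_x, sample_y, n_points=11):
    """Pairs (quantile of x, quantile of y) at n_points evenly spaced probabilities."""
    xs, ys = sorted(sample_x), sorted(sample_y)
    probs = [k / (n_points - 1) for k in range(n_points)]
    return [(quantile(xs, p), quantile(ys, p)) for p in probs]


# Toy derived-allele frequencies (hypothetical values, not data from the study).
gc_to_at = [0.05, 0.10, 0.20, 0.30, 0.40]
at_to_gc = [0.15, 0.20, 0.30, 0.40, 0.50]
points = qq_points(gc_to_at, at_to_gc)
```

Plotting `points` against the line y = x gives the kind of display the caption describes; here every point lies above the diagonal because the second toy sample is shifted upwards.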

    Power Spectra and Pairwise Correlations of Detail Wavelet Coefficients

    <p>Diagonal plots show the power spectrum of the wavelet decomposition of each factor on the long (red) and short (blue) arms of Chromosome 20. Off-diagonal plots show the rank correlation coefficient between pairs of detail wavelet coefficients at each scale on the long (top right) and short (bottom left) arms. Red crosses indicate significant correlations (<i>p</i>-value < 0.01; Kendall's rank correlation). Scale is shown in kilobases.</p>
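The two quantities named in this caption, a per-scale power spectrum of detail wavelet coefficients and a rank correlation between coefficients, can be sketched as follows. The sketch assumes a Haar wavelet and a signal whose length is a power of two; the caption does not state which wavelet was used. It also uses the simple tau-a form of Kendall's statistic without a significance test. All function names and the toy signal are hypothetical.

```python
from math import sqrt


def haar_details(signal):
    """Haar detail coefficients per scale, finest scale first (length must be a power of two)."""
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / sqrt(2) for a, b in pairs])
        approx = [(a + b) / sqrt(2) for a, b in pairs]
    return details


def power_spectrum(signal):
    """Share of total detail energy at each scale (assumes the signal is not constant)."""
    details = haar_details(signal)
    total = sum(d * d for level in details for d in level)
    return [sum(d * d for d in level) / total for level in details]


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


# A signal that alternates at the finest scale puts all its power there.
spectrum = power_spectrum([1, -1, 1, -1, 1, -1, 1, -1])
tau = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])
```

To mirror the off-diagonal panels, `kendall_tau` would be applied to the detail coefficients of two annotations at the same scale, one scale at a time.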