
    Simulations confirm that demographic events shape the effect of background selection (BGS).

    <p>(A) Inferred demographic model from Complete Genomics TGP data showing population size changes for Africans (AFR), Europeans (EUR), and East Asians (EASN) as a function of time that was used for the simulations of BGS. (B) Simulated diversity at neutral sites across populations as a function of time under our inferred demographic model without BGS (π<sub>0</sub>, dashed colored lines) and with BGS (π, solid colored lines). (C) Relative diversity (π/π<sub>0</sub>) measured by taking the ratio of diversity with BGS (π) to diversity without BGS (π<sub>0</sub>) at each time point. Note that the x-axes in all three figures are on the same scale. Time is scaled using a human generation time of 25 years per generation. Simulation data was sampled every 100 generations (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007387#pgen.1007387.s008" target="_blank">S5 Table</a> for exact values of mean π).</p>

    Regression coefficient estimates for linear regression of <i>F</i><sub>ST</sub> on 2% quantile bins of <i>B</i>.

    <p>Regression coefficient estimates for linear regression of <i>F</i><sub>ST</sub> on 2% quantile bins of <i>B</i>.</p>

    Human demographic history has amplified the effects of background selection across the genome

    <div><p>Natural populations often grow, shrink, and migrate over time. Such demographic processes can affect genome-wide levels of genetic diversity. Additionally, genetic variation in functional regions of the genome can be altered by natural selection, which drives adaptive mutations to higher frequencies or purges deleterious ones. Such selective processes affect not only the sites directly under selection but also nearby neutral variation through genetic linkage via processes referred to as genetic hitchhiking in the context of positive selection and background selection (BGS) in the context of purifying selection. While there is extensive literature examining the consequences of selection at linked sites at demographic equilibrium, less is known about how non-equilibrium demographic processes influence the effects of hitchhiking and BGS. Utilizing a global sample of human whole-genome sequences from the Thousand Genomes Project and extensive simulations, we investigate how non-equilibrium demographic processes magnify and dampen the consequences of selection at linked sites across the human genome. When binning the genome by inferred strength of BGS, we observe that, compared to Africans, non-African populations have experienced larger proportional decreases in neutral genetic diversity in strong BGS regions. We replicate these findings in admixed populations by showing that non-African ancestral components of the genome have also been affected more severely in these regions. We attribute these differences to the strong, sustained/recurrent population bottlenecks that non-Africans experienced as they migrated out of Africa and throughout the globe. Furthermore, we observe a strong correlation between <i>F</i><sub>ST</sub> and the inferred strength of BGS, suggesting a stronger rate of genetic drift. Forward simulations of human demographic history with a model of BGS support these observations. Our results show that non-equilibrium demography significantly alters the consequences of selection at linked sites and support the need for more work investigating the dynamic process of multiple evolutionary forces operating in concert.</p></div>

    Normalized and relative diversity for Thousand Genomes Project (TGP) continental groups.

    <p>(A) Normalized diversity (π/divergence) measured across the lowest 1%, 5%, 10% and 25% <i>B</i> quantile bins (strong BGS) and the highest 1% <i>B</i> quantile bin (weak BGS). (B) Relative diversity: the ratio of normalized diversity in the lowest <i>B</i> quantile bins (strong BGS) in (A) to normalized diversity in the highest 1% <i>B</i> quantile bin (weak BGS). Error bars represent ±1 SEM calculated from 1,000 bootstrapped datasets. See <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007387#pgen.1007387.s004" target="_blank">S1 Table</a> for underlying data.</p>
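
The quantities named in the caption above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: each bin is represented as a list of hypothetical `(pi, divergence)` pairs per genomic window, normalized diversity is taken as mean π over mean divergence, and the bootstrap resamples windows with replacement. The resampling unit actually used in the study is not stated in the caption, and all function names here are invented for the example.

```python
import random
import statistics


def normalized_diversity(windows):
    """Mean pi divided by mean divergence over (pi, divergence) windows."""
    pis = [pi for pi, _ in windows]
    divs = [div for _, div in windows]
    return statistics.fmean(pis) / statistics.fmean(divs)


def relative_diversity(strong_bin, weak_bin):
    """Normalized diversity under strong BGS relative to weak BGS."""
    return normalized_diversity(strong_bin) / normalized_diversity(weak_bin)


def bootstrap_sem(strong_bin, weak_bin, n_boot=1000, seed=0):
    """Standard deviation of the ratio across bootstrap replicates.

    Assumption: windows are resampled independently with replacement
    inside each bin; the spread of the replicate estimates is reported
    as the standard error.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        strong = rng.choices(strong_bin, k=len(strong_bin))
        weak = rng.choices(weak_bin, k=len(weak_bin))
        estimates.append(relative_diversity(strong, weak))
    return statistics.stdev(estimates)


# Toy values, invented purely for illustration.
strong_bgs = [(0.0004, 0.01), (0.0006, 0.01)]
weak_bgs = [(0.0010, 0.01), (0.0010, 0.01)]
ratio = relative_diversity(strong_bgs, weak_bgs)
sem = bootstrap_sem(strong_bgs, weak_bgs)
```

A ratio below 1 means that normalized diversity is lower in the strong-BGS bin than in the weak-BGS bin; comparing this ratio across populations is the comparison panel (B) describes.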

    <i>F</i><sub>ST</sub> is correlated with <i>B</i>.

    <p><i>F</i><sub>ST</sub> between TGP populations measured across 2% quantile bins of <i>B</i>. Smaller transparent points and lines show the estimates and corresponding lines of best fit (using linear regression) for <i>F</i><sub>ST</sub> between every pairwise population comparison within a particular pair of continental groups (25 pairwise comparisons each). Larger opaque points and lines are mean <i>F</i><sub>ST</sub> estimates and lines of best fit across all population comparisons within a particular pair of continental groups. Error bars represent ±1 SEM calculated from 1,000 bootstrapped datasets.</p>
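
As a rough sketch of the binning and line-fitting step this caption describes, the Python below sorts hypothetical `(B, F_ST)` records by B, splits them into 50 equal-count bins (2% quantiles), averages within each bin, and fits an ordinary least-squares line. The per-bin averaging and the helper names are assumptions made for the example; the caption does not say which F_ST estimator or weighting the study used.

```python
from statistics import fmean


def quantile_bins(records, n_bins=50):
    """Sort (b, fst) records by b and split them into near-equal-count bins."""
    ordered = sorted(records, key=lambda rec: rec[0])
    n = len(ordered)
    bins = []
    for i in range(n_bins):
        lo = i * n // n_bins
        hi = (i + 1) * n // n_bins
        if hi > lo:  # skip empty bins when there are few records
            bins.append(ordered[lo:hi])
    return bins


def ols(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regress_fst_on_b(records, n_bins=50):
    """Fit a line to per-bin mean F_ST against per-bin mean B."""
    bins = quantile_bins(records, n_bins)
    xs = [fmean([b for b, _ in chunk]) for chunk in bins]
    ys = [fmean([fst for _, fst in chunk]) for chunk in bins]
    return ols(xs, ys)


# Synthetic, exactly linear toy data (not values from the study).
toy = [(i / 1000, 0.2 - 0.1 * (i / 1000)) for i in range(1000)]
slope, intercept = regress_fst_on_b(toy)
```

Because the toy data are exactly linear, the fitted slope and intercept recover the generating values (about -0.1 and 0.2); real per-bin estimates would scatter around the fitted line.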

    Normalized diversity and relative diversity for non-admixed populations of the Thousand Genomes Project (TGP).

    <p>(A) Normalized diversity (π/divergence) measured across the lowest 1% <i>B</i> quantile bin (strong BGS). (B) Normalized diversity measured across the highest 1% <i>B</i> quantile bin (weak BGS). (C) Relative diversity: the ratio of normalized diversity in the lowest 1% <i>B</i> bin to normalized diversity in the highest 1% <i>B</i> bin (π/π<sub>min</sub>). TGP population labels are indicated below each bar (see Table L in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007387#pgen.1007387.s003" target="_blank">S1 Text</a> for population label descriptions), with African populations colored by gold shades, European populations colored by blue shades, South Asian populations colored by violet shades, and East Asian populations colored by green shades. Error bars represent ±1 SEM calculated from 1,000 bootstrapped datasets. See <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007387#pgen.1007387.s004" target="_blank">S1 Table</a> for underlying data.</p>

    Amino acid mutation signatures for individual samples.

    <p>(A) A heatmap representation of the six-component NMF clustering results for individual cancer samples (only those with >10 total nonsynonymous mutations). Samples with the same maximum signature component were grouped and sorted. Four amino acid mutation signatures identified (R>H, E>K, E>K/E>Q, Complex 2) overlap with signatures in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0183273#pone.0183273.g001" target="_blank">Fig 1A</a>. Color scale represents scaled contribution of each signature for a given sample. Signature and NMF fit details can be found in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0183273#pone.0183273.s003" target="_blank">S3 Fig</a>. (B) Bars show the total fraction of individual samples with a majority of a particular signature within each cancer. Within cancers, a large fraction of individual samples tend to have similar signature components.</p>

    Prominent features of the amino acid mutation landscape in cancer

    <div><p>Cancer can be viewed as a set of different diseases with distinctions based on tissue origin, driver mutations, and genetic signatures. Accordingly, each of these distinctions has been used to classify cancer subtypes and to reveal common features. Here, we present a different analysis of cancer based on amino acid mutation signatures. Non-negative Matrix Factorization and principal component analysis of 29 cancers revealed six amino acid mutation signatures, including four signatures that were dominated by either arginine to histidine (Arg>His) or glutamate to lysine (Glu>Lys) mutations. Sample-level analyses reveal that while some cancers are heterogeneous, others are largely dominated by one type of mutation. Using a non-overlapping set of samples from the COSMIC somatic mutation database, we validate five of six mutation signatures, including signatures with prominent arginine to histidine (Arg>His) or glutamate to lysine (Glu>Lys) mutations. This suggests that our classification of cancers based on amino acid mutation patterns may provide avenues of inquiry pertaining to specific protein mutations that may generate novel insights into cancer biology.</p></div>

    Normalized NMF mixture coefficients for individual samples.

    <p>Plot of the normalized mixture coefficients across the three mutation signatures with high R>H or E>K components for every individual sample. Colors represent the greatest contributing mutation signature for each sample based on the full individual-level NMF analysis. Here we see a dramatic separation of samples in the E>K component to the near exclusion of other signatures.</p>