8 research outputs found

    SNP calling accuracy.

    <p>False positive and false negative rates in the identification of polymorphic sites under different experimental scenarios. Simulations were performed as described in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0079667#pone-0079667-g001" target="_blank">Figure 1</a>. Sites were identified as polymorphic if their probability of being variable was above a threshold, chosen to minimise the difference between the true and the estimated number of SNPs (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0079667#s4" target="_blank">Methods</a>).</p>
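
The thresholding rule in this caption can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the grid of candidate thresholds, and the denominators used for the two error rates (invariant sites for false positives, variable sites for false negatives) are all assumptions made for illustration.

```python
import numpy as np

def choose_threshold(prob_variable, true_n_snps):
    """Return the cutoff whose number of called SNPs is closest to the true number.

    Candidate cutoffs are a hypothetical grid from 0 to 1 in steps of 0.01.
    """
    prob_variable = np.asarray(prob_variable, dtype=float)
    candidates = np.linspace(0.0, 1.0, 101)
    # A site is called polymorphic when its probability is strictly above the cutoff.
    counts = np.array([(prob_variable > t).sum() for t in candidates])
    best = np.argmin(np.abs(counts - true_n_snps))  # ties resolve to the lowest cutoff
    return float(candidates[best])

def error_rates(prob_variable, is_variable, threshold):
    """Return (false positive rate, false negative rate) at a given cutoff.

    Assumed definitions: FP rate is the share of truly invariant sites that
    were called; FN rate is the share of truly variable sites that were missed.
    """
    prob_variable = np.asarray(prob_variable, dtype=float)
    is_variable = np.asarray(is_variable, dtype=bool)
    called = prob_variable > threshold
    n_invariant = int(np.sum(~is_variable))
    n_variable = int(np.sum(is_variable))
    fp = int(np.sum(called & ~is_variable))
    fn = int(np.sum(~called & is_variable))
    fpr = fp / n_invariant if n_invariant else 0.0
    fnr = fn / n_variable if n_variable else 0.0
    return fpr, fnr
```

In a simulation the true number of SNPs is known, so `choose_threshold` can be calibrated directly against it, and `error_rates` then summarises the calls made at that cutoff.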

    Population structure inference accuracy.

    <p>Accuracy of population structure inference, measured as the proportion of cells over a grid where sub-populations have been wrongly assigned from sequencing data compared to the case of known genotypes for all individuals (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0079667#s4" target="_blank">Methods</a>). Sequencing depths are , , , and and the corresponding sample sizes are , 60, 12, and 6 individuals. I simulated independent sites, with a probability of each site being variable in the population equal to 0.1. Populations were simulated with high genetic subdivision (left panel, 0.4 and 0.1), medium genetic subdivision (mid panel, 0.3 and 0.05), low genetic subdivision (right panel, 0.1 and 0.02).</p>

    Population structure prediction.

    <p>Population structure predicted over a grid for a single replicate under different experimental scenarios. Simulations were performed as described in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0079667#pone-0079667-g003" target="_blank">Figure 3</a>, in the case of low genetic subdivision. Grey cells represent locations where the sub-population predicted from sequencing data differs from the one predicted when the genotypes of all individuals are known. These particular replicates show a proportion of mislabelled cells equal to the median of the distribution. Note that replicates are not the same across the different tested scenarios.</p>

    Nucleotide diversity estimation.

    <p>Bias in the estimate of the number of segregating sites (left panel) and the expected heterozygosity (right panel) under different experimental scenarios. Sequencing depths are , , , and and the corresponding sample sizes are , , , and individuals. I simulated 100 regions of independent sites, with a probability of each site being variable in the population equal to 0.1.</p>
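
The two summary statistics named in this caption can be computed from known genotypes as in the following Python sketch. The genotype coding (0, 1 or 2 copies of the derived allele per diploid individual), the small-sample correction on heterozygosity, the averaging over sites, and the relative-bias definition are assumptions for illustration, not the estimators used in the paper.

```python
import numpy as np

def segregating_sites(genotypes):
    """Count sites where both alleles are present in the sample.

    genotypes: array of shape (individuals, sites) holding 0, 1 or 2
    derived-allele copies per diploid individual (assumed coding).
    """
    genotypes = np.asarray(genotypes)
    n_chrom = 2 * genotypes.shape[0]
    allele_counts = genotypes.sum(axis=0)
    return int(np.sum((allele_counts > 0) & (allele_counts < n_chrom)))

def expected_heterozygosity(genotypes):
    """Mean per-site expected heterozygosity, 2p(1-p) with an n/(n-1) correction."""
    genotypes = np.asarray(genotypes)
    n_chrom = 2 * genotypes.shape[0]
    p = genotypes.sum(axis=0) / n_chrom
    per_site = 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)
    return float(per_site.mean())

def relative_bias(estimate, truth):
    """Signed bias of an estimate, expressed as a fraction of the true value."""
    return (estimate - truth) / truth
```

Under these assumptions, `relative_bias` would be applied to a statistic estimated from sequencing data and to the same statistic computed from the simulated genotypes.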

    Simultaneous Inference of Past Demography and Selection from the Ancestral Recombination Graph under the Beta Coalescent

    <p>The reproductive mechanism of a species is a key driver of genome evolution. The standard Wright-Fisher model for the reproduction of individuals in a population assumes that each individual produces a number of offspring that is negligible compared to the total population size. Yet many species of plants, invertebrates, prokaryotes or fish exhibit neutrally skewed offspring distributions or strong selection events, in which a few individuals produce a number of offspring of up to the same order of magnitude as the population size. As a result, the genealogy of a sample is characterized by multiple individuals (more than two) coalescing simultaneously to the same common ancestor. Current methods developed to detect such multiple merger events do not account for complex demographic scenarios or recombination, and require large sample sizes. We tackle these limitations by developing two novel and distinct approaches to infer multiple merger events from sequence data or the ancestral recombination graph (ARG): a sequentially Markovian coalescent (SMβC) and a graph neural network (GNNcoal). First, we demonstrate the accuracy of our methods in estimating the multiple merger parameter and past demographic history using simulated data under the β-coalescent model. Second, we show that our approaches can also recover the effect of positive selective sweeps along the genome. Finally, we are able to distinguish skewed offspring distribution from selection while simultaneously inferring the past variation of population size. Our findings stress the aptitude of neural networks to leverage information from the ARG for inference, but also the urgent need for more accurate ARG inference approaches.</p>