24 research outputs found

Performance of the model for estimating branch-specific sex ratios.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)
Publication venue
Publication date
Field of study

All histories represented from A to D share the same topology ((1,2),3) but differ with respect to the simulated effective sex ratio (ESR). The root population was made of 50,000 males and 50,000 females. In (A), each branch in the topology corresponds to a population made of 500 males and 500 females. In (B), branch 2 was made of 250 females and 750 males (ξ2 = 0.25); in (C), branch 4 was made of 250 females and 750 males (ξ4 = 0.25); in (D), branch 3 was made of 250 females and 750 males (ξ3 = 0.25). Inset trees indicate which branch was simulated with a biased sex ratio. The two successive splits occurred 200 and 400 generations before present. The mutation rate was fixed at μ = 5 × 10⁻⁷. For each dataset, 50 females per population were sampled. We analyzed 50 replicate simulated datasets for each scenario, with 5,000 autosomal SNPs and 5,000 X-linked SNPs. The boxplots summarize the distributions of the 50 posterior means of ξi for each of the four branches. The horizontal dashed segments indicate the true (simulated) values of ξi. The pie charts indicate the fraction of significant support values (S < 0.01) against the hypothesis ξ = 0.5 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.e028" target="_blank">Eq 4</a>).
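The simulated values in this caption (250 females and 750 males giving ξ = 0.25) are consistent with reading ξ as the proportion of females among breeders. The following Python sketch works under that assumption. The function names are hypothetical and are not part of KimTree, and the effective-size expressions are the standard textbook formulas for autosomes and the X chromosome, not something stated in the caption.

```python
def sex_ratio(n_females: int, n_males: int) -> float:
    """Proportion of females among breeders (assumed definition of xi)."""
    return n_females / (n_females + n_males)


def effective_size_autosomes(n_females: int, n_males: int) -> float:
    """Standard autosomal effective size: 4 * Nf * Nm / (Nf + Nm)."""
    return 4 * n_females * n_males / (n_females + n_males)


def effective_size_x(n_females: int, n_males: int) -> float:
    """Standard X-linked effective size: 9 * Nf * Nm / (4 * Nm + 2 * Nf)."""
    return 9 * n_females * n_males / (4 * n_males + 2 * n_females)


if __name__ == "__main__":
    # Balanced branch from scenario (A) and biased branch from scenarios (B-D).
    for n_f, n_m in [(500, 500), (250, 750)]:
        xi = sex_ratio(n_f, n_m)
        ratio = effective_size_x(n_f, n_m) / effective_size_autosomes(n_f, n_m)
        print(f"Nf={n_f} Nm={n_m} xi={xi:.2f} NeX/NeA={ratio:.3f}")
```

With a balanced branch the X-to-autosome ratio of effective sizes is 3/4; with 250 females and 750 males it drops to 9/14. That contrast between autosomal and X-linked drift is what makes a biased sex ratio detectable in principle.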

FigShare

Directed acyclic graph (DAG) of the hierarchical Bayesian model for a three-population example tree.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)

The square nodes represent the data, i.e. the observed allele counts from autosomal and X-linked data in population i at SNP j. The circles and rounded rectangles represent the parameters to be estimated: the (unknown) allele frequency in population i; the length (on a diffusion time scale) of the branch leading to population i; and α(Ω) and β(Ω), the shape and scale parameters of the beta distribution, which describes the allele frequency distribution in the root population. Unidirectional edges (arrows) represent direct stochastic relationships within the model, i.e. the conditional dependency between connected nodes.

FigShare

Application example on human (HapMap) data.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)

We re-analyzed the dataset from Keinan et al. [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.ref019" target="_blank">19</a>, <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.ref042" target="_blank">42</a>], with genotypes from European American individuals from Utah, USA (CEU), Asian individuals grouping Han Chinese from Beijing and Japanese from Tokyo (ASN), and Yoruba individuals from Ibadan, Nigeria (YRI) (see the <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#sec015" target="_blank">Materials and methods</a> section). The data consisted of 340,909 autosomal SNPs and 12,737 X-linked SNPs. For both genetic systems, we randomly subsampled 50 pseudo-replicated datasets from the full data, each made of 5,000 autosomal SNPs and 5,000 X-linked SNPs. We ran KimTree conditionally on the ((CEU,ASN),YRI) topology, represented in (A) with branch lengths set to their posterior mean estimates. (B) The boxplots summarize the distributions of the posterior means of the ESR for each branch in the tree, for the 50 pseudo-replicated datasets. The dotted line indicates the expectation for a balanced ESR (ξi = 0.5). The pie charts indicate the fraction of significant support values (S < 0.01) against the hypothesis ξ = 0.5 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.e028" target="_blank">Eq 4</a>).
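The subsampling step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name is hypothetical, and it assumes that SNPs are drawn without replacement within each pseudo-replicate and independently across pseudo-replicates, which the caption does not specify.

```python
import random


def subsample_snps(n_total: int, n_keep: int = 5000,
                   n_replicates: int = 50, seed: int = 0) -> list[list[int]]:
    """Return n_replicates sorted lists of SNP indices drawn from range(n_total).

    Assumption: sampling is without replacement within a replicate,
    and replicates are drawn independently of one another.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_total), n_keep))
            for _ in range(n_replicates)]


# SNP counts taken from the caption (HapMap data).
autosomal = subsample_snps(340_909, seed=1)
x_linked = subsample_snps(12_737, seed=2)
```

Because only 12,737 X-linked SNPs are available, pseudo-replicates of 5,000 X-linked SNPs necessarily overlap heavily, so they are not independent samples in the way the simulated replicates are.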

FigShare

Performance of the model for estimating branch-specific sex ratios in a four-population tree.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)

We simulated a four-population tree with topology ((1,2),(3,4)). The root population was made of 50,000 males and 50,000 females, and the internal branches correspond to populations made of 5,000 males and 5,000 females. As depicted in (A), the four external branches were simulated with biased sex ratios: branch 1 with 1,000 females (ξ1 = 0.1), branch 2 with ξ2 = 0.2, branch 3 with ξ3 = 0.9, and branch 4 with ξ4 = 0.8. The two successive splits occurred 1,000 and 3,000 generations before present. The mutation rate was fixed at μ = 1.5 × 10⁻⁷. For each dataset, 50 females per population were sampled. We analyzed 50 replicate simulated datasets, each with 5,000 autosomal SNPs and 5,000 X-linked SNPs. The boxplots in (B) summarize the distributions of the 50 posterior means of ξi for each of the six branches. The horizontal dashed segments indicate the true (simulated) values of ξi. The pie charts indicate the fraction of significant support values (S < 0.01) against the hypothesis ξ = 0.5 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.e028" target="_blank">Eq 4</a>).

FigShare

Application example on whole-genome human sequence data.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)

We re-analyzed a subset of the whole-genome sequence data from Pagani et al. [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.ref033" target="_blank">33</a>], with populations from NW-Europe (NWE), SE-Asia (SEA), Oceania (OCE) and Americas (AME) (see the <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#sec015" target="_blank">Materials and methods</a> section for a detailed composition of populations). For both genetic systems, we randomly subsampled 50 pseudo-replicated datasets from the full data, each made of 5,000 autosomal SNPs and 5,000 X-linked SNPs. We ran KimTree considering the best-fitting tree topology (NWE,SEA,OCE,AME) (see the <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#sec015" target="_blank">Materials and methods</a> section), represented in (A) with branch lengths set to their posterior mean estimates. (B) The boxplots summarize the distributions of the posterior means of the ESR for each branch in the tree, for the 50 pseudo-replicated datasets. The dotted line indicates the expectation for a balanced ESR (ξi = 0.5). The pie charts indicate the fraction of significant support values (S < 0.01) against the hypothesis ξ = 0.5 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.e028" target="_blank">Eq 4</a>).

FigShare

Application example on cattle data.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)

We analyzed 643,090 autosomal SNPs and 15,009 X-linked SNPs from a dairy cattle breed (HOL), the Angus beef cattle breed (ANG) and the N’Dama breed (NDA). For both genetic systems, we randomly subsampled 50 pseudo-replicated datasets from the full data, each made of 5,000 autosomal SNPs and 5,000 X-linked SNPs. We ran KimTree considering the tree topology ((HOL,ANG),NDA) [<a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.ref041" target="_blank">41</a>], represented in (A) with branch lengths set to their posterior mean estimates. (B) The boxplots summarize the distributions of the posterior means of the ESR for each branch in the tree, for the 50 pseudo-replicated datasets. The dotted line indicates the expectation for a balanced ESR (ξi = 0.5). The pie charts indicate the fraction of significant support values (S < 0.01) against the hypothesis ξ = 0.5 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.e028" target="_blank">Eq 4</a>).

FigShare

Robustness to violation of the model assumptions.

Author: Florian Clemente (3298086)
Mathieu Gautier (47253)
Renaud Vitalis (376558)

We simulated four scenarios (A-D) based on a four-population tree with topology ((1,2),(3,4)), as depicted in the inset tree (top). In all scenarios, the root population was made of 50,000 males and 50,000 females, and the internal branches correspond to populations made of 5,000 males and 5,000 females. The two successive splits occurred 2,000 and 4,000 generations before present. The mutation rate was fixed at μ = 1.5 × 10⁻⁷. For each dataset, 50 females per population were sampled. In (A), the four external branches were made of equal numbers of females and males, so the ESR was balanced (ξi = 0.5) throughout the tree (“control” scenario). In (B), we simulated an instantaneous 5-fold population growth in branch 1 and an instantaneous 5-fold bottleneck in branch 4, both events having occurred 400 generations before present. In (C), we simulated migration between populations 1 and 2, with equal rates for both sexes: mf = mm = 0.00025. In (D), we simulated female-biased migration between populations 1 and 2, with mf = 0.00025 and mm = 0. We analyzed 50 replicate simulated datasets for each scenario, with 5,000 autosomal SNPs and 5,000 X-linked SNPs. The boxplots in (A-D) summarize the distributions of the 50 posterior means of ξi for each of the six branches. The horizontal dashed line indicates the true (simulated) values of ξi. The pie charts indicate the fraction of significant support values (S < 0.01) against the hypothesis ξ = 0.5 (see <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1007191#pgen.1007191.e028" target="_blank">Eq 4</a>).
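The quantity shown in each pie chart is a simple proportion across replicates, which can be sketched as below. The support value S itself is defined in Eq 4 of the paper and is not reproduced here. The function name and the example values are hypothetical and serve only to illustrate the counting.

```python
def fraction_significant(support_values: list[float],
                         threshold: float = 0.01) -> float:
    """Fraction of replicates whose support value S falls below the threshold."""
    if not support_values:
        raise ValueError("at least one support value is required")
    n_significant = sum(1 for s in support_values if s < threshold)
    return n_significant / len(support_values)


# Made-up support values for four replicates of a single branch.
example_support = [0.001, 0.2, 0.005, 0.5]
example_fraction = fraction_significant(example_support)
```

In the control scenario (A), where ξi = 0.5 is true for every branch, this fraction acts as an empirical false-positive rate for the test at the 0.01 level.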

FigShare

Supplemental Material for Hivert et al., 2018

Author: Eric J. Petit (255694)
Mathieu Gautier (47253)
Raphaël Leblois (164597)
Renaud Vitalis (376558)
Valentin Hivert (4960195)

File S1 contains the detailed mathematical derivations of the model; Table S1 provides a comparison of pairwise FST estimates; Table S2 shows the effect of unequal sampling on pairwise FST estimates; Table S3 shows the effect of variable coverage on pairwise FST estimates; Figure S1 shows pairwise estimators of FST; Figure S2 shows the precision and accuracy of our estimator of FST as a function of pool size and coverage, with varying experimental error rate; Figure S3 shows the precision and accuracy of naive estimators of FST for Pool-seq data; Figure S4 shows the precision and accuracy of alternative estimators of FST with varying pool size, for various levels of differentiation.

FigShare

Estimates of locus-specific effects αi, from BayeScan analyses, for each outlier locus in all the inter-host comparisons where it was detected as an outlier (in China and in France).

Author: Denis Bourguet (17212)
Hermine Alexandre (433328)
Philippe Audiot (75413)
Renaud Vitalis (376558)
Réjane Streiff (140007)
Sandrine Cros-Arteil (433329)
Sergine Ponsard (17209)

The average of αi over all these pairwise comparisons is also provided. These values are a proxy for the nature and strength of selection: positive αi values suggest divergent selection, while negative values suggest balancing selection. Population codes are defined in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0069211#pone-0069211-t001" target="_blank">Table 1</a>. Marker names are underlined for outliers detected at a 5% FDR threshold and not underlined for outliers detected at a 10% FDR threshold.

FigShare

Distributions of FST estimates between populations sampled on different host plants, across all AFLP markers in France (A, including all pairs of ECB and ABB populations) and in China (B, including all pairs of ACB and ABB populations).

Author: Denis Bourguet (17212)
Hermine Alexandre (433328)
Philippe Audiot (75413)
Renaud Vitalis (376558)
Réjane Streiff (140007)
Sandrine Cros-Arteil (433329)
Sergine Ponsard (17209)

Mean values of the distributions are 0.042 and 0.063, respectively, as indicated by the vertical dashed lines. Both distributions are highly leptokurtic (i.e. with kurtosis > 3) and significantly different from one another (Kolmogorov-Smirnov test, D = 0.089, P < 10⁻⁵). Higher kurtosis is observed for the ECB/ABB FST distribution (kurtosis = 12.16) than for the ACB/ABB FST distribution (kurtosis = 11.46).
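The two summary statistics quoted above can be illustrated with a short, dependency-free Python sketch. It assumes the non-excess (Pearson) definition of kurtosis, under which a normal distribution has kurtosis 3, in line with the "kurtosis > 3" criterion in the caption. The function names are hypothetical, and the code only computes the statistic D, not the associated P-value.

```python
from bisect import bisect_right


def kurtosis(values: list[float]) -> float:
    """Pearson (non-excess) kurtosis: fourth central moment over squared variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2


def ks_statistic(sample_a: list[float], sample_b: list[float]) -> float:
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A D of 0.089 is a small maximal gap between the two empirical distribution functions. It can still be highly significant when the number of FST values per distribution is large, since the test's power grows with sample size.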

FigShare
