708 research outputs found

    Inferring introduction routes of invasive species using approximate Bayesian computation on microsatellite data

    Determining the routes of introduction provides not only information about the history of an invasion process, but also information about the origin and construction of the genetic composition of the invading population. It remains difficult, however, to infer introduction routes from molecular data because of a lack of appropriate methods. We evaluate here the use of an approximate Bayesian computation (ABC) method for estimating the probabilities of introduction routes of invasive populations based on microsatellite data. We considered the crucial case of a single source population from which two invasive populations originated either serially from a single introduction event or from two independent introduction events. Using simulated datasets, we found that the method gave correct inferences and was robust to many erroneous beliefs. The method was also more efficient than traditional methods based on raw values of statistics such as assignment likelihood or pairwise F(ST). We illustrate some of the features of our ABC method, using real microsatellite datasets obtained for invasive populations of the western corn rootworm, Diabrotica virgifera virgifera. Most computations were performed with the DIYABC program (http://www1.montpellier.inra.fr/CBGP/diyabc/)

    msBayes: Pipeline for testing comparative phylogeographic histories using hierarchical approximate Bayesian computation

    Background: Although testing for simultaneous divergence (vicariance) across different population-pairs that span the same barrier to gene flow is of central importance to evolutionary biology, researchers often equate the gene tree and population/species tree, thereby ignoring stochastic coalescent variance in their conclusions of temporal incongruence. In contrast to other available phylogeographic software packages, msBayes is the only one that analyses data from multiple species/population pairs under a hierarchical model. Results: msBayes employs approximate Bayesian computation (ABC) under a hierarchical coalescent model to test for simultaneous divergence (TSD) in multiple co-distributed population-pairs. Simultaneous isolation is tested by estimating three hyper-parameters that characterize the degree of variability in divergence times across co-distributed population pairs while allowing for variation in various within population-pair demographic parameters (sub-parameters) that can affect the coalescent. msBayes is a software package consisting of several C and R programs that are run with a Perl "front-end". Conclusion: The method reasonably distinguishes simultaneous isolation from temporal incongruence in the divergence of co-distributed population pairs, even with sparse sampling of individuals. Because the estimate step is decoupled from the simulation step, one can rapidly evaluate different ABC acceptance/rejection conditions and the choice of summary statistics. Given the complex and idiosyncratic nature of testing multi-species biogeographic hypotheses, we envision msBayes as a powerful and flexible tool for tackling a wide array of difficult research questions that use population genetic data from multiple co-distributed species. The msBayes pipeline is available for download at http://msbayes.sourceforge.net/ under an open source license (GNU Public License). It runs on Linux, Mac OS-X, and most POSIX systems. Although the current implementation is for a single locus per species-pair, future implementations will allow analysis of multi-locus data per species pair.

    Secondary contact and admixture between independently invading populations of the Western corn rootworm, Diabrotica virgifera virgifera, in Europe

    The western corn rootworm, Diabrotica virgifera virgifera (Coleoptera: Chrysomelidae), is one of the most destructive pests of corn in North America and is currently invading Europe. Two major invasive outbreaks of rootworm have occurred in Europe: one in North-West Italy and one in Central and South-Eastern Europe. These two outbreaks originated from independent introductions from North America. Secondary contact between them probably occurred in North Italy in 2008. We used 13 microsatellite markers to conduct a population genetics study, to demonstrate that this geographic contact resulted in a zone of admixture in the Italian region of Veneto. We show that i) genetic variation is greater in the contact zone than in the parental outbreaks; ii) several signs of admixture were detected in some Venetian samples, in a Bayesian analysis of the population structure and in an approximate Bayesian computation analysis of historical scenarios; and, finally, iii) allelic frequency clines were observed at microsatellite loci. The contact between the invasive outbreaks in North-West Italy and Central and South-Eastern Europe resulted in a zone of admixture, with particular characteristics. The evolutionary implications of the existence of a zone of admixture in Northern Italy and their possible impact on the invasion success of the western corn rootworm are discussed.

    A global optimisation approach to range-restricted survey calibration

    Survey calibration methods modify minimally unit-level sample weights to fit domain-level benchmark constraints (BC). This allows exploitation of auxiliary information, e.g. census totals, to improve the representativeness of sample data (addressing coverage limitations, non-response) and the quality of estimates of population parameters. Calibration methods may fail with samples presenting small/zero counts for some benchmark groups or when range restrictions (RR), such as positivity, are imposed to avoid unrealistic or extreme weights. User-defined modifications of BC/RR performed after encountering non-convergence allow little control on the solution, and penalization approaches modelling infeasibility may not guarantee convergence. Paradoxically, this has led to underuse in calibration of highly disaggregated information, when available. We present an always-convergent flexible two-step Global Optimisation (GO) survey calibration approach. The feasibility of the calibration problem is assessed, and automatically controlled minimum errors in BC or changes in RR are allowed to guarantee convergence in advance, while preserving the good properties of calibration estimators. Modelling alternatives under different scenarios, using various error/change and distance measures are formulated and discussed. The GO approach is validated by calibrating the weights of the 2012 Health Survey for England to a fine age-gender-region cross-tabulation (378 counts) from the 2011 Census in England and Wales

    Amount of Information Needed for Model Choice in Approximate Bayesian Computation

    Approximate Bayesian Computation (ABC) has become a popular technique in evolutionary genetics for elucidating population structure and history due to its flexibility. The statistical inference framework has benefited from significant progress in recent years. In population genetics, however, its outcome depends heavily on the amount of information in the dataset, whether that be the level of genetic variation or the number of samples and loci. Here we look at the power to reject a simple constant population size coalescent model in favor of a bottleneck model in datasets of varying quality. Not only is this power dependent on the number of samples and loci, but it also depends strongly on the level of nucleotide diversity in the observed dataset. Whilst overall model choice in an ABC setting is fairly powerful and quite conservative with regard to false positives, detecting weaker bottlenecks is problematic in smaller or less genetically diverse datasets and limits the inferences possible in non-model organisms, where the amount of information regarding the two models is often limited. Our results show it is important to consider these limitations when performing an ABC analysis and that studies should perform simulations based on the size and nature of the dataset in order to fully assess the power of the study.

    Controlling Population Evolution in the Laboratory to Evaluate Methods of Historical Inference

    Natural populations of known detailed past demographic history are extremely valuable to evaluate methods of historical inference, yet are extremely rare. As an alternative approach, we have generated multiple replicate microsatellite data sets from laboratory-cultured populations of a gonochoric free-living nematode, Caenorhabditis remanei, that were constrained to pre-defined demographic histories featuring different levels of migration among populations or bottleneck events of different magnitudes. These data sets were then used to evaluate the performances of two recently developed population genetics methods, BayesAss+, that estimates recent migration rates among populations, and Bottleneck, that detects the occurrence of recent bottlenecks. Migration rates inferred by BayesAss+ were generally over-estimates, although these were often included within the confidence interval. Analyses of data sets simulated in-silico, using a model mimicking the laboratory experiments, produced less biased estimates of the migration rates, and showed increased efficiency of the program when the number of loci and sampled genotypes per population was higher. In the replicates for which the pre-bottleneck laboratory-cultured populations did not significantly depart from a mutation/drift equilibrium, an important assumption of the program Bottleneck, only a portion of the bottleneck events were detected. This result was confirmed by in-silico simulations mirroring the laboratory bottleneck experiments. More generally, our study demonstrates the feasibility, and highlights some of the limits, of the approach that consists in generating molecular genetic data sets by controlling the evolution of laboratory-reared nematode populations, for the purpose of validating methods inferring population history

    Recent advances in real geometric reasoning

    In the 1930s Tarski showed that real quantifier elimination was possible, and in 1975 Collins gave a remotely practicable method, albeit with doubly-exponential complexity, which was later shown to be inherent. We discuss some of the recent major advances in Collins' method, such as an alternative approach based on passing via the complexes, and advances which come closer to "solving the question asked" rather than "solving all problems to do with these polynomials".

    Germline MC1R status influences somatic mutation burden in melanoma

    The major genetic determinants of cutaneous melanoma risk in the general population are disruptive variants (R alleles) in the melanocortin 1 receptor (MC1R) gene. These alleles are also linked to red hair, freckling, and sun sensitivity, all of which are known melanoma phenotypic risk factors. Here we report that in melanomas and for somatic C>T mutations, a signature linked to sun exposure, the expected single-nucleotide variant count associated with the presence of an R allele is estimated to be 42% (95% CI, 15-76%) higher than that among persons without an R allele. This figure is comparable to the expected mutational burden associated with an additional 21 years of age. We also find significant and similar enrichment of non-C>T mutation classes supporting a role for additional mutagenic processes in melanoma development in individuals carrying R alleles

    Composite likelihood estimation of demographic parameters

    Background: Most existing likelihood-based methods for fitting historical demographic models to DNA sequence polymorphism data do not scale feasibly up to the level of whole-genome data sets. Computational economies can be achieved by incorporating two forms of pseudo-likelihood: composite and approximate likelihood methods. Composite likelihood enables scaling up to large data sets because it takes the product of marginal likelihoods as an estimator of the likelihood of the complete data set. This approach is especially useful when a large number of genomic regions constitutes the data set. Additionally, approximate likelihood methods can reduce the dimensionality of the data by summarizing the information in the original data by either a sufficient statistic, or a set of statistics. Both composite and approximate likelihood methods hold promise for analyzing large data sets or for use in situations where the underlying demographic model is complex and has many parameters. This paper considers a simple demographic model of allopatric divergence between two populations, in which one of the populations is hypothesized to have experienced a founder event, or population bottleneck. A large resequencing data set from human populations is summarized by the joint frequency spectrum, which is a matrix of the genomic frequency spectrum of derived base frequencies in two populations. A Bayesia