58 research outputs found

    An Improved Approximate-Bayesian Model-choice Method for Estimating Shared Evolutionary History

    To understand biological diversification, it is important to account for large-scale processes that affect the evolutionary history of groups of co-distributed populations of organisms. Such processes predict temporally clustered divergence times, a pattern that can be estimated using genetic data from co-distributed species. I introduce a new approximate-Bayesian method for comparative phylogeographical model choice that estimates the temporal distribution of divergences across taxa from multi-locus DNA sequence data. The model is an extension of that implemented in msBayes. By reparameterizing the model, introducing more flexible priors on demographic and divergence-time parameters, and implementing a non-parametric Dirichlet-process prior over divergence models, I improved the robustness, accuracy, and power of the method for estimating shared evolutionary history across taxa. The results demonstrate that the improved performance of the new method is due to (1) more appropriate priors on divergence-time and demographic parameters that avoid prohibitively small marginal likelihoods for models with more divergence events, and (2) the Dirichlet process providing a flexible prior on divergence histories that does not strongly disfavor models with intermediate numbers of divergence events. The new method yields more robust estimates of posterior uncertainty, and thus greatly reduces the tendency to incorrectly estimate models of shared evolutionary history with strong support.
    Comment: 48 pages, 8 figures, 4 tables, 35 pages of supporting information with 1 supporting table and 33 supporting figures.
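
    As a rough illustration of what a Dirichlet-process prior over divergence models can look like, the sketch below draws partitions of taxa into divergence events with the Chinese restaurant process. It is not the implementation described in the abstract; the function names and the concentration parameter `alpha` are hypothetical, chosen only for this example.

```python
import random


def sample_divergence_model(n_taxa, alpha, rng):
    """Draw one partition of taxa into divergence events using the
    Chinese restaurant process (one representation of a Dirichlet process).

    Returns a list giving the divergence-event index of each taxon.
    `alpha` is the concentration parameter (hypothetical name).
    """
    assignments = []  # event index for each taxon
    event_sizes = []  # number of taxa already assigned to each event
    for _ in range(n_taxa):
        # An existing event is chosen in proportion to its size;
        # a new event is chosen in proportion to alpha.
        weights = event_sizes + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(event_sizes):
            event_sizes.append(1)
        else:
            event_sizes[choice] += 1
        assignments.append(choice)
    return assignments


def prior_on_event_counts(n_taxa, alpha, n_draws=20000, seed=1):
    """Monte Carlo estimate of the prior probability of each number of
    divergence events (index 0 is always zero)."""
    rng = random.Random(seed)
    counts = [0] * (n_taxa + 1)
    for _ in range(n_draws):
        model = sample_divergence_model(n_taxa, alpha, rng)
        counts[len(set(model))] += 1
    return [c / n_draws for c in counts]


if __name__ == "__main__":
    for alpha in (0.5, 2.0, 10.0):
        probs = prior_on_event_counts(n_taxa=8, alpha=alpha)
        print(alpha, [round(p, 3) for p in probs[1:]])
```

    Larger values of `alpha` shift prior mass toward models with more divergence events, so the choice of concentration parameter (or of a prior on it) controls how strongly clustered divergences are favored a priori.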

    Implications of uniformly distributed, empirically informed priors for phylogeographical model selection: A reply to Hickerson et al

    Establishing that a set of population-splitting events occurred at the same time can be a potentially persuasive argument that a common process affected the populations. Oaks et al. (2013) assessed the ability of an approximate-Bayesian method (msBayes) to estimate such a pattern of simultaneous divergence across taxa, to which Hickerson et al. (2014) responded. Both papers agree the method is sensitive to prior assumptions and often erroneously supports shared divergences; the papers differ about the explanation and solution. Oaks et al. (2013) suggested the method's behavior is caused by the strong weight of uniform priors on divergence times leading to smaller marginal likelihoods of models with more divergence-time parameters (Hypothesis 1); they proposed alternative priors to avoid strongly weighted posteriors. Hickerson et al. (2014) suggested numerical approximation error causes msBayes analyses to be biased toward models of clustered divergences (Hypothesis 2); they proposed using narrow, empirical uniform priors. Here, we demonstrate that the approach of Hickerson et al. (2014) does not mitigate the method's tendency to erroneously support models of clustered divergences, and often excludes the true parameter values. Our results also show that the tendency of msBayes analyses to support models of shared divergences is primarily due to Hypothesis 1. This series of papers demonstrates that if our prior assumptions place too much weight in unlikely regions of parameter space such that the exact posterior supports the wrong model of evolutionary history, no amount of computation can rescue our inference. Fortunately, more flexible distributions that accommodate prior uncertainty about parameters without placing excessive weight in vast regions of parameter space with low likelihood increase the method's robustness and power to detect temporal variation in divergences.
    Comment: 24 pages, 4 figures, 1 table, 14 pages of supporting information with 10 supporting figures.
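
    The mechanism behind Hypothesis 1 can be seen in a toy calculation. The sketch below is not msBayes and involves no approximate-Bayesian computation; it assumes two noisy "divergence-time" observations with unit-variance Gaussian error and a uniform prior on [0, width] for each time parameter, and it computes the exact marginal likelihoods of a one-parameter (shared) and a two-parameter (separate) model. All names and numbers are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x, var):
    """Density at x of a zero-mean normal with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal_shared(y1, y2, width):
    """Marginal likelihood when both observations share one time
    t ~ Uniform(0, width) and y_i ~ Normal(t, 1)."""
    mid = 0.5 * (y1 + y2)
    sd = math.sqrt(0.5)
    # The product of the two Gaussian likelihoods factors into a term in
    # (y1 - y2) and a Gaussian in t centered at the midpoint.
    mass = normal_cdf((width - mid) / sd) - normal_cdf((0.0 - mid) / sd)
    return normal_pdf(y1 - y2, 2.0) * mass / width


def marginal_separate(y1, y2, width):
    """Marginal likelihood when each observation has its own time
    t_i ~ Uniform(0, width) and y_i ~ Normal(t_i, 1)."""
    result = 1.0
    for y in (y1, y2):
        mass = normal_cdf(width - y) - normal_cdf(-y)
        result *= mass / width  # each extra parameter costs a 1/width factor
    return result


def bayes_factor_shared(y1, y2, width):
    """Bayes factor in favor of the shared-divergence model."""
    return marginal_shared(y1, y2, width) / marginal_separate(y1, y2, width)


if __name__ == "__main__":
    for width in (10.0, 100.0, 1000.0):
        print(width, bayes_factor_shared(3.0, 6.0, width))
```

    In this toy setting the Bayes factor for the shared model grows roughly in proportion to the prior width, so the same data that favor separate divergences under a narrow prior favor a shared divergence under a very wide one.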

    Marginal likelihoods in phylogenetics: a review of methods and applications

    By providing a framework for accounting for the shared ancestry inherent to all life, phylogenetics is becoming the statistical foundation of biology. The importance of model choice continues to grow as phylogenetic models continue to increase in complexity to better capture micro- and macroevolutionary processes. In a Bayesian framework, the marginal likelihood is how data update our prior beliefs about models, which gives us an intuitive measure for comparing model fit that is grounded in probability theory. Given the rapid increase in the number and complexity of phylogenetic models, methods for approximating marginal likelihoods are increasingly important. Here we try to provide an intuitive description of marginal likelihoods and why they are important in Bayesian model testing. We also categorize and review methods for estimating marginal likelihoods of phylogenetic models, highlighting several recent methods that provide well-behaved estimates. Furthermore, we review some empirical studies that demonstrate how marginal likelihoods can be used to learn about models of evolution from biological data. We discuss promising alternatives that can complement marginal likelihoods for Bayesian model choice, including posterior-predictive methods. Using simulations, we find one alternative method based on approximate-Bayesian computation (ABC) to be biased. We conclude by discussing the challenges of Bayesian model choice and future directions that promise to improve the approximation of marginal likelihoods and Bayesian phylogenetics as a whole.
    Comment: 33 pages, 3 figures.
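
    For readers unfamiliar with the term, the standard definitions can be written as follows. The notation is an assumption of this note, not taken from the paper: D is the data, M a model, and \theta its parameters. The marginal likelihood averages the likelihood over the prior, and it is the factor by which the data update the prior probability of each model.

```latex
p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta,
\qquad
p(M_i \mid D) = \frac{p(D \mid M_i)\, p(M_i)}{\sum_j p(D \mid M_j)\, p(M_j)}
```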

    The why, when, and how of computing in biology classrooms [version 1; peer review: 2 approved]

    Many biologists are interested in teaching computing skills or using computing in the classroom, despite not being formally trained in these skills themselves. Thus, biologists may find themselves researching how to teach these skills, and many are independently attempting to discover resources and methods to do so. Recent years have seen an expansion of new technologies to assist in delivering course content interactively. Educational research provides insights into how learners absorb and process information during interactive learning. In this review, we discuss the value of teaching foundational computing skills to biologists, and strategies and tools to do so. Additionally, we review the literature on teaching practices to support the development of these skills. We pay special attention to meeting the needs of diverse learners, and consider how different ways of delivering course content can be leveraged to provide a more inclusive classroom experience. Our goal is to enable biologists to teach computational skills and use computing in the classroom successfully.

    Data from: A time-calibrated species tree of Crocodylia reveals a recent radiation of the true crocodiles

    True crocodiles (Crocodylus) are the most broadly distributed, ecologically diverse, and species-rich crocodylian genus, comprising about half of extant crocodylian diversity and exhibiting a circumtropical distribution. Crocodylus traditionally has been viewed as an ancient group of morphologically conserved species that originated in Africa prior to continental breakup. In this study, these long-held notions about the temporal and geographic origin of Crocodylus are tested using DNA sequence data of 10 loci from 76 individuals representing all 23 crocodylian species. I infer a time-calibrated species tree of all Crocodylia and estimate the spatial pattern of diversification within Crocodylus. For the first time, a fully resolved phylogenetic estimate of all Crocodylia is well supported. The results overturn traditional views of the evolution of Crocodylus by demonstrating that the true crocodiles are not "living fossils" that originated in Africa. Rather, Crocodylus originated from an ancestor in the tropics of the Late Miocene Indo-Pacific, and rapidly radiated and dispersed around the globe during a period marked by mass extinctions of fellow crocodylians. The findings also reveal more diversity within the genus than is recognized by current taxonomy.

    oaks2011_crocodylia

    Multi-locus alignment of DNA sequences from 79 crocodylians in NEXUS format. Gene regions are specified as character sets.

    Project repository

    An archived version of the project git repository. The archive is in Zenodo; the original repository is hosted on GitHub

    NCBI BioProject

    This NCBI BioProject includes the raw (demultiplexed) sequence reads that were used in the paper

    Supplemental Table 2

    The data for all samples included in the 16 pairs of populations analyzed in this study. This is a subset of the data in Supplemental Table 1