An Improved Approximate-Bayesian Model-choice Method for Estimating Shared Evolutionary History
To understand biological diversification, it is important to account for
large-scale processes that affect the evolutionary history of groups of
co-distributed populations of organisms. Such events predict temporally
clustered divergence times, a pattern that can be estimated using genetic data
from co-distributed species. I introduce a new approximate-Bayesian method for
comparative phylogeographical model-choice that estimates the temporal
distribution of divergences across taxa from multi-locus DNA sequence data. The
model is an extension of that implemented in msBayes. By reparameterizing the
model, introducing more flexible priors on demographic and divergence-time
parameters, and implementing a non-parametric Dirichlet-process prior over
divergence models, I improved the robustness, accuracy, and power of the method
for estimating shared evolutionary history across taxa. The results demonstrate
that the improved performance of the new method is due to (1) more appropriate
priors on divergence-time and demographic parameters that avoid prohibitively
small marginal likelihoods for models with more divergence events, and (2) the
Dirichlet-process providing a flexible prior on divergence histories that does
not strongly disfavor models with intermediate numbers of divergence events.
The new method yields more robust estimates of posterior uncertainty, and thus
greatly reduces the tendency to incorrectly estimate models of shared
evolutionary history with strong support.
Comment: 48 pages, 8 figures, 4 tables, 35 pages of supporting information
with 1 supporting table and 33 supporting figures
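The Dirichlet-process prior mentioned in this abstract can be illustrated with its Chinese restaurant process representation, which assigns taxa to divergence events one at a time. The sketch below is not the author's implementation; the function names and the concentration parameter `alpha` are assumptions made for illustration. It samples partitions of taxa and tabulates the implied prior probability of each number of divergence events.

```python
import random


def sample_crp_partition(n_taxa, alpha, rng):
    """Assign each taxon to a divergence event via the Chinese restaurant process.

    alpha (> 0) is a hypothetical concentration parameter: larger values
    favor more distinct divergence events.
    """
    assignments = []
    counts = []  # counts[k] = number of taxa already assigned to event k
    for _ in range(n_taxa):
        # An existing event k is chosen with weight counts[k];
        # a new event is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments


def prior_on_event_counts(n_taxa, alpha, n_draws=20000, seed=1):
    """Monte Carlo estimate of the prior probability of 0..n_taxa events."""
    rng = random.Random(seed)
    tally = [0] * (n_taxa + 1)
    for _ in range(n_draws):
        n_events = len(set(sample_crp_partition(n_taxa, alpha, rng)))
        tally[n_events] += 1
    return [t / n_draws for t in tally]


if __name__ == "__main__":
    for n_events, prob in enumerate(prior_on_event_counts(10, alpha=1.5)):
        print(n_events, round(prob, 3))
```

Varying `alpha` shifts prior mass between models with few and many divergence events, which is the kind of flexibility the abstract attributes to the Dirichlet process.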
Implications of uniformly distributed, empirically informed priors for phylogeographical model selection: A reply to Hickerson et al
Establishing that a set of population-splitting events occurred at the same
time can be a potentially persuasive argument that a common process affected
the populations. Oaks et al. (2013) assessed the ability of an
approximate-Bayesian method (msBayes) to estimate such a pattern of
simultaneous divergence across taxa, to which Hickerson et al. (2014)
responded. Both papers agree the method is sensitive to prior assumptions and
often erroneously supports shared divergences; the papers differ about the
explanation and solution. Oaks et al. (2013) suggested the method's behavior is
caused by the strong weight of uniform priors on divergence times leading to
smaller marginal likelihoods of models with more divergence-time parameters
(Hypothesis 1); they proposed alternative priors to avoid strongly weighted
posteriors. Hickerson et al. (2014) suggested numerical approximation error
causes msBayes analyses to be biased toward models of clustered divergences
(Hypothesis 2); they proposed using narrow, empirical uniform priors. Here, we
demonstrate that the approach of Hickerson et al. (2014) does not mitigate the
method's tendency to erroneously support models of clustered divergences, and
often excludes the true parameter values. Our results also show that the
tendency of msBayes analyses to support models of shared divergences is
primarily due to Hypothesis 1. This series of papers demonstrates that if our
prior assumptions place too much weight in unlikely regions of parameter space
such that the exact posterior supports the wrong model of evolutionary history,
no amount of computation can rescue our inference. Fortunately, more flexible
distributions that accommodate prior uncertainty about parameters without
placing excessive weight in vast regions of parameter space with low likelihood
increase the method's robustness and power to detect temporal variation in
divergences.
Comment: 24 pages, 4 figures, 1 table, 14 pages of supporting information with
10 supporting figures
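Hypothesis 1 can be illustrated with a toy calculation that is much simpler than the msBayes model. The sketch below assumes two taxa whose divergence-time estimates `x1` and `x2` are normally distributed around the true times with a known standard deviation, and a Uniform(0, `upper`) prior on each divergence-time parameter. All names and values are hypothetical. It compares the marginal likelihood of a shared-divergence model (one time parameter) with that of a separate-divergence model (two time parameters) as the prior widens.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(x, mean, sd):
    """Normal density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_separate(x1, x2, sd, upper):
    """Two free divergence times, each with a Uniform(0, upper) prior."""
    m1 = (normal_cdf((upper - x1) / sd) - normal_cdf((0.0 - x1) / sd)) / upper
    m2 = (normal_cdf((upper - x2) / sd) - normal_cdf((0.0 - x2) / sd)) / upper
    return m1 * m2


def marginal_shared(x1, x2, sd, upper):
    """One shared divergence time with a Uniform(0, upper) prior.

    Uses the identity N(x1|t,sd) * N(x2|t,sd)
      = N(x1 - x2 | 0, sd*sqrt(2)) * N(t | (x1+x2)/2, sd/sqrt(2)).
    """
    mid = 0.5 * (x1 + x2)
    sd_mid = sd / math.sqrt(2.0)
    mass = normal_cdf((upper - mid) / sd_mid) - normal_cdf((0.0 - mid) / sd_mid)
    return normal_pdf(x1 - x2, 0.0, sd * math.sqrt(2.0)) * mass / upper


def bayes_factor_shared_vs_separate(x1, x2, sd, upper):
    """Values above 1 favor the shared-divergence model."""
    return marginal_shared(x1, x2, sd, upper) / marginal_separate(x1, x2, sd, upper)


if __name__ == "__main__":
    # Hypothetical estimates that differ by four standard deviations.
    for upper in (10.0, 100.0, 1000.0):
        print(upper, bayes_factor_shared_vs_separate(1.0, 3.0, 0.5, upper))
```

With these hypothetical values the Bayes factor favors separate divergences when `upper` is 10 but favors the shared-divergence model when `upper` is 1000, even though the data are unchanged. Each extra divergence-time parameter pays roughly a factor of `1/upper` in marginal likelihood.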
Marginal likelihoods in phylogenetics: a review of methods and applications
By providing a framework of accounting for the shared ancestry inherent to
all life, phylogenetics is becoming the statistical foundation of biology. The
importance of model choice continues to grow as phylogenetic models continue to
increase in complexity to better capture micro- and macroevolutionary processes.
In a Bayesian framework, the marginal likelihood is how data update our prior
beliefs about models, which gives us an intuitive measure of comparing model
fit that is grounded in probability theory. Given the rapid increase in the
number and complexity of phylogenetic models, methods for approximating
marginal likelihoods are increasingly important. Here we try to provide an
intuitive description of marginal likelihoods and why they are important in
Bayesian model testing. We also categorize and review methods for estimating
marginal likelihoods of phylogenetic models, highlighting several recent
methods that provide well-behaved estimates. Furthermore, we review some
empirical studies that demonstrate how marginal likelihoods can be used to
learn about models of evolution from biological data. We discuss promising
alternatives that can complement marginal likelihoods for Bayesian model
choice, including posterior-predictive methods. Using simulations, we find one
alternative method based on approximate-Bayesian computation (ABC) to be
biased. We conclude by discussing the challenges of Bayesian model choice and
future directions that promise to improve the approximation of marginal
likelihoods and Bayesian phylogenetics as a whole.
Comment: 33 pages, 3 figures
The why, when, and how of computing in biology classrooms [version 1; peer review: 2 approved]
Many biologists are interested in teaching computing skills or using computing in the classroom, despite not being formally trained in these skills themselves. As a result, many find themselves independently researching how to teach these skills and searching for resources and methods to do so. Recent years have seen an expansion of new technologies to assist in delivering course content interactively. Educational research provides insights into how learners absorb and process information during interactive learning. In this review, we discuss the value of teaching foundational computing skills to biologists, and strategies and tools to do so. Additionally, we review the literature on teaching practices to support the development of these skills. We pay special attention to meeting the needs of diverse learners, and consider how different ways of delivering course content can be leveraged to provide a more inclusive classroom experience. Our goal is to enable biologists to teach computational skills and use computing in the classroom successfully.
Data from: A time-calibrated species tree of Crocodylia reveals a recent radiation of the true crocodiles
True crocodiles (Crocodylus) are the most broadly distributed, ecologically diverse, and species-rich crocodylian genus, comprising about half of extant crocodylian diversity and exhibiting a circumtropical distribution. Crocodylus traditionally has been viewed as an ancient group of morphologically conserved species that originated in Africa prior to continental breakup. In this study, these long-held notions about the temporal and geographic origin of Crocodylus are tested using DNA sequence data of 10 loci from 76 individuals representing all 23 crocodylian species. I infer a time-calibrated species tree of all Crocodylia and estimate the spatial pattern of diversification within Crocodylus. For the first time, a fully resolved phylogenetic estimate of all Crocodylia is well-supported. The results overturn traditional views of the evolution of Crocodylus by demonstrating that the true crocodiles are not "living-fossils" that originated in Africa. Rather, Crocodylus originated from an ancestor in the tropics of the Late Miocene Indo-Pacific, and rapidly radiated and dispersed around the globe during a period marked by mass extinctions of fellow crocodylians. The findings also reveal more diversity within the genus than is recognized by current taxonomy.
oaks2011_crocodylia
Multi-locus alignment of DNA sequences from 79 crocodylians in NEXUS format. Gene regions are specified as character sets.
Project repository
An archived version of the project git repository. The archive is in Zenodo; the original repository is hosted on GitHub.
NCBI BioProject
This NCBI BioProject includes the raw (demultiplexed) sequence reads that were used in the paper.
Supplemental Table 2
The data for all samples included in the 16 pairs of populations analyzed in this study. This is a subset of the data in Supplemental Table 1.