    FossilSim: An R package for simulating fossil occurrence data under mechanistic models of preservation and recovery

    1. Key features of the fossil record that present challenges for integrating palaeontological and phylogenetic datasets include (i) non‐uniform fossil recovery, (ii) stratigraphic age uncertainty and (iii) inconsistencies in the definition of species origination and taxonomy. 2. We present an R package, FossilSim, that can be used to simulate and visualise fossil data for phylogenetic analysis under a range of flexible models. The package includes interval‐, environment‐ and lineage‐dependent models of fossil recovery that can be combined with models of stratigraphic age uncertainty and species evolution. 3. The package input and output can be used in combination with the wide range of existing phylogenetic and palaeontological R packages. We also provide functions for converting between FossilSim and paleotree objects. 4. Simulated datasets provide enormous potential to assess the performance of phylogenetic methods and to explore the impact of using fossil occurrence databases on parameter estimation in macroevolution. ISSN: 2041-210X; ISSN: 2041-209
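The interval-dependent recovery model mentioned in this abstract can be illustrated with a minimal Python sketch. It is not FossilSim's implementation or API: the function name `simulate_fossils`, its parameters, and the example rates are all hypothetical. It assumes fossil recovery along a single lineage is a Poisson process whose rate is constant within each stratigraphic interval, and it reports each fossil's bounding interval as a simple stand-in for stratigraphic age uncertainty.

```python
import random


def simulate_fossils(start, end, interval_bounds, rates, rng):
    """Sample fossil ages for one lineage (hypothetical helper, not FossilSim).

    start, end: origination and extinction ages before present (start > end).
    interval_bounds: increasing ages [b0, b1, ..., bk] delimiting k intervals.
    rates: k Poisson recovery rates; rates[i] applies between b_i and b_{i+1}.
    Returns a list of (exact_age, (interval_min, interval_max)) tuples.
    """
    fossils = []
    for i, rate in enumerate(rates):
        # Restrict the interval to the part in which the lineage is alive.
        lo = max(interval_bounds[i], end)
        hi = min(interval_bounds[i + 1], start)
        if hi <= lo or rate <= 0:
            continue
        # A homogeneous Poisson process has exponential waiting times.
        t = lo + rng.expovariate(rate)
        while t < hi:
            fossils.append((t, (interval_bounds[i], interval_bounds[i + 1])))
            t += rng.expovariate(rate)
    return fossils


# Assumed example: a lineage living from 25 to 5 time units ago, with no
# recovery at all in the middle interval.
rng = random.Random(1)
for age, bounds in simulate_fossils(25.0, 5.0, [0.0, 10.0, 20.0, 30.0],
                                    [0.5, 0.0, 2.0], rng):
    print(f"exact age {age:.2f}, reported as interval {bounds}")
```

Replacing each exact age with its interval bounds mimics the situation in which only a stratigraphic range is known for a specimen.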

    Taming the BEAST—A Community Teaching Material Resource for BEAST 2

    Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics. The combination of increasingly large genetic data sets and increases in computing power is facilitating the development of more sophisticated phylogenetic and phylodynamic methods. Big data sets allow us to answer complex questions. However, since the required analyses are highly specific to the particular data set and question, a black-box method is no longer sufficient. Instead, biologists are required to be actively involved with modeling decisions during data analysis. The modular design of the Bayesian phylogenetic software package BEAST 2 enables, and in fact enforces, this involvement. At the same time, the modular design enables computational biology groups to develop new methods at a rapid rate. A thorough understanding of the models and algorithms used by inference software is a critical prerequisite for successful hypothesis formulation and assessment. In particular, there is a need for more readily available resources aimed at helping interested scientists equip themselves with the skills to confidently use cutting-edge phylogenetic analysis software. These resources will also benefit researchers who do not have access to similar courses or training at their home institutions. Here, we introduce the “Taming the Beast” (https://taming-the-beast.github.io/) resource, which was developed as part of a workshop series bearing the same name, to facilitate the usage of the Bayesian phylogenetic software package BEAST 2.

    Meta-analysis of northeast Atlantic marine taxa shows contrasting phylogeographic patterns following post-LGM expansions

    Background. Comparative phylogeography enables the study of historical and evolutionary processes that have contributed to shaping patterns of contemporary genetic diversity across co-distributed species. In this study, we explored genetic structure and historical demography in a range of coastal marine species across the northeast Atlantic to assess whether there are commonalities in phylogeographic patterns across taxa and to evaluate whether the timings of population expansions were linked to the Last Glacial Maximum (LGM). Methods. A literature search was conducted using Web of Science. Search terms were chosen to maximise the inclusion of articles reporting on population structure and phylogeography from the northeast Atlantic; titles and abstracts were screened to identify suitable articles within the scope of this study. Given the proven utility of mtDNA in comparative phylogeography and the availability of these data in the public domain, a meta-analysis was conducted using published mtDNA gene sequences. A standardised methodology was implemented to ensure that the genealogy and demographic history of all mtDNA datasets were reanalysed in a consistent and directly comparable manner. Results. Mitochondrial DNA datasets were built for 21 species. The meta-analysis revealed significant population differentiation in 16 species, and four main types of haplotype network were found, with haplotypes in some species unique to specific geographical locations. A signal of rapid expansion was detected in 16 species, whereas five species showed evidence of a stable population size. Corrected mutation rates indicated that the majority of expansions occurred after the earliest estimate for the LGM (~26.5 kyr), while few expansions pre-dated the LGM. Conclusion. This study suggests that post-LGM expansion was common in a range of marine taxa, supporting the concept of rapid expansions after the LGM as the ice sheets started to retreat. However, despite the commonality of expansion patterns in many of these taxa, phylogeographic patterns appear to differ among the species included in this study. This suggests that species-specific evolutionary processes, as well as historical events, have likely influenced the distribution of genetic diversity of marine taxa in the northeast Atlantic.

    BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis

    Elaboration of Bayesian phylogenetic inference methods has continued at pace in recent years with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information among others. Including all relevant data in a single joint model is very challenging both conceptually and computationally. Advanced computational software packages that allow robust development of compatible (sub-)models which can be composed into a full model hierarchy have played a key role in these developments. Developing such software frameworks is increasingly a major scientific activity in its own right, and comes with specific challenges, from practical software design, development and engineering challenges to statistical and conceptual modelling challenges. BEAST 2 is one such computational software platform, and was first announced over 4 years ago. Here we describe a series of major new developments in the BEAST 2 core platform and model hierarchy that have occurred since the first release of the software, culminating in the recent 2.5 release.

    Complex birth-death models for Bayesian phylodynamic inferences

    Phylogenetic trees show the evolutionary relationships between individuals, populations or species and are generally built from genetic sequences. Phylodynamic inference focuses on reconstructing the underlying evolutionary processes from a phylogenetic tree, and can infer biologically meaningful parameters such as the rate of transmission of a pathogen or the rate of extinction of certain species. Its applications thus range from tracking the spread of epidemics to evaluating the impact of environmental conditions on the diversification process. Birth-death models are one of the main categories of models used for phylodynamic inference. This thesis presents work on two important types of birth-death models, the multi-state model for structured populations and the fossilized birth-death process. Chapter 1 presents an overview of Bayesian phylodynamic inference and its applications as well as birth-death models. In Chapter 2, I introduce a new multi-state birth-death (MSBD) model which can be used to study variations in birth and death rates across a phylogenetic tree. I show that this model can reliably infer these rates on both simulated and empirical datasets. Chapter 3 shows an application of the MSBD model to the detection of transmission clusters in HIV transmission networks, for which I show that it performs better than existing cutpoint-based methods. Chapter 4 presents an R package for simulating fossil and taxonomy datasets, which can be used to test and validate existing or future birth-death models integrating fossils. An application of this package is shown in Chapter 5, where I compare several different methods of handling fossil age uncertainty and evaluate their impact on the accuracy of the estimates. In particular, I show that commonly used methods of simplifying the data by disregarding the age uncertainty lead to strong biases in the resulting inference. In Chapter 6, I present a series of workshops and an online knowledge repository I have contributed to, which are designed to help users of Bayesian phylodynamic inference via the software BEAST 2 make the best choices for their own datasets. Indeed, as more complex models are developed, communication between users and developers is increasingly crucial. Finally, in Chapter 7, I discuss the methods developed in this thesis and suggest directions for future research.

    A Multitype Birth-Death Model for Bayesian Inference of Lineage-Specific Birth and Death Rates

    Heterogeneous populations can lead to important differences in birth and death rates across a phylogeny. Taking this heterogeneity into account is necessary to obtain accurate estimates of the underlying population dynamics. We present a new multi-type birth-death model (MTBD) that can estimate lineage-specific birth and death rates. This corresponds to estimating lineage-dependent speciation and extinction rates for species phylogenies, and lineage-dependent transmission and recovery rates for pathogen transmission trees. In contrast with previous models, we do not presume to know the trait driving the rate differences, nor do we prohibit the same rates from appearing in different parts of the phylogeny. Using simulated datasets, we show that the MTBD model can reliably infer the presence of multiple evolutionary regimes, their positions in the tree, and the birth and death rates associated with each. We also present a re-analysis of two empirical datasets and compare the results obtained by MTBD and by the existing software BAMM. We compare two implementations of the model, one exact and one approximate (assuming that no rate changes occur in the extinct parts of the tree), and show that the approximation only slightly affects results. The MTBD model is implemented as a package in the Bayesian inference software BEAST 2, and allows joint inference of the phylogeny and the model parameters. Files contained in this dataset: Supplement.pdf: supplementary methods and results. code_files.zip: R scripts used to simulate, process and analyze the data and make the plots; XML configuration files used to run BEAST 2. data_files.zip: simulated datasets and summary of the result

    Detection of HIV transmission clusters from phylogenetic trees using a multi-state birth–death model

    HIV patients form clusters in HIV transmission networks. Accurate identification of these transmission clusters is essential to effectively target public health interventions. One reason for clustering is that the underlying contact network contains many local communities. We present a new maximum-likelihood method for identifying transmission clusters caused by community structure, based on phylogenetic trees. The method employs a multi-state birth–death (MSBD) model which detects changes in transmission rate, which are interpreted as the introduction of the epidemic into a new susceptible community, i.e. the formation of a new cluster. Based on our simulations, we show that the MSBD method is able to reliably infer the clusters and the transmission parameters from a pathogen phylogeny. In contrast to existing cutpoint-based methods for cluster identification, our method does not require that clusters be monophyletic, nor is it dependent on the selection of a difficult-to-interpret cutpoint parameter. We present an application of our method to data from the Swiss HIV Cohort Study. The method is available as an easy-to-use R package. ISSN: 1742-5689; ISSN: 1742-566