155 research outputs found

    Probabilistic Graphical Model Representation in Phylogenetics

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (1) reproducibility of an analysis, (2) model development and (3) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and non-specialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well suited to teaching statistical models, to facilitating communication among phylogeneticists, and to developing generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis-Hastings or Gibbs sampling of the posterior distribution.
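
    The idea of breaking a model into conditionally independent distributions can be illustrated with a minimal sketch. It is not taken from the paper: the toy model (one normally distributed parameter `mu` with a plate of normal observations), the function names, and all numeric settings are assumptions chosen for illustration. The joint log density is the sum of one conditional log density per node, and a plain Metropolis-Hastings sampler with a symmetric proposal draws from the posterior of `mu`.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_joint(mu, data, prior_sd=10.0, obs_sd=1.0):
    """Joint log density of the toy model graph: one term per node."""
    lp = normal_logpdf(mu, 0.0, prior_sd)      # stochastic node: mu
    for x in data:                             # plate over the observations
        lp += normal_logpdf(x, mu, obs_sd)     # observed nodes: x_i given mu
    return lp


def metropolis_hastings(data, n_iter=20000, step=0.5, seed=1):
    """Sample mu from its posterior with a symmetric random-walk proposal."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_joint(mu, data)
    samples = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_joint(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


if __name__ == "__main__":
    observations = [1.8, 2.1, 2.4, 1.9, 2.3]   # made-up data for illustration
    draws = metropolis_hastings(observations)[2000:]   # drop burn-in
    print("posterior mean of mu:", sum(draws) / len(draws))
```

    Because every node only contributes its own conditional density, swapping a prior or adding a node changes a single term of `log_joint` and leaves the sampler untouched, which is the kind of modularity the abstract attributes to the graphical model framework.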

    A Bayesian framework for the analysis of cospeciation.

    Information on the history of cospeciation and host switching for a group of host and parasite species is contained in the DNA sequences sampled from each. Here, we develop a Bayesian framework for the analysis of cospeciation. We suggest a simple model of host switching by a parasite on a host phylogeny in which host switching events are assumed to occur at a constant rate over the entire evolutionary history of associated hosts and parasites. The posterior probability density of the parameters of the model of host switching is evaluated numerically using Markov chain Monte Carlo. In particular, the method generates the probability density of the number of host switches and of the host switching rate. Moreover, the method provides information on the probability that an event of host switching is associated with a particular pair of branches. A Bayesian approach has several advantages over other methods for the analysis of cospeciation. In particular, it does not assume that the host or parasite phylogenies are known without error; many alternative phylogenies are sampled in proportion to their probability of being correct.

    MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space

    Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.

    BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
    This work was supported by the National Science Foundation [grant numbers DBI-0755048, DEB-0732920, DEB-1036448, DMS-0931642, EF-0331495, EF-0905606, EF-0949453]; the National Institutes of Health [grant numbers R01-HG006139, R01-GM037841, R01-GM078985, R01-GM086887, R01-NS063897]; the Biotechnology and Biological Sciences Research Council [grant number BB/H011285/1]; the Wellcome Trust [grant number WT092807MA]; and Google Summer of Code.
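
    For readers unfamiliar with the "phylogenetic likelihood calculations" mentioned in the abstract, the following is a minimal sketch of that kind of computation: Felsenstein's pruning algorithm under the Jukes-Cantor (JC69) model, in plain Python. It does not use or imitate the BEAGLE API. The tree encoding (a tip is a sequence string, an internal node is a list of `(child, branch_length)` pairs) and every function name are assumptions made for this illustration.

```python
import math

STATES = "ACGT"


def jc69_matrix(t):
    """JC69 transition probabilities for branch length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e
    diff = 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def partials(node, site):
    """Conditional likelihoods of the data below `node` for each state
    at the node, for one alignment site."""
    if isinstance(node, str):                  # tip: observed nucleotide
        return [1.0 if STATES[s] == node[site] else 0.0 for s in range(4)]
    result = [1.0] * 4
    for child, branch_length in node:          # internal node: combine children
        p = jc69_matrix(branch_length)
        below = partials(child, site)
        for s in range(4):
            result[s] *= sum(p[s][k] * below[k] for k in range(4))
    return result


def log_likelihood(tree, n_sites):
    """Sum of per-site log likelihoods with uniform root frequencies."""
    total = 0.0
    for site in range(n_sites):
        root = partials(tree, site)
        total += math.log(sum(0.25 * value for value in root))
    return total


if __name__ == "__main__":
    # Hypothetical three-taxon tree with made-up sequences and branch lengths.
    cherry = [("ACGT", 0.1), ("ACGA", 0.2)]
    tree = [(cherry, 0.05), ("ACTT", 0.3)]
    print("log likelihood:", log_likelihood(tree, 4))
```

    The inner loops over sites, states, and child states are independent of one another across sites, which is why this computation lends itself to the SIMD, multicore, and GPU implementations the abstract describes.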

    BEAGLE 3: Improved Performance, Scaling, and Usability for a High-Performance Computing Library for Statistical Phylogenetics

    BEAGLE is a high-performance likelihood-calculation library for phylogenetic inference. The BEAGLE library defines a simple, but flexible, application programming interface (API), and includes a collection of efficient implementations for calculation under a variety of evolutionary models on different hardware devices. The library has been integrated into recent versions of popular phylogenetics software packages including BEAST and MrBayes and has been widely used across a diverse range of evolutionary studies. Here, we present BEAGLE 3 with new parallel implementations, increased performance for challenging data sets, improved scalability, and better usability. We have added new OpenCL and central processing unit-threaded implementations to the library, allowing the effective utilization of a wider range of modern hardware. Further, we have extended the API and library to support concurrent computation of independent partial likelihood arrays, for increased performance of nucleotide-model analyses with greater flexibility of data partitioning. For better scalability and usability, we have improved how phylogenetic software packages use BEAGLE in multi-GPU (graphics processing unit) and cluster environments, and introduced an automated method to select the fastest device given the data set, evolutionary model, and hardware. For application developers who wish to integrate the library, we also have developed an online tutorial. To evaluate the effect of the improvements, we ran a variety of benchmarks on state-of-the-art hardware. For a partitioned exemplar analysis, we observe run-time performance improvements as high as 5.9-fold over our previous GPU implementation. BEAGLE 3 is free, open-source software licensed under the Lesser GPL and available at https://beagle-dev.github.io.

    Cryptic Haploid Stages in the Life Cycle of Leathesia marina (Chordariaceae, Phaeophyceae) Under In Vitro Culture

    We evaluated the life cycle of Leathesia marina through molecular analyses, culture studies, morphological observations, and ploidy measurements. Macroscopic sporophytes were collected from two localities in Atlantic Patagonia and were cultured under long-day (LD) and short-day (SD) conditions. Molecular identification of the microscopic and macroscopic phases was performed through the cox3 and rbcL genes, and the phylogeny was assessed on the basis of single-gene and concatenated datasets. Nuclear ploidy of each phase was estimated from the DNA contents of individual nuclei through epifluorescence microscopy and flow cytometry. Molecular results confirmed the identity of the Argentinian specimens as L. marina and revealed their conspecificity with L. marina from New Zealand, Germany, and Japan. The sporophytic macrothalli (2n) released mitospores from plurilocular sporangia, which developed into globular microthalli (2n) that were morphologically similar to the sporophytes but smaller, constituting a generation of small diploid thalli with a mean fluorescent nuclei cross-sectional area of 3.21 ± 0.7 µm². The unilocular sporangia released meiospores that developed into two morphologically different types of microthalli: erect branched microthalli (n) with a nuclear area of 1.48 ± 0.07 µm² that reproduce asexually, and prostrate branched microthalli (n) with a nuclear area of 1.24 ± 0.10 µm² that reproduce sexually. The prostrate microthalli released gametes in LD conditions, which merged and produced macroscopic thalli with a nuclear cross-sectional area of 3.45 ± 0.09 µm². Flow cytometry confirmed that the erect and prostrate microthalli were haploid and that the globular microthalli and macrothalli were diploid.
    Affiliation: Poza, Ailen Melisa. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto Argentino de Oceanografía. Universidad Nacional del Sur. Instituto Argentino de Oceanografía; Argentina. Affiliation: Santiañez, Wilfred John E. Hokkaido University; Japan. University of the Philippines Diliman; Philippines. Affiliation: Croce, Maria Emilia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto Argentino de Oceanografía. Universidad Nacional del Sur. Instituto Argentino de Oceanografía; Argentina. Universidad Nacional del Sur. Departamento de Biología, Bioquímica y Farmacia; Argentina. Affiliation: Gauna, Maria Cecilia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto Argentino de Oceanografía. Universidad Nacional del Sur. Instituto Argentino de Oceanografía; Argentina. Universidad Nacional del Sur. Departamento de Biología, Bioquímica y Farmacia; Argentina. Affiliation: Kogame, Kazuhiro. Hokkaido University; Japan. Affiliation: Parodi, Elisa Rosalia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto Argentino de Oceanografía. Universidad Nacional del Sur. Instituto Argentino de Oceanografía; Argentin