
    πBUSS: a parallel BEAST/BEAGLE utility for sequence simulation under complex evolutionary scenarios

    Background: Simulated nucleotide or amino acid sequences are frequently used to assess the performance of phylogenetic reconstruction methods. BEAST, a Bayesian statistical framework that focuses on reconstructing time-calibrated molecular evolutionary processes, supports a wide array of evolutionary models, but lacked matching machinery for simulation of character evolution along phylogenies. Results: We present a flexible Monte Carlo simulation tool, called piBUSS, that employs the BEAGLE high performance library for phylogenetic computations within BEAST to rapidly generate large sequence alignments under complex evolutionary models. piBUSS sports a user-friendly graphical user interface (GUI) that allows combining a rich array of models across an arbitrary number of partitions. A command-line interface mirrors the options available through the GUI and facilitates scripting in large-scale simulation studies. Analogous to BEAST model and analysis setup, more advanced simulation options are supported through an extensible markup language (XML) specification, which in addition to generating sequence output, also allows users to combine simulation and analysis in a single BEAST run. Conclusions: piBUSS offers a unique combination of flexibility and ease-of-use for sequence simulation under realistic evolutionary scenarios. Through different interfaces, piBUSS supports simulation studies ranging from modest endeavors for illustrative purposes to complex and large-scale assessments of evolutionary inference procedures. The software aims to implement new models and data types that are continuously being developed as part of BEAST/BEAGLE. Comment: 13 pages, 2 figures, 1 table

    SPREAD: spatial phylogenetic reconstruction of evolutionary dynamics

    Summary: SPREAD is a user-friendly, cross-platform application to analyze and visualize Bayesian phylogeographic reconstructions incorporating spatial–temporal diffusion. The software maps phylogenies annotated with both discrete and continuous spatial information and can export high-dimensional posterior summaries to keyhole markup language (KML) for animation of the spatial diffusion through time in virtual globe software. In addition, SPREAD implements Bayes factor calculation to evaluate the support for hypotheses of historical diffusion among pairs of discrete locations based on Bayesian stochastic search variable selection estimates. SPREAD takes advantage of multicore architectures to process large joint posterior distributions of phylogenies and their spatial diffusion and produces visualizations as compelling and interpretable statistical summaries for the different spatial projections.
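
    The Bayes factor test mentioned in the summary can be illustrated with a short sketch. It assumes the usual posterior-odds over prior-odds form for a single rate indicator from stochastic search variable selection. The function name, the example numbers, and the prior inclusion probability are illustrative assumptions, not SPREAD's actual code or defaults.

```python
import math


def indicator_bayes_factor(posterior_prob: float, prior_prob: float) -> float:
    """Bayes factor for one rate indicator: posterior odds divided by prior odds.

    posterior_prob: fraction of posterior samples in which the indicator is 1.
    prior_prob: prior probability that the indicator is 1 (an assumed input here).
    """
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    if not 0.0 <= posterior_prob <= 1.0:
        raise ValueError("posterior_prob must lie between 0 and 1")
    if posterior_prob == 1.0:
        # The indicator was on in every sample, so the odds are unbounded.
        return math.inf
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


# Hypothetical example: a rate included in 90% of samples under a 10% prior.
bf = indicator_bayes_factor(0.9, 0.1)  # about 81
```

    A larger value indicates stronger support for diffusion between that pair of locations; any cutoff for calling a rate "supported" is a convention chosen by the analyst.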

    Host ecology determines the dispersal patterns of a plant virus

    Since its isolation in 1966 in Kenya, rice yellow mottle virus (RYMV) has been reported throughout Africa, resulting in one of the most economically important emerging tropical plant diseases. A thorough understanding of RYMV evolution and dispersal is critical to manage viral spread in tropical areas that heavily rely on agriculture for subsistence. Phylogenetic analyses have suggested a relatively recent expansion, perhaps driven by the intensification of agricultural practices, but this has not yet been examined in a coherent statistical framework. To gain insight into the historical spread of RYMV within African rice cultivations, we analyse a dataset of 300 coat protein gene sequences, sampled from East to West Africa over a 46-year period, using Bayesian evolutionary inference. Spatiotemporal reconstructions date the origin of RYMV back to 1852 (1791-1903) and confirm Tanzania as the most likely geographic origin. Following a single long-distance transmission event from East to West Africa, separate viral populations have been maintained for about a century. To identify the factors that shaped the RYMV distribution, we apply a generalised linear model (GLM) extension of discrete phylogenetic diffusion and provide strong support for distances measured on a rice connectivity landscape as the major determinant of RYMV spread. Phylogeographic estimates in continuous space further complement this by demonstrating more pronounced expansion dynamics in West Africa that are consistent with agricultural intensification and extensification. Taken together, our principled phylogeographic inference approach shows for the first time that host ecology dynamics have shaped the historical spread of a plant virus. status: published

    SPREAD 4: Online visualisation of pathogen phylogeographic reconstructions

    Phylogeographic analyses aim to extract information about pathogen spread from genomic data, and visualising spatio-temporal reconstructions is a key aspect of this process. Here we present SPREAD 4, a feature-rich web-based application that visualises estimates of pathogen dispersal resulting from Bayesian phylogeographic inference using BEAST on a geographic map, offering zoom-and-filter functionality and smooth animation over time. SPREAD 4 takes as input phylogenies with both discrete and continuous location annotation and offers customised visualisation as well as generation of publication-ready figures. SPREAD 4 now features account-based storage and easy sharing of visualisations by means of unique web addresses. SPREAD 4 is intuitive to use and is available online at https://spreadviz.org, with an accompanying web page containing answers to frequently asked questions at https://beast.community/spread4

    Inferring heterogeneous evolutionary processes through time: from sequence substitution to phylogeography

    Molecular phylogenetic and phylogeographic reconstructions generally assume time-homogeneous substitution processes. Motivated by computational convenience, this assumption sacrifices biological realism and offers little opportunity to uncover the temporal dynamics in evolutionary histories. Here, we extend and generalize an evolutionary approach that relaxes the time-homogeneous process assumption by allowing the specification of different infinitesimal substitution rate matrices across different time intervals, called epochs, along the evolutionary history. We focus on an epoch model implementation in a Bayesian inference framework that offers great modeling flexibility in drawing inference about any discrete data type characterized as a continuous-time Markov chain, including phylogeographic traits. To alleviate the computational burden that the additional temporal heterogeneity imposes, we adopt a massively parallel approach that achieves both fine- and coarse-grain parallelization of the computations across branches that accommodate epoch transitions, making extensive use of graphics processing units. Through synthetic examples, we assess model performance in recovering evolutionary parameters from data generated according to different evolutionary scenarios that comprise different numbers of epochs for both nucleotide and codon substitution processes. We illustrate the usefulness of our inference framework in two different applications to empirical data sets: the selection dynamics on within-host HIV populations throughout infection and the seasonality of global influenza circulation. In both cases, our epoch model captures key features of temporal heterogeneity that remained difficult to test using ad hoc procedures. Comment: 30 pages, 6 figures, 3 tables
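
    The epoch idea can be sketched in a few lines: when a branch crosses an epoch boundary, its transition probability matrix is the product of the per-epoch matrices, each computed from that epoch's own rate matrix. The sketch below is a toy illustration, not the paper's implementation. It assumes a two-state chain, chosen only because its transition probabilities have a closed form, and all function names and rate values are hypothetical.

```python
import math


def two_state_transition(a: float, b: float, t: float) -> list[list[float]]:
    """Transition probabilities after time t for a two-state continuous-time
    Markov chain with rate a for 0 -> 1 and rate b for 1 -> 0."""
    total = a + b
    if total <= 0.0:
        raise ValueError("a + b must be positive")
    decay = math.exp(-total * t)
    return [
        [(b + a * decay) / total, (a - a * decay) / total],
        [(b - b * decay) / total, (a + b * decay) / total],
    ]


def matmul(x: list[list[float]], y: list[list[float]]) -> list[list[float]]:
    """Plain matrix product of two small square matrices."""
    n = len(y)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def branch_transition(segments: list[tuple[float, float, float]]) -> list[list[float]]:
    """Transition matrix for a branch that spans several epochs.

    segments lists (a, b, duration) for each epoch the branch passes
    through, ordered from the parent end of the branch to the child end.
    """
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity: no time elapsed yet
    for a, b, duration in segments:
        result = matmul(result, two_state_transition(a, b, duration))
    return result


# Hypothetical branch: 0.3 time units in one epoch, then 0.7 in the next.
p = branch_transition([(1.0, 0.5, 0.3), (0.2, 2.0, 0.7)])
```

    When every segment shares the same rates, the product collapses to the ordinary time-homogeneous matrix for the total branch length, so the homogeneous model is the special case with a single epoch.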

    Unifying Viral Genetics and Human Transportation Data to Predict the Global Transmission Dynamics of Human Influenza H3N2

    Information on global human movement patterns is central to spatial epidemiological models used to predict the behavior of influenza and other infectious diseases. Yet it remains difficult to test which modes of dispersal drive pathogen spread at various geographic scales using standard epidemiological data alone. Evolutionary analyses of pathogen genome sequences increasingly provide insights into the spatial dynamics of influenza viruses, but to date they have largely neglected the wealth of information on human mobility, mainly because no statistical framework exists within which viral gene sequences and empirical data on host movement can be combined. Here, we address this problem by applying a phylogeographic approach to elucidate the global spread of human influenza subtype H3N2 and assess its ability to predict the spatial spread of human influenza A viruses worldwide. Using a framework that estimates the migration history of human influenza while simultaneously testing and quantifying a range of potential predictive variables of spatial spread, we show that the global dynamics of influenza H3N2 are driven by air passenger flows, whereas at more local scales spread is also determined by processes that correlate with geographic distance. Our analyses further confirm a central role for mainland China and Southeast Asia in maintaining a source population for global influenza diversity. By comparing model output with the known pandemic expansion of H1N1 during 2009, we demonstrate that predictions of influenza spatial spread are most accurate when data on human mobility and viral evolution are integrated. In conclusion, the global dynamics of influenza viruses are best explained by combining human mobility data with the spatial information inherent in sampled viral genomes. The integrated approach introduced here offers great potential for epidemiological surveillance through phylogeographic reconstructions and for improving predictive models of disease control. status: published

    Virus genomes reveal factors that spread and sustained the Ebola epidemic.

    The 2013-2016 West African epidemic caused by the Ebola virus was of unprecedented magnitude, duration and impact. Here we reconstruct the dispersal, proliferation and decline of Ebola virus throughout the region by analysing 1,610 Ebola virus genomes, which represent over 5% of the known cases. We test the association of geography, climate and demography with viral movement among administrative regions, inferring a classic 'gravity' model, with intense dispersal between larger and closer populations. Despite attenuation of international dispersal after border closures, cross-border transmission had already sown the seeds for an international epidemic, rendering these measures ineffective at curbing the epidemic. We address why the epidemic did not spread into neighbouring countries, showing that these countries were susceptible to substantial outbreaks but at lower risk of introductions. Finally, we reveal that this large epidemic was a heterogeneous and spatially dissociated collection of transmission clusters of varying size, duration and connectivity. These insights will help to inform interventions in future epidemics.