Phylodynamic assessment of intervention strategies for the West African Ebola virus outbreak
Genetic analyses have provided important insights into Ebola virus spread during the recent West African outbreak, but their implications for specific intervention scenarios remain unclear. Here, we address this issue using a collection of phylodynamic approaches. We show that long-distance dispersal events were not crucial for epidemic expansion and that preventing viral lineage movement to any given administrative area would, in most cases, have had little impact. However, major urban areas were critical in attracting and disseminating the virus: preventing viral lineage movement to all three capitals simultaneously would have contained epidemic size to one-third. We also show that announcements of border closures were followed by a significant but transient effect on international virus dispersal. By quantifying the hypothetical impact of different intervention strategies, as well as the impact of barriers on dispersal frequency, our study illustrates how phylodynamic analyses can help to address specific epidemiological and outbreak control questions.
Exceptional Heterogeneity in Viral Evolutionary Dynamics Characterises Chronic Hepatitis C Virus Infection.
The treatment of HCV infection has seen significant progress, particularly since the approval of new direct-acting antiviral drugs. However, these clinical achievements have been made despite an incomplete understanding of HCV replication and within-host evolution, especially compared with HIV-1. Here, we undertake a comprehensive analysis of HCV within-host evolution during chronic infection by investigating over 4000 viral sequences sampled longitudinally from 15 HCV-infected patients. We compare our HCV results to those from a well-studied HIV-1 cohort, revealing key differences in the evolutionary behaviour of these two chronic-infecting pathogens. Notably, we find an exceptional level of heterogeneity in the molecular evolution of HCV, both within and among infected individuals. Furthermore, these patterns are associated with the long-term maintenance of viral lineages within patients, which fluctuate in relative frequency in peripheral blood. Together, our findings demonstrate that HCV replication behaviour is complex and likely comprises multiple viral subpopulations with distinct evolutionary dynamics. The presence of a structured viral population can explain apparent paradoxes in chronic HCV infection, such as rapid fluctuations in viral diversity and the reappearance of viral strains years after their initial detection.
Spatial Dynamics of Human-Origin H1 Influenza A Virus in North American Swine
The emergence and rapid global spread of the swine-origin H1N1/09 pandemic influenza A virus in humans underscores the importance of swine populations as reservoirs for genetically diverse influenza viruses with the potential to infect humans. However, despite their significance for animal and human health, relatively little is known about the phylogeography of swine influenza viruses in the United States. This study utilizes an expansive data set of hemagglutinin (HA1) sequences (n = 1516) from swine influenza viruses collected in North America during the period 2003–2010. With these data we investigate the spatial dissemination of a novel influenza virus of the H1 subtype that was introduced into the North American swine population via two separate human-to-swine transmission events around 2003. Bayesian phylogeographic analysis reveals that the spatial dissemination of this influenza virus in the US swine population follows long-distance swine movements from the Southern US to the Midwest, a corn-rich commercial center that imports millions of swine annually. Hence, multiple genetically diverse influenza viruses are introduced and co-circulate in the Midwest, providing the opportunity for genomic reassortment. Overall, the Midwest serves primarily as an ecological sink for swine influenza in the US, with sources of virus genetic diversity instead located in the Southeast (mainly North Carolina) and South-central (mainly Oklahoma) regions. Understanding the importance of long-distance pig transportation in the evolution and spatial dissemination of the influenza virus in swine may inform future strategies for the surveillance and control of influenza, and perhaps other swine pathogens.
TreeFlow: probabilistic programming and automatic differentiation for phylogenetics
Probabilistic programming frameworks are powerful tools for statistical modelling and inference. They are not immediately generalisable to phylogenetic problems due to the particular computational properties of the phylogenetic tree object. TreeFlow is a software library for probabilistic programming and automatic differentiation with phylogenetic trees. It implements inference algorithms for phylogenetic tree times and model parameters, given a tree topology. We demonstrate how TreeFlow can be used to quickly implement and assess new models. We also show that it provides reasonable performance for gradient-based inference algorithms compared to specialized computational libraries for phylogenetics.
The data processing pipeline can be found at https://github.com/christiaanjs/treeflow-paper
Tree topologies inferred using RAxML 8.2.12
Tree topologies are rooted using LSD 0.2
BEAST analyses are performed using BEAST 2.6.7
Variational inference analyses are performed using TreeFlow 0.0.1beta
Sequences have been removed from the H3N2 BEAST XML as a result of license conflicts. The complete version of this file is generated by the above pipeline.
Funding provided by: University of Auckland
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001537
Award Number:
Carnivores sequence alignment accessed from benchmark in BEAST examples
H3N2 sequence alignment taken from Vaughan TG, Kühnert D, Popinga A, Welch D, Drummond AJ. Efficient Bayesian inference under the structured coalescent. Bioinformatics. 2014 Aug 15;30(16):2272-9. doi: 10.1093/bioinformatics/btu20
Modeling of the Temporal Patterns of Fluoxetine Prescriptions and Suicide Rates in the United States
BACKGROUND: To study the potential association of antidepressant use and suicide at a population level, we analyzed the associations between suicide rates and dispensing of the prototypic SSRI antidepressant fluoxetine in the United States during the period 1960–2002. METHODS AND FINDINGS: Sources of data included Centers for Disease Control and US Census Bureau age-adjusted suicide rates since 1960 and numbers of fluoxetine sales in the US, since its introduction in 1988. We conducted statistical analysis of age-adjusted population data and prescription numbers. Suicide rates fluctuated between 12.2 and 13.7 per 100,000 for the entire population from the early 1960s until 1988. Since then, suicide rates have gradually declined, with the lowest value of 10.4 per 100,000 in 2000. This steady decline is significantly associated with increased numbers of fluoxetine prescriptions dispensed from 2,469,000 in 1988 to 33,320,000 in 2002 (r(s) = −0.92; p < 0.001). Mathematical modeling of what suicide rates would have been during the 1988–2002 period based on pre-1988 data indicates that since the introduction of fluoxetine in 1988 through 2002 there has been a cumulative decrease in expected suicide mortality of 33,600 individuals (posterior median, 95% Bayesian credible interval 22,400–45,000). CONCLUSIONS: The introduction of SSRIs in 1988 has been temporally associated with a substantial reduction in the number of suicides. This effect may have been more apparent in the female population, whom we postulate might have particularly benefited from SSRI treatment. While these types of data cannot lead to conclusions on causality, we suggest here that in the context of untreated depression being the major cause of suicide, antidepressant treatment could have had a contributory role in the reduction of suicide rates in the period 1988–2002.
Evolutionary distances in the twilight zone -- a rational kernel approach
Phylogenetic tree reconstruction is traditionally based on multiple sequence
alignments (MSAs) and heavily depends on the validity of this information
bottleneck. With increasing sequence divergence, the quality of MSAs decays
quickly. Alignment-free methods, on the other hand, are based on abstract
string comparisons and avoid potential alignment problems. However, in general
they are not biologically motivated and ignore our knowledge about the
evolution of sequences. Thus, it is still a major open question how to define
an evolutionary distance metric between divergent sequences that makes use of
indel information and known substitution models without the need for a multiple
alignment. Here we propose a new evolutionary distance metric to close this
gap. It uses finite-state transducers to create a biologically motivated
similarity score which models substitutions and indels, and does not depend on
a multiple sequence alignment. The sequence similarity score is defined in
analogy to pairwise alignments and additionally has the positive semi-definite
property. We describe its derivation and show in simulation studies and
real-world examples that it is more accurate in reconstructing phylogenies than
competing methods. The result is a new and accurate way of determining
evolutionary distances in and beyond the twilight zone of sequence alignments
that is suitable for large datasets.
Comment: to appear in PLoS ON
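The abstract notes that the similarity score is positive semi-definite. One standard consequence of that property is that any such score k induces a (pseudo)distance via d(a, b) = sqrt(k(a, a) + k(b, b) - 2 k(a, b)). The sketch below illustrates only that general construction. It does not implement the authors' finite-state transducer score, and the abstract does not say whether their distance is derived this way. The k-mer spectrum kernel is a stand-in chosen because it is simple and known to be positive semi-definite, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Stand-in PSD similarity: inner product of k-mer count vectors.

    Assumption: this is NOT the transducer-based score from the abstract.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(n * cb[kmer] for kmer, n in ca.items())


def kernel_distance(a, b, kernel=spectrum_kernel):
    """Distance induced by a PSD kernel: sqrt(k(a,a) + k(b,b) - 2*k(a,b))."""
    squared = kernel(a, a) + kernel(b, b) - 2 * kernel(a, b)
    # Clamp at zero in case a floating-point kernel rounds slightly negative.
    return sqrt(max(squared, 0.0))
```

Because the result is a Euclidean distance in the kernel's feature space, it is symmetric and satisfies the triangle inequality. Two different sequences with identical k-mer counts would still get distance zero under this stand-in kernel.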
Bayesian modeling of recombination events in bacterial populations
Background: We consider the discovery of recombinant segments jointly with their origins within multilocus DNA sequences from bacteria representing heterogeneous populations of fairly closely related species. The currently available methods for recombination detection capable of probabilistic characterization of uncertainty have limited applicability in practice as the number of strains in a data set increases.
Results: We introduce a Bayesian spatial structural model representing the continuum of origins over sites within the observed sequences, including a probabilistic characterization of uncertainty related to the origin of any particular site. To enable a statistically accurate and practically feasible approach to the analysis of large-scale data sets representing a single genus, we have developed a novel software tool (BRAT, Bayesian Recombination Tracker) implementing the model and the corresponding learning algorithm, which is capable of identifying the posterior optimal structure and estimating the marginal posterior probabilities of putative origins over the sites.
Conclusion: A multitude of challenging simulation scenarios and an analysis of real data from seven housekeeping genes of 120 strains of genus Burkholderia are used to illustrate the possibilities offered by our approach. The software is freely available for download at URL http://web.abo.fi/fak/mnf//mate/jc/software/brat.html
The Dawn of Open Access to Phylogenetic Data
The scientific enterprise depends critically on the preservation of and open
access to published data. This basic tenet applies acutely to phylogenies
(estimates of evolutionary relationships among species). Increasingly,
phylogenies are estimated from increasingly large, genome-scale datasets using
increasingly complex statistical methods that require increasing levels of
expertise and computational investment. Moreover, the resulting phylogenetic
data provide an explicit historical perspective that critically informs
research in a vast and growing number of scientific disciplines. One such use
is the study of changes in rates of lineage diversification (speciation -
extinction) through time. As part of a meta-analysis in this area, we sought to
collect phylogenetic data (comprising nucleotide sequence alignment and tree
files) from 217 studies published in 46 journals over a 13-year period. We
document our attempts to procure those data (from online archives and by direct
request to corresponding authors), and report results of analyses (using
Bayesian logistic regression) to assess the impact of various factors on the
success of our efforts. Overall, complete phylogenetic data for ~60% of these
studies are effectively lost to science. Our study indicates that phylogenetic
data are more likely to be deposited in online archives and/or shared upon
request when: (1) the publishing journal has a strong data-sharing policy; (2)
the publishing journal has a higher impact factor; and (3) the data are
requested from faculty rather than students. Although the situation appears
dire, our analyses suggest that it is far from hopeless: recent initiatives by
the scientific community -- including policy changes by journals and funding
agencies -- are improving the state of affairs.
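For readers unfamiliar with the quantity named in the abstract, the net rate of lineage diversification is the difference between the speciation and extinction rates. As a minimal illustration, assuming a constant-rate birth-death process (a simplification; the meta-analysis concerns rates that change through time), and with the symbols $\lambda$, $\mu$, $r$ and $N_0$ introduced here for illustration only:

```latex
% Assumed notation: \lambda = speciation rate, \mu = extinction rate,
% r = net diversification rate, N_0 = number of lineages at time 0.
\[
  r = \lambda - \mu,
  \qquad
  \mathbb{E}[N(t)] = N_0 \, e^{(\lambda - \mu)\,t}
\]
```

Under this assumption the expected number of lineages grows when $\lambda > \mu$ and declines when $\lambda < \mu$.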
Phylogeography of Japanese encephalitis virus: genotype is associated with climate
The circulation of vector-borne zoonotic viruses is largely determined by the overlap in the geographical distributions of virus-competent vectors and reservoir hosts. What is less clear are the factors influencing the distribution of virus-specific lineages. Japanese encephalitis virus (JEV) is the most important etiologic agent of epidemic encephalitis worldwide, and is primarily maintained between vertebrate reservoir hosts (avian and swine) and culicine mosquitoes. There are five genotypes of JEV: GI-V. In recent years, GI has displaced GIII as the dominant JEV genotype and GV has re-emerged after almost 60 years of undetected virus circulation. JEV is found throughout most of Asia, extending from maritime Siberia in the north to Australia in the south, and as far as Pakistan to the west and Saipan to the east. Transmission of JEV in temperate zones is epidemic with the majority of cases occurring in summer months, while transmission in tropical zones is endemic and occurs year-round at lower rates. To test the hypothesis that viruses circulating in these two geographical zones are genetically distinct, we applied Bayesian phylogeographic, categorical data analysis and phylogeny-trait association test techniques to the largest JEV dataset compiled to date, representing the envelope (E) gene of 487 isolates collected from 12 countries over 75 years. We demonstrated that GIII and the recently emerged GI-b are temperate genotypes likely maintained year-round in northern latitudes, while GI-a and GII are tropical genotypes likely maintained primarily through mosquito-avian and mosquito-swine transmission cycles. This study represents a new paradigm directly linking viral molecular evolution and climate.