18 research outputs found

    Bayesian phylogeography of influenza A/H3N2 for the 2014-15 season in the United States using three frameworks of ancestral state reconstruction

    abstract: Ancestral state reconstructions in Bayesian phylogeography of virus pandemics have been improved by utilizing a Bayesian stochastic search variable selection (BSSVS) framework. Recently, this framework has been extended to model the transition rate matrix between discrete states as a generalized linear model (GLM) of genetic, geographic, demographic, and environmental predictors of interest to the virus and incorporating BSSVS to estimate the posterior inclusion probabilities of each predictor. Although the latter appears to enhance the biological validity of ancestral state reconstruction, there has yet to be a comparison of phylogenies created by the two methods. In this paper, we compare these two methods, while also using a primitive method without BSSVS, and highlight the differences in phylogenies created by each. We test six coalescent priors and six random sequence samples of H3N2 influenza during the 2014–15 flu season in the U.S. We show that the GLMs yield significantly greater root state posterior probabilities than the two alternative methods under five of the six priors, and significantly greater Kullback-Leibler divergence values than the two alternative methods under all priors. Furthermore, the GLMs strongly implicate temperature and precipitation as driving forces of this flu season and nearly unanimously identified a single root state, which exhibits the most tropical climate during a typical flu season in the U.S. The GLM, however, appears to be highly susceptible to sampling bias compared with the other methods, which casts doubt on whether its reconstructions should be favored over those created by alternate methods. 
We report that a BSSVS approach with a Poisson prior demonstrates less bias toward sample size under certain conditions than the GLMs or primitive models, and believe that the connection between reconstruction method and sampling bias warrants further investigation. The article is published at http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.100538
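The Kullback-Leibler comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes the divergence is measured from a uniform prior over the discrete root states; the abstract does not state the reference distribution, so that choice, the function name, and the example posteriors are all hypothetical.

```python
import math

def kl_from_uniform(posterior):
    """KL divergence (in nats) of a root-state posterior from a uniform prior.

    Assumption: the reference distribution is uniform over the K states.
    """
    total = sum(posterior)
    if total <= 0:
        raise ValueError("posterior must have positive total mass")
    k = len(posterior)
    kl = 0.0
    for p in posterior:
        p = p / total  # renormalise to guard against rounding in the input
        if p > 0:      # terms with p == 0 contribute nothing
            kl += p * math.log(p * k)  # p * log(p / (1/k))
    return kl

# Hypothetical root-state posteriors over four discrete locations
diffuse = [0.25, 0.25, 0.25, 0.25]  # no information gained over the prior
peaked = [0.97, 0.01, 0.01, 0.01]   # one location strongly favoured

print(kl_from_uniform(diffuse))
print(kl_from_uniform(peaked))
```

A larger value indicates a root-state posterior that departs further from the prior, which is the sense in which a reconstruction is said to be more decisive.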

    Generalized Linear Models in Bayesian Phylogeography

    abstract: Bayesian phylogeography is a framework that has enabled researchers to model the spatiotemporal diffusion of pathogens. In general, the framework assumes that discrete geographic sampling traits follow a continuous-time Markov chain process along the branches of an unknown phylogeny that is informed through nucleotide sequence data. Recently, this framework has been extended to model the transition rate matrix between discrete states as a generalized linear model (GLM) of predictors of interest to the pathogen. In this dissertation, I focus on these GLMs, describe their capabilities and limitations, and introduce a pipeline that may enable more researchers to utilize this framework. I first demonstrate how a GLM can be employed and how the support for the predictors can be measured using influenza A/H5N1 in Egypt as an example. Secondly, I compare the GLM framework to two alternative frameworks of Bayesian phylogeography: one that uses an advanced computational technique and one that does not. For this assessment, I model the diffusion of influenza A/H3N2 in the United States during the 2014-15 flu season with five methods encapsulated by the three frameworks. I summarize metrics of the phylogenies created by each and demonstrate their reproducibility by performing analyses on several random sequence samples under a variety of population growth scenarios. Next, I demonstrate how discretization of the location trait for a given sequence set can influence phylogenies and support for predictors. That is, I perform several GLM analyses on a set of sequences and change how the sequences are pooled, then show how aggregating predictors at four levels of spatial resolution will alter posterior support. Finally, I provide a solution for researchers who wish to use the GLM framework but may be deterred by the tedious file-manipulation requirements that must be completed to do so.
My pipeline, which is publicly available, should alleviate concerns pertaining to the difficulty and time-consuming nature of creating the files necessary to perform GLM analyses. This dissertation expands the knowledge of Bayesian phylogeographic GLMs and will facilitate the use of this framework, which may ultimately reveal the variables that drive the spread of pathogens. Dissertation/Thesis, Doctoral Dissertation, Biomedical Informatics, 201
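To make the idea of a GLM on the transition rate matrix concrete, the following is a minimal sketch. It assumes a log-linear form in which the log of each off-diagonal rate is a sum of coefficient times inclusion indicator times predictor value; the abstract does not give the parameterization, so this form, the function name, and the toy distance predictor are assumptions for illustration only.

```python
import math

def glm_rates(predictors, betas, deltas):
    """Build a CTMC rate matrix from pairwise predictors.

    predictors: list of K matrices (n x n nested lists), one per predictor
    betas:      K regression coefficients
    deltas:     K inclusion indicators (0 or 1), as sampled under BSSVS
    Assumed form: log(rate[i][j]) = sum_k betas[k] * deltas[k] * predictors[k][i][j]
    """
    n = len(predictors[0])
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            log_rate = sum(b * d * x[i][j]
                           for b, d, x in zip(betas, deltas, predictors))
            rates[i][j] = math.exp(log_rate)
        # Diagonal is still 0.0 here, so the row sum covers off-diagonals only
        rates[i][i] = -sum(rates[i])
    return rates

# Hypothetical log-transformed, standardised distances between three locations
distance = [[0.0, -1.0, 1.0],
            [-1.0, 0.0, 0.5],
            [1.0, 0.5, 0.0]]

# A negative coefficient means that more distant pairs exchange lineages more slowly
Q = glm_rates([distance], betas=[-0.8], deltas=[1])
for row in Q:
    print([round(v, 3) for v in row])
```

Setting an indicator to 0 removes that predictor from every rate at once, which is what allows the posterior mean of the indicator to be read as an inclusion probability.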

    Exploring the phylodynamics, genetic reassortment and RNA secondary structure formation patterns of orthomyxoviruses by comparative sequence analysis

    RNA viruses are among the most virulent microorganisms that threaten the health of humans and livestock. Among the most socio-economically important of the known RNA viruses are those found in the family Orthomyxoviridae. In this era of rapid low-cost genome sequencing and advancements in computational biology techniques, many previously difficult research questions relating to the molecular epidemiology and evolutionary dynamics of these viruses can now be answered with ease. Using sequence data together with associated meta-data, in chapter two of this dissertation I tested the hypothesis that the Influenza A/H1N1 2009 pandemic virus was introduced multiple times into Africa, and subsequently dispersed heterogeneously across the continent. I further tested to what degree factors such as road distances and air travel distances impacted the observed pattern of spread of this virus in Africa using a generalised linear model-based approach. The results suggested that there were multiple simultaneous introductions of 2009 pandemic A/H1N1 into Africa, and that geographical distance and human mobility through air travel played an important role in its dissemination. In chapter three, I set out to test two hypotheses: (1) that there is no difference in the frequency of reassortments among the segments that constitute influenza virus genomes; and (2) that there is epochal temporal reassortment among influenza viruses and that all geographical regions are equally likely sources of epidemiologically important influenza virus reassortant lineages. The findings suggested that surface segments are more frequently exchanged than internal genes and that North America/Asia, Oceania, and Asia could be the most likely source locations for reassortant Influenza A, B and C virus lineages respectively.
In chapter four of this thesis, I explored the formation of RNA secondary structures within the genomes of orthomyxoviruses belonging to five genera: Influenza A, B and C, Infectious Salmon Anaemia Virus and Thogotovirus using in silico RNA folding predictions and additional molecular evolution and phylogenetic tests to show that structured regions may be biologically functional. The presence of some conserved structures across the five genera is likely a reflection of the biological importance of these structures, warranting further investigation regarding their role in the evolution and possible development of antiviral resistance. The studies herein demonstrate that pathogen genomics-based analytical approaches are useful both for understanding the mechanisms that drive the evolution and spread of rapidly evolving viral pathogens such as orthomyxoviruses, and for illuminating how these approaches could be leveraged to improve the management of these pathogens

    The phylogeography, epidemiology and determinants of Maize streak virus dispersal across Africa and the adjacent Indian Ocean Islands

    Magister Scientiae - MSc. Maize streak disease (MSD), caused by variants of the Maize streak virus (MSV) A strain, is the world's third and Africa’s most important maize foliar disease. Outbreaks of the disease occur frequently and in an erratic fashion across Africa and islands in the Indian Ocean, causing devastating yield losses such that the emergence, resurgence and rapid diffusion of MSV-A variants in this region presents a serious threat to maize production, farmer livelihoods and food security. To complement current MSD management systems, a total of 689 MSV-A full genomes sampled over a 32 year period (1979-2011) from 20 countries across Africa and the adjacent Indian Ocean Islands, 286 of which were novel, were used to: (i) estimate the levels of genetic diversity using MEGA and the Sequence Demarcation Tool v1.2 (SDT); (ii) estimate the times of occurrence and distribution of recombination using the recombination detection program (RDP v.4) and the genetic algorithm for recombination detection (GARD); (iii) estimate selection pressure on codon positions using PARRIS and FUBAR methods implemented on the DATAMONKEY web server; (iv) reconstruct the history of spatio-temporal diffusion for MSV-A using the discrete phylogeographic models implemented in BEAST v1.8.1; (v) characterize source-sink dynamics and identify predictor variables driving MSV-A dispersal using the generalized linear models, again implemented in BEAST v1.8.1. Isolates used displayed low levels of genetic diversity (0.017 mean pairwise distance and ≥ 98% nucleotide sequence identities), and a well-structured geographical distribution where all of the 233 novel isolates clustered together with the -A1 strains. A total of 34 MSV inter-strain recombination events and 33 MSV-A intra-strain recombination events, 15 of which have not been reported in previous analyses (Owor et al., 2007, Varsani et al., 2008 and Monjane et al., 2011), were detected.
The majority of intra-strain MSV-A recombination events detected were inferred to have occurred within the last six decades, the oldest and most conserved of these being events 19, 26 and 28 whereas the most recent events were 8, 16, 17, 21, 23, and 29. Intra-strain recombination events 20, 25 and 33 were widely distributed amongst East African MSV-A samples, whereas events 16, 21 and 23 occurred more frequently within West African MSV-A samples. Events 1, 4, 8, 10, 14, 17, 19, 22, 24, 25, 26, 28, and 29 were more widely distributed across East, West and Southern Africa and the adjacent Indian Ocean Islands. Whereas codon positions 12 and 19 within motif I in the coat protein transcript, and four out of the seven codon positions (147, 166, 195, 203, 242, 262, 267) in the Rep transcript (codons 195 and 203 in the Rb motif and codons 262 and 267 in site B of motif IV), evolved under strong positive selection pressure, those in the movement protein (MP) and RepA protein encoding genes evolved neutrally and under negative selection pressure respectively. Phylogeographic analyses revealed that MSV-A first emerged in Zimbabwe around 1938 (95% HPD 1904 - 1956), and its dispersal across Africa and the adjacent Indian Ocean Islands was achieved through approximately 34 migration events, 19 of which were statistically supported using Bayes factor (BF) tests. The higher than previously reported mean nucleotide substitution rate [9.922 × 10⁻⁴ (95% HPD 8.54 × 10⁻⁴ to 1.1317 × 10⁻³) substitutions per site per year] for the full genome recombination-free MSV-A dataset H estimated was possibly a result of high nucleotide substitution rates being conserved among geminiviruses such as MSV as previously suggested. Persistence of MSV-A was highest in source locations that include Zimbabwe, followed by South Africa, Uganda, and Kenya.
These locations were characterized by high average annual precipitation; moderately high average annual temperatures; high seasonal changes; high maize yield; high prevalence of undernourishment; low trade imports and exports; high GDP per capita; low vector control pesticide usage; high percentage forest land area; low percentage arable land; high population densities, and were in close proximity to sink locations. Dispersal of MSV-A was frequent between locations that received high average annual rainfall, had high percentage forest land area, occupied high latitudes and experienced similar climatic seasons, had high GDP per capita and had balanced maize import to export ratios, and were in close geographical proximity. National Research Foundation (NRF), the Poliomyelitis Research Foundation (PRF), and the Thuthuka Boar
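The Bayes factor tests used to assess support for migration events can be sketched as a ratio of posterior odds to prior odds for a rate's inclusion indicator. This is a generic illustration rather than the thesis's exact procedure: the function name is hypothetical, and the prior inclusion probability in the example is an assumed value, since the abstract does not report one.

```python
import math

def inclusion_bayes_factor(posterior_prob, prior_prob):
    """Bayes factor for including a migration rate (or a predictor).

    posterior_prob: fraction of posterior samples in which the indicator is 1
    prior_prob:     prior probability that the indicator is 1 (assumed known)
    """
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    if not 0.0 <= posterior_prob <= 1.0:
        raise ValueError("posterior_prob must lie between 0 and 1")
    if posterior_prob == 1.0:
        return math.inf  # indicator present in every sample
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds

# Hypothetical example: a rate included in 90% of samples under a 10% prior
print(inclusion_bayes_factor(0.90, 0.10))
```

The cut-off above which a Bayes factor counts as support is a convention chosen by the analyst, so it is left out of the sketch.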

    The landscape epidemiology of canine rabies virus in Tanzania

    Infectious diseases pose a significant threat to animal and human health across the globe, with much of the burden falling on low-income countries. Despite efforts to control many of these diseases, very few have ever been eradicated. Their dynamics are often embedded in complex, heterogeneous landscapes defined by interacting population and landscape level processes. As such, landscape heterogeneity plays a key role in driving disease transmission and persistence. Incorporating landscape heterogeneity in studies of pathogen dynamics is challenging but the accessibility of data, particularly next generation sequencing data, has opened new avenues of research. Landscape epidemiology involves using an integrated approach to understand spatial patterns of disease, using methods that combine landscape genetics, ecology and epidemiology. In this thesis I use these integrative methods to determine the underlying mechanisms facilitating the spread and persistence of canine rabies virus in Tanzania. Whole genome level characterisation of rabies virus samples was achieved and used in combination with cutting-edge inference techniques to explore spatial patterns of rabies at different spatial scales. Phylogeographic patterns were able to characterise spatial scales of endemic rabies transmission in Tanzania, uncovering strong viral population structure at sub-continental levels with evidence of a more fluid dispersal dynamic at local (less than 100 km² area) spatial scales. Within-country phylogeographic patterns revealed large regional movements within Tanzania that could be attributed to human-mediated movements and revealed the presence of multiple co-circulating lineages within a single administrative district. Finely resolved incidence data from the Serengeti District complemented with whole genome sequences enabled the exploration of local scales of transmission in more detail.
By extending phylogeographic diffusion models to incorporate landscape heterogeneity I was able to uncover evidence supporting landscape predictors of rabies diffusion. While much of the spatial structure was attributable to the effects of isolation by distance, landscape predictors had discernible effects on diffusion. In particular, rivers appeared to act as a barrier to dispersal and road networks facilitated diffusion. Importantly, I also found evidence that vaccination acts as resistance to diffusion and is therefore an effective control measure for canine rabies in the Serengeti District. As a complementary approach a space-time-genetic algorithm was used to determine who-infected-whom in the Serengeti District. The model explicitly accounted for the possibility of exogenous sources of infection and for genetic data being available for only a proportion of samples. Direct transmission events were estimated between 42% of observed cases and highlighted the co-circulation of two major lineages in both time and space. Direct transmission events predominantly occurred over very small distances, less than 1 km, but a large proportion of cases had unobserved sources that could represent transmission from dogs in neighbouring regions or larger indirect transmission events. A future development of the model is to delineate between these possibilities to assess the true contribution of exogenous sources to the system dynamic. Ultimately these integrative models are at an early stage of development but highlight the power of genetic data to delineate fine-scale transmission patterns. The results from this thesis suggest that landscape features such as rivers could be exploited as barriers in step-wise vaccination campaigns and highlight the utility of genetic surveillance to monitor control and elimination as rabies management progresses.

    Population genomic analysis of bacterial pathogen niche adaptation

    Globally disseminated bacterial pathogens frequently cause epidemics that are of major importance in public health. Of particular significance is the capacity for some of these bacteria to switch into a new environment leading to the emergence of pathogenic clones. Understanding the evolution and epidemiology of such pathogens is essential for designing rational ways for prevention, diagnosis and treatment of the diseases they cause. Whole-genome sequencing of multiple isolates facilitating comparative genomics and phylogenomic analyses provides high-resolution insights, which are revolutionizing our understanding of infectious diseases. In this thesis, a range of population genomic analyses are employed to study the molecular mechanisms and the evolutionary dynamics of bacterial pathogen niche adaptation, specifically between humans, animals and the environment. A large-scale population genomic approach was used to provide a global perspective of the host-switching events that have defined the evolution of Staphylococcus aureus in the context of its host-species. To investigate the genetic basis of host-adaptation, we performed genome-wide association analysis, revealing an array of accessory genes linked to S. aureus host-specificity. In addition, positive selection analysis identified biological pathways encoded in the core genome that are under diversifying selection in different host-species, suggesting a role in host-adaptation. These findings provide a high-resolution view of the evolutionary landscape of a model multi-host pathogen and its capacity to undergo changes in host ecology by genetic adaptation. To further explore S. aureus host-adaptive evolution, we examined the population dynamics of this pathogen after a simulated host-switch event. S. aureus strains of human origin were used to infect the mammary glands of sheep, and bacteria were passaged in multiple animals to simulate onward transmission events. 
Comparative genomics of passaged isolates allowed us to characterize the genetic changes acquired during the early stages of evolution in a novel host-species. Co-infection experiments using progenitor and passaged strains indicated that accumulated mutations contributed to enhanced fitness, indicating adaptation. Within-host population genomic analysis revealed the existence of population bottlenecks associated with transmission and establishment of infection in new hosts. Computational simulations of evolving genomes under regular bottlenecks supported that the fitness gain of beneficial mutations is high enough to overcome genetic drift and sweep through the population. Overall, these data provide new information relating to the critical early events associated with adaptation to novel host-species. Finally, population genomics was used to study the total diversity of Legionella longbeachae from patient and environmental sources and to investigate the epidemiology of a L. longbeachae outbreak in Scotland. We analysed the genomes of isolates from a cluster of legionellosis cases linked to commercial growing media in Scotland and of non-outbreak-associated strains from this and other countries. Extensive genetic diversity across the L. longbeachae species was identified, associated with intraspecies and interspecies gene flow, and a wide geographic distribution of closely related genotypes. Of note, a highly diverse pool of L. longbeachae genotypes within compost samples that precluded the genetic establishment of an infection source was observed. These data represent a view of the genomic diversity of this pathogen that will inform strategies for investigating future outbreaks. 
Overall, our findings demonstrate the application of population genomics to understand the molecular mechanisms and the evolutionary dynamics of bacterial adaptation to different ecological niches, and provide new insights relevant to other major bacterial pathogens with the capacity to spread between environments

    The Anthropology of Epidemics

    Over the past decades, infectious disease epidemics have come to increasingly pose major global health challenges to humanity. The Anthropology of Epidemics approaches epidemics as total social phenomena: processes and events which encompass and exercise a transformational impact on social life whilst at the same time functioning as catalysts of shifts and ruptures as regards human/non-human relations. Bearing a particular mark on subject areas and questions which have recently come to shape developments in anthropological thinking, the volume brings epidemics to the forefront of anthropological debate, as an exemplary arena for social scientific study and analysis