1,408 research outputs found

    Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees

    Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. Comment: 28 pages, 11 figures, 3 tables
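The link between interior-node host assignments and transmission trees described in this abstract can be illustrated with a toy enumeration. This is only a sketch under strong simplifying assumptions that are not taken from the paper: one sampled pathogen per host, a binary phylogeny, no epidemiologic constraints, and each interior node hosted by the host at the root of one of its two child subtrees. The function name and the tuple-based tree encoding are hypothetical.

```python
from itertools import product


def enumerate_host_assignments(tree):
    """Yield (root_host, transmissions) for each interior-node host assignment.

    `tree` is a leaf (a string naming the known host of a sampled pathogen)
    or a pair (left, right) of subtrees.  Assumption for this sketch only:
    an interior node is hosted by the root host of one of its child subtrees,
    and every host appears on exactly one leaf.
    """
    if isinstance(tree, str):
        # A leaf has a known host and implies no transmission event.
        yield tree, frozenset()
        return
    left, right = tree
    for (lh, lt), (rh, rt) in product(
        enumerate_host_assignments(left), enumerate_host_assignments(right)
    ):
        # Interior node hosted by lh: lh is read as the infector of rh.
        yield lh, lt | rt | {(lh, rh)}
        # Interior node hosted by rh: rh is read as the infector of lh.
        yield rh, lt | rt | {(rh, lh)}


if __name__ == "__main__":
    toy_tree = (("A", "B"), "C")
    for root_host, transmissions in enumerate_host_assignments(toy_tree):
        print(root_host, sorted(transmissions))
```

In this toy setting each assignment yields a distinct set of infector-infectee pairs; the paper's algorithms additionally restrict the assignments to those consistent with the epidemiologic data, which this sketch does not attempt.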

    Simultaneous reconstruction of evolutionary history and epidemiological dynamics from viral sequences with the birth-death SIR model

    The evolution of RNA viruses such as HIV, hepatitis C virus and influenza virus occurs so rapidly that the viruses' genomes contain information on past ecological dynamics. Hence, we develop a phylodynamic method that enables the joint estimation of epidemiological parameters and phylogenetic history. Based on a compartmental susceptible-infected-removed (SIR) model, this method provides separate information on incidence and prevalence of infections. Detailed information on the interaction of host population dynamics and evolutionary history can inform decisions on how to contain or entirely avoid disease outbreaks. We apply our Birth-Death SIR method (BDSIR) to two viral data sets. First, five human immunodeficiency virus type 1 clusters sampled in the United Kingdom between 1999 and 2003 are analyzed. The estimated basic reproduction ratios range from 1.9 to 3.2 among the clusters. All clusters show a decline in the growth rate of the local epidemic in the middle or end of the 1990s. The analysis of a hepatitis C virus (HCV) genotype 2c data set shows that the local epidemic in the Córdoban city Cruz del Eje originated around 1906 (median), coinciding with an immigration wave from Europe to central Argentina that dates from 1880–1920. The estimated time of epidemic peak is around 1970. Comment: Journal link: http://rsif.royalsocietypublishing.org/content/11/94/20131106.ful

    Transmission analysis of a large tuberculosis outbreak in London: a mathematical modelling study using genomic data

    Outbreaks of tuberculosis (TB), such as the large isoniazid-resistant outbreak centred on London, UK, which originated in 1995, provide excellent opportunities to model transmission of this devastating disease. Transmission chains for TB are notoriously difficult to ascertain, but mathematical modelling approaches, combined with whole-genome sequencing data, have strong potential to contribute to transmission analyses. Using such data, we aimed to reconstruct transmission histories for the outbreak using a Bayesian approach, and to use machine-learning techniques with patient-level data to identify the key covariates associated with transmission. By using our transmission reconstruction method that accounts for phylogenetic uncertainty, we are able to identify 21 transmission events with reasonable confidence, 9 of which have zero SNP distance, with a maximum distance of 3. Patient age, alcohol abuse and history of homelessness were found to be the most important predictors of being credible TB transmitters.
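For readers unfamiliar with the SNP distances quoted in this abstract, the sketch below counts differing positions between two aligned sequences. It is an illustration only: the function name is hypothetical, and the choice to skip positions with gaps or ambiguous bases is an assumption, not a description of the pipeline used in the study.

```python
def snp_distance(seq_a, seq_b, valid="ACGT"):
    """Count positions where two aligned sequences carry different bases.

    Assumption for this sketch: positions where either sequence has a gap
    or an ambiguous base (anything outside `valid`) are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


if __name__ == "__main__":
    # Two made-up aligned fragments that differ at two unambiguous sites.
    print(snp_distance("ACGTACGT", "ACGAACGC"))
```

A distance of zero between two patients' isolates means no differences were observed at the compared sites; it does not by itself establish direct transmission.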

    Bayesian inference of sampled ancestor trees for epidemiology and fossil calibration

    Phylogenetic analyses which include fossils or molecular sequences that are sampled through time require models that allow one sample to be a direct ancestor of another sample. As previously available phylogenetic inference tools assume that all samples are tips, they do not allow for this possibility. We have developed and implemented a Bayesian Markov Chain Monte Carlo (MCMC) algorithm to infer what we call sampled ancestor trees, that is, trees in which sampled individuals can be direct ancestors of other sampled individuals. We use a family of birth-death models where individuals may remain in the tree process after sampling; in particular, we extend the birth-death skyline model [Stadler et al., 2013] to sampled ancestor trees. This method allows the detection of sampled ancestors as well as estimation of the probability that an individual will be removed from the process when it is sampled. We show that sampled ancestor birth-death models where all samples come from different time points are non-identifiable and thus require one parameter to be known in order to infer other parameters. We apply this method to epidemiological data, where the possibility of sampled ancestors enables us to identify individuals that infected other individuals after being sampled and to infer fundamental epidemiological parameters. We also apply the method to infer divergence times and diversification rates when fossils are included among the species samples, so that fossilisation events are modelled as a part of the tree branching process. Such modelling has many advantages, as argued in the literature. The sampler is available as an open-source BEAST2 package (https://github.com/gavryushkina/sampled-ancestors). Comment: 34 pages (including Supporting Information), 8 figures, 1 table. Part of the work presented at Epidemics 2013 and The 18th Annual New Zealand Phylogenomics Meeting, 201

    Reconstructing transmission trees for communicable diseases using densely sampled genetic data.

    Whole genome sequencing of pathogens from multiple hosts in an epidemic offers the potential to investigate who infected whom with unparalleled resolution, potentially yielding important insights into disease dynamics and the impact of control measures. We considered disease outbreaks in a setting with dense genomic sampling, and formulated stochastic epidemic models to investigate person-to-person transmission, based on observed genomic and epidemiological data. We constructed models in which the genetic distance between sampled genotypes depends on the epidemiological relationship between the hosts. A data augmented Markov chain Monte Carlo algorithm was used to sample over the transmission trees, providing a posterior probability for any given transmission route. We investigated the predictive performance of our methodology using simulated data, demonstrating high sensitivity and specificity, particularly for rapidly mutating pathogens with low transmissibility. We then analyzed data collected during an outbreak of methicillin-resistant Staphylococcus aureus in a hospital, identifying probable transmission routes and estimating epidemiological parameters. Our approach overcomes limitations of previous methods, providing a framework with the flexibility to allow for unobserved infection times, multiple independent introductions of the pathogen, and within-host genetic diversity, as well as allowing forward simulation. Funding received from the following: The European Community [Mastering Hospital Antimicrobial Resistance (MOSAR) network contract LSHP-CT-2007-037941]. The National Institute of General Medical Sciences of the National Institutes of Health under award number U54GM088558. The UK Medical Research Council (Unit Programme number U105260566). The UKCRC Translational Infection Research Initiative (MRC Grant number G1000803) and Public Health England. The Medical Research Council and Department for International Development (Grant number MR/K006924/1). The Mahidol Oxford Tropical Medicine Research Unit is part of the Wellcome Trust Major Overseas Programme in SE Asia (Grant number 106698/Z/14/Z). This is the final version of the article. It first appeared from the Institute of Mathematical Statistics via http://dx.doi.org/10.1214/15-AOAS89

    Reconstructing the recent West Nile virus lineage 2 epidemic in Europe and Italy using discrete and continuous phylogeography

    West Nile virus lineage 2 (WNV-2) was mainly confined to sub-Saharan Africa until the early 2000s, when it was identified for the first time in Central Europe causing outbreaks of human and animal infection. The aim of this study was to reconstruct the origin and dispersion of WNV-2 in Central Europe and Italy on a phylodynamic and phylogeographical basis. To this end, discrete and continuous space phylogeographical models were applied to a total of 33 newly characterised full-length viral genomes obtained from mosquitoes, birds and humans in Northern Italy in the years 2013-2015, aligned with 64 complete sequences isolated mainly in Europe. The European isolates segregated into two highly significant clades: a small one including three sequences and a large clade including the majority of isolates obtained in Central Europe since 2004. Discrete phylogeographical analysis showed that the most probable location of the root of the largest European clade was in Hungary, a mean of 12.78 years ago. The European clade bifurcated into two highly supported subclades: one including most of the Central/East European isolates and the other encompassing all of the isolates obtained in Greece. The continuous space phylogeographical analysis of the Italian clade showed that WNV-2 entered Italy in about 2008, probably by crossing the Adriatic Sea and reaching a central area of the Po Valley. The epidemic then spread simultaneously eastward, to reach the region of the Po delta in 2013, and westward to the border area between Lombardy and Piedmont in 2014; later, the western strain changed direction southward, and reached the central area of the Po Valley once again in 2015. Over a period of about seven years, the virus spread all over an area of northern Italy by following the Po river and its main tributaries.

    Spatiotemporal reconstruction and transmission dynamics during the 2016-17 H5N8 highly pathogenic avian influenza epidemic in Italy

    Effective control of avian diseases in domestic populations requires understanding of the transmission dynamics facilitating viral emergence and spread. In 2016–17, Italy experienced a significant avian influenza epidemic caused by a highly pathogenic A(H5N8) virus, which affected domestic premises housing around 2.7 million birds, primarily in the north‐eastern regions with the highest density of poultry farms (Lombardy, Emilia‐Romagna and Veneto). We perform integrated analyses of genetic, spatiotemporal and host data within a Bayesian phylogenetic framework. Using continuous and discrete phylogeography, we estimate the locations of movements responsible for the spread and persistence of the epidemic. The information derived from these analyses on rates of transmission between regions through time can be used to assess the success of control measures. Using an approach based on phylogenetic–temporal distances between domestic cases, we infer the presence of cryptic wild bird‐mediated transmission, information that can be used to complement existing epidemiological methods for distinguishing transmission within the domestic population from incursions across the wildlife–domestic interface, a common challenge in veterinary epidemiology. Spatiotemporal reconstruction of the epidemic reveals a highly skewed distribution of virus movements with a high proportion of shorter distance local movements interspersed with occasional long‐distance dispersal events associated with wild birds. We also show how such inference can be used to identify possible instances of human‐mediated movements where distances between phylogenetically linked domestic cases are unusually high.

    Integrating viral RNA sequence and epidemiological data to define transmission patterns for respiratory syncytial virus

    The analyses contained herein focus on making comparisons between model inferences obtained using different scales of pathogen identification, with a particular focus on respiratory syncytial virus (RSV). A significant proportion of lower respiratory tract infections in children has been attributed to infection by RSV and, as such, there has been global interest in understanding its transmission characteristics in order to plan for effective control. Mathematical models have often been used to explore potential mechanisms that drive the patterns observed in data collected at different scales. Several models have been used to explore how immunity to RSV is acquired and maintained, vaccination strategies and potential drivers of seasonality. However, most of these models do not make a distinction between the two antigenically and genetically distinct RSV groups (RSV A and RSV B), nor do they consider its ecological environment, in particular, potential interactions between RSV and other viral pathogens. This thesis therefore presents work aimed at understanding the transmission characteristics of viral respiratory pathogens spreading in a group of households using a dynamic model of transmission. The data analysed is cohort data collected between December 2009 and June 2010 from 493 individuals distributed across 47 households from a rural coastal community in Kenya. Individuals in the study had nasopharyngeal swab samples collected twice weekly irrespective of symptom status. Infecting viral pathogens were identified using RT-PCR, resulting in the identification of 4 main pathogens: RSV, human coronavirus, rhinovirus and adenovirus. RSV and coronavirus were further classified according to genetically distinct subgroups. Some of the RSV samples were sequenced to obtain whole genome sequences (WGS) and further classified into genetic clades/clusters.
I first conducted a review of methods to identify the best way to integrate social-temporal data and WGS genetic data into a single modelling framework for RSV. Given that the social-temporal data and genetic data were available at different sampling densities, I decided to use a model that focused on the data with the highest density. The results in this thesis are thus presented in three main chapters: the first focuses on analysing social-temporal shedding patterns of RSV identified at the group level (i.e. distinguishing between RSV A and RSV B); the second incorporates the available genetic data into the model used to analyse the social-temporal data (i.e. separating RSV-A into 5 clusters, and RSV-B into 7 clusters); the third is an analysis of the interaction of two pathogens, RSV and coronavirus, identified at two different scales. One of the main findings in this thesis is that the household setting plays an important role in the spread of RSV, a finding that is made clearer with added detail on pathogen type. In the case of the data analysed here, and the social structuring from which it was collected, RSV clades appeared to mimic household structure; as such, identification at this level did not drastically change the transmission characteristics observed with identification at the group level. However, the combination of epidemiological and genetic data elucidated transmission chains within the household, enabling the identification of the sources of infant RSV infections. For this particular study, it was inferred that the sources of infant RSV infections were both in the same household as the infant and from external sources. Where infant infections occurred in the household, the source of infection was often a child between the ages of 2-13 years. It was inferred that previous infection with one RSV group type reduced susceptibility to re-infection by the heterologous group type within the same epidemic.
Interactions were also observed between RSV and human coronavirus groups. In particular, previous infection with RSV B was estimated to increase susceptibility to coronavirus OC43 by 81% (95% CrI: 40%, 134%). Detailed data of infection events in individual hosts can provide a wealth of knowledge. The inferences made from this study should be explored at larger spatial and temporal scales to determine the population level impact, and hence public-health significance, of pathogen interactions, whether these interactions are between strains of the same pathogen or between different pathogens. In planning for, and assessing the impact of, an intervention against a particular pathogen, investigators should not ignore the preexisting ecological balance and should make efforts to understand how this will be disrupted by an intervention against one or more pathogens.