    Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees

    Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. Comment: 28 pages, 11 figures, 3 tables

    Combining genomics and epidemiology to track mumps virus transmission in the United States.

    Unusually large outbreaks of mumps across the United States in 2016 and 2017 raised questions about the extent of mumps circulation and the relationship between these and prior outbreaks. We paired epidemiological data from public health investigations with analysis of mumps virus whole genome sequences from 201 infected individuals, focusing on Massachusetts university communities. Our analysis suggests continuous, undetected circulation of mumps locally and nationally, including multiple independent introductions into Massachusetts and into individual communities. Despite the presence of these multiple mumps virus lineages, the genomic data show that one lineage has dominated in the US since at least 2006. Widespread transmission was surprising given high vaccination rates, but we found no genetic evidence that variants arising during this outbreak contributed to vaccine escape. Viral genomic data allowed us to reconstruct mumps transmission links not evident from epidemiological data or standard single-gene surveillance efforts and also revealed connections between apparently unrelated mumps outbreaks

    Effects of memory on the shapes of simple outbreak trees

    Genomic tools, including phylogenetic trees derived from sequence data, are increasingly used to understand outbreaks of infectious diseases. One challenge is to link phylogenetic trees to patterns of transmission. Particularly in bacteria that cause chronic infections, this inference is affected by variable infectious periods and infectivity over time. It is known that non-exponential infectious periods can have substantial effects on pathogens’ transmission dynamics. Here we ask how this non-Markovian nature of an outbreak process affects the branching trees describing that process, with particular focus on tree shapes. We simulate Crump-Mode-Jagers branching processes and compare different patterns of infectivity over time. We find that memory (non-Markovian-ness) in the process can have a pronounced effect on the shapes of the outbreak’s branching pattern. However, memory also has a pronounced effect on the sizes of the trees, even when the duration of the simulation is fixed. When the sizes of the trees are constrained to a constant value, memory in our processes has little direct effect on tree shapes, but can bias inference of the birth rate from trees. We compare simulated branching trees to phylogenetic trees from an outbreak of tuberculosis in Canada, and discuss the relevance of memory to this dataset

    Processes underlying rabies virus incursions across US–Canada Border as revealed by whole-genome phylogeography

    Disease control programs aim to constrain and reduce the spread of infection. Human disease interventions such as wildlife vaccination play a major role in determining the limits of a pathogen’s spatial distribution. Over the past few decades, a raccoon-specific variant of rabies virus (RRV) has invaded large areas of eastern North America. Although expansion into Canada has been largely prevented through vaccination along the US border, several outbreaks have occurred in Canada. Applying phylogeographic approaches to 289 RRV whole-genome sequences derived from isolates collected in Canada and adjacent US states, we examined the processes underlying these outbreaks. RRV incursions were attributable predominantly to systematic virus leakage of local strains across areas along the border where vaccination has been conducted but also to single stochastic events such as long-distance translocations. These results demonstrate the utility of phylogeographic analysis of pathogen genomes for understanding transboundary outbreaks

    Integrated analysis of epidemiological and phylogenetic data to elucidate viral transmission dynamics

    While infectious disease outbreaks are often summarised by population averages such as the reproductive number, variation between individuals in terms of onwards transmissions modulates the degree of unpredictability of an epidemic, and it needs to be accounted for in models of infection control. This heterogeneity among individuals can be quantified by the dispersion parameter k of the offspring distribution, a distribution that defines the number of secondary infections per infected individual. I have developed an inference framework to estimate k and other epidemiological parameters by fitting stochastic transmission models to both incidence time series and the pathogen phylogeny. Applying the framework to simulated data, I found that more accurate, less biased and more precise estimates of the reproductive number and k were obtained by combining epidemiologic and phylogenetic analyses. Accurately estimating k was necessary for unbiased estimates of the reproductive number, but it did not affect the accurate estimation of epidemic start date and the probability of sampling an infection. I further demonstrated that inference was possible in the presence of phylogenetic uncertainty by sampling from the posterior distribution of phylogenies. In addition to methodological contributions, I found that the inclusion of sequences in statistical inference for polio improved the precision of parameter estimates. Based on sequences collected from patients during a poliovirus outbreak, the estimated values of k were high regardless of the data used. On the other hand, the k estimates were low when a transmission model was fit to environmental sequences collected in Pakistan, which is still endemic for wild poliovirus. Furthermore, analysis of environmental sequences was informative of seasonality parameters whereas inference from incidence time series alone was not. 
    This type of analysis using environmental sequences would be useful as polio eradication draws to a close as the number of symptomatic cases approaches zero.
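    The role of the dispersion parameter k can be illustrated with a small sketch. The abstract does not name the offspring distribution; the code below assumes the common negative binomial parameterisation with mean R (the reproductive number) and dispersion k, and the function names `offspring_pmf` and `offspring_variance` are hypothetical, chosen only for illustration. It is not the inference framework described in the thesis.

```python
import math


def offspring_pmf(x, R, k):
    """Probability of x secondary infections under an ASSUMED negative
    binomial offspring distribution with mean R and dispersion k."""
    log_p = (
        math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + R))
        + x * math.log(R / (k + R))
    )
    return math.exp(log_p)


def offspring_variance(R, k):
    """Variance of the same distribution: R + R^2 / k."""
    return R + R * R / k


if __name__ == "__main__":
    R = 2.0  # illustrative value, not an estimate from the thesis
    for k in (0.1, 1.0, 10.0):
        p0 = offspring_pmf(0, R, k)
        var = offspring_variance(R, k)
        print(f"k={k}: P(no onward transmission)={p0:.3f}, variance={var:.1f}")
```

    Under this assumption, a small k means that most infected individuals transmit to no one while a few cause many secondary infections; a large k brings the distribution close to a Poisson with mean R.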

    Reconstructing transmission trees for communicable diseases using densely sampled genetic data.

    Whole genome sequencing of pathogens from multiple hosts in an epidemic offers the potential to investigate who infected whom with unparalleled resolution, potentially yielding important insights into disease dynamics and the impact of control measures. We considered disease outbreaks in a setting with dense genomic sampling, and formulated stochastic epidemic models to investigate person-to-person transmission, based on observed genomic and epidemiological data. We constructed models in which the genetic distance between sampled genotypes depends on the epidemiological relationship between the hosts. A data augmented Markov chain Monte Carlo algorithm was used to sample over the transmission trees, providing a posterior probability for any given transmission route. We investigated the predictive performance of our methodology using simulated data, demonstrating high sensitivity and specificity, particularly for rapidly mutating pathogens with low transmissibility. We then analyzed data collected during an outbreak of methicillin-resistant Staphylococcus aureus in a hospital, identifying probable transmission routes and estimating epidemiological parameters. Our approach overcomes limitations of previous methods, providing a framework with the flexibility to allow for unobserved infection times, multiple independent introductions of the pathogen, and within-host genetic diversity, as well as allowing forward simulation. Funding received from the following: The European Community [Mastering Hospital Antimicrobial Resistance (MOSAR) network contract LSHP-CT-2007-037941]. The National Institute of General Medical Sciences of the National Institutes of Health under award number U54GM088558. The UK Medical Research Council (Unit Programme number U105260566). The UKCRC Translational Infection Research Initiative (MRC Grant number G1000803) and Public Health England. The Medical Research Council and Department for International Development (Grant number MR/K006924/1). The Mahidol Oxford Tropical Medicine Research Unit is part of the Wellcome Trust Major Overseas Programme in SE Asia (Grant number 106698/Z/14/Z). This is the final version of the article. It first appeared from the Institute of Mathematical Statistics via http://dx.doi.org/10.1214/15-AOAS89