    Contact intervals, survival analysis of epidemic data, and estimation of R_0

    We argue that the time from the onset of infectiousness to infectious contact, which we call the contact interval, is a better basis for inference in epidemic data than the generation or serial interval. Since contact intervals can be right-censored, survival analysis is the natural approach to estimation. Estimates of the contact interval distribution can be used to estimate R_0 in both mass-action and network-based models. Comment: 30 pages, 4 figures; submitted to Biostatistics
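As a minimal sketch of why right-censoring matters here, suppose (purely for illustration) that contact intervals are exponentially distributed. The maximum likelihood estimate of the hazard is then the number of observed infectious contacts divided by the total time at risk, with censored intervals contributing time but no event. The exponential assumption, the function name, and the data are hypothetical and are not taken from the paper.

```python
def exponential_rate_mle(times, observed):
    """MLE of a constant hazard from right-censored durations.

    times    -- follow-up time for each ordered pair (contact or censoring time)
    observed -- True if infectious contact was observed, False if censored
    Assumes exponentially distributed contact intervals (an illustrative choice).
    """
    if len(times) != len(observed):
        raise ValueError("times and observed must have the same length")
    events = sum(1 for d in observed if d)
    exposure = sum(times)
    if exposure <= 0:
        raise ValueError("total follow-up time must be positive")
    return events / exposure


# Hypothetical data: two observed contact intervals and two censored ones.
times = [2.0, 5.0, 3.0, 10.0]
observed = [True, False, True, False]
rate = exponential_rate_mle(times, observed)

# Dropping the censored pairs would ignore 15 of the 20 units of time at risk.
naive_rate = 2 / (2.0 + 3.0)
```

Here the censored pairs lower the estimated hazard from 0.4 to 0.1, which is the kind of correction a survival-analysis approach provides.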

    Semiparametric Relative-risk Regression for Infectious Disease Data

    This paper introduces semiparametric relative-risk regression models for infectious disease data based on contact intervals, where the contact interval from person i to person j is the time between the onset of infectiousness in i and infectious contact from i to j. The hazard of infectious contact from i to j is \lambda_0(\tau)r(\beta_0^T X_{ij}), where \lambda_0(\tau) is an unspecified baseline hazard function, r is a relative risk function, \beta_0 is an unknown coefficient vector, and X_{ij} is a covariate vector. When who-infects-whom is observed, the Cox partial likelihood is a profile likelihood for \beta maximized over all possible \lambda_0(\tau). When who-infects-whom is not observed, we use an EM algorithm to maximize the profile likelihood for \beta integrated over all possible combinations of who-infects-whom. This extends the most important class of regression models in survival analysis to infectious disease epidemiology. Comment: 38 pages, 5 figures
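The hazard model in the abstract can be written out as a short worked equation. The exponential form for r below is the usual Cox-model choice and is an assumption made here for illustration; the abstract leaves r general. The second pair (k, l) is likewise introduced only for the comparison.

```latex
% Hazard of infectious contact from i to j at time \tau since onset of infectiousness in i
\lambda_{ij}(\tau) = \lambda_0(\tau)\, r\!\left(\beta_0^T X_{ij}\right)

% Assuming r(x) = e^{x}, the hazard ratio between pairs (i,j) and (k,l)
% at the same \tau does not depend on the baseline hazard:
\frac{\lambda_{ij}(\tau)}{\lambda_{kl}(\tau)}
  = \frac{\lambda_0(\tau)\, e^{\beta_0^T X_{ij}}}{\lambda_0(\tau)\, e^{\beta_0^T X_{kl}}}
  = \exp\!\left\{\beta_0^T \left(X_{ij} - X_{kl}\right)\right\}
```

Because \lambda_0(\tau) cancels in the ratio, \beta_0 can be estimated without specifying the baseline hazard, which is what makes the model semiparametric.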

    Pairwise accelerated failure time models for infectious disease transmission with external sources of infection

    Pairwise survival analysis handles dependent happenings in infectious disease transmission data by analyzing failure times in ordered pairs of individuals. The contact interval in the pair ij is the time from the onset of infectiousness in i to infectious contact from i to j, where an infectious contact is sufficient to infect j if he or she is susceptible. The contact interval distribution determines transmission probabilities and the infectiousness profile of infected individuals. Many important questions in infectious disease epidemiology involve the effects of covariates (e.g., age or vaccination status) on transmission. Here, we generalize earlier pairwise methods in two ways: First, we introduce an accelerated failure time model that allows the contact interval rate parameter to depend on infectiousness covariates for i, susceptibility covariates for j, and pairwise covariates. Second, we show how internal infections (caused by individuals under observation) and external infections (caused by environmental or community sources) can be handled simultaneously. In simulations, we show that these methods produce valid point and interval estimates and that accounting for external infections is critical to consistent estimation. Finally, we use these methods to analyze household surveillance data from Los Angeles County during the 2009 influenza A(H1N1) pandemic. Comment: 24 pages, 4 figures

    Estimating and interpreting secondary attack risk: Binomial considered harmful

    The household secondary attack risk (SAR), often called the secondary attack rate or secondary infection risk, is the probability of infectious contact from an infectious household member A to a given household member B, where we define infectious contact to be a contact sufficient to infect B if he or she is susceptible. Estimation of the SAR is an important part of understanding and controlling the transmission of infectious diseases. In practice, it is most often estimated using binomial models such as logistic regression, which implicitly attribute all secondary infections in a household to the primary case. In the simplest case, the number of secondary infections in a household with m susceptibles and a single primary case is modeled as a binomial(m, p) random variable where p is the SAR. Although it has long been understood that transmission within households is not binomial, it is thought that multiple generations of transmission can be safely neglected when p is small. We use probability generating functions and simulations to show that this is a mistake. The proportion of susceptible household members infected can be substantially larger than the SAR even when p is small. As a result, binomial estimates of the SAR are biased upward and their confidence intervals have poor coverage probabilities even if adjusted for clustering. Accurate point and interval estimates of the SAR can be obtained using longitudinal chain binomial models or pairwise survival analysis, which account for multiple generations of transmission within households, the ongoing risk of infection from outside the household, and incomplete follow-up. We illustrate the practical implications of these results in an analysis of household surveillance data collected by the Los Angeles County Department of Public Health during the 2009 influenza A (H1N1) pandemic. Comment: 25 pages, 8 figures
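The gap between the SAR and the proportion of household members infected can be illustrated with a small exact calculation. The sketch below assumes a Reed-Frost chain binomial model in a closed household (no infection from outside, each case infectious for one generation), which is a simplification chosen here for illustration and not the paper's probability generating function analysis. The function name is hypothetical.

```python
from functools import lru_cache
from math import comb


def expected_attack_proportion(m, p):
    """Expected proportion of m susceptibles eventually infected.

    Assumes a Reed-Frost household with one primary case, where p is the
    probability of infectious contact from one infectious member to one
    susceptible member (the SAR in this simplified setting).
    """
    q = 1.0 - p  # probability of escaping contact from one infectious member

    @lru_cache(maxsize=None)
    def expected_further_cases(s, i):
        # s susceptibles remain; i members are infectious in this generation.
        if s == 0 or i == 0:
            return 0.0
        risk = 1.0 - q ** i  # risk of contact from at least one infective
        total = 0.0
        for k in range(s + 1):
            prob = comb(s, k) * risk ** k * (1.0 - risk) ** (s - k)
            total += prob * (k + expected_further_cases(s - k, k))
        return total

    return expected_further_cases(m, 1) / m


# With a single susceptible there is only one generation, so the two agree.
one_susceptible = expected_attack_proportion(1, 0.3)
# With more susceptibles, later generations push the proportion above p.
four_susceptibles = expected_attack_proportion(4, 0.1)
```

Under these assumptions a binomial(m, p) model would equate the expected proportion infected with p, while the chain binomial calculation returns a larger value whenever m is greater than 1.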

    Network-based analysis of stochastic SIR epidemic models with random and proportionate mixing

    In this paper, we outline the theory of epidemic percolation networks and their use in the analysis of stochastic SIR epidemic models on undirected contact networks. We then show how the same theory can be used to analyze stochastic SIR models with random and proportionate mixing. The epidemic percolation networks for these models are purely directed because undirected edges disappear in the limit of a large population. In a series of simulations, we show that epidemic percolation networks accurately predict the mean outbreak size and the probability and final size of an epidemic for a variety of epidemic models in homogeneous and heterogeneous populations. Finally, we show that epidemic percolation networks can be used to re-derive classical results from several different areas of infectious disease epidemiology. In an appendix, we show that an epidemic percolation network can be defined for any time-homogeneous stochastic SIR model in a closed population and prove that the distribution of outbreak sizes given the infection of any given node in the SIR model is identical to the distribution of its out-component sizes in the corresponding probability space of epidemic percolation networks. We conclude that the theory of percolation on semi-directed networks provides a very general framework for the analysis of stochastic SIR models in closed populations. Comment: 40 pages, 9 figures

    A potential outcomes approach to selection bias

    Along with confounding, selection bias is one of the fundamental threats to the validity of epidemiologic research. Unlike confounding, it has yet to be given a standard definition in terms of potential outcomes. Traditionally, selection bias has been defined as a systematic difference in a measure of the exposure-disease association in the study population and the population eligible for inclusion. This definition depends on the parameterization of the association between exposure and disease. The structural approach to selection bias defines selection bias as a spurious exposure-disease association within the study population that occurs when selection is a collider or a descendant of a collider on a causal path from exposure to disease in the eligible population. This definition covers only selection bias that can occur under the null hypothesis. Here, we propose a definition of selection bias in terms of potential outcomes that identifies selection bias whenever disease risks and exposure prevalences are distorted by the selection of study participants, not just a given measure of association (as in the traditional approach) or all measures of association (as in the structural approach). This definition is nonparametric, so it can be analyzed using causal graphs both under and away from the null. It unifies the theoretical frameworks used to understand selection bias and confounding, explicitly links selection to the estimation of causal effects, distinguishes clearly between internal and external validity, and simplifies the analysis of complex study designs. Comment: 25 pages, 14 figures

    Bill Kenah Oral History Interview


    Generation interval contraction and epidemic data analysis

    The generation interval is the time between the infection time of an infected person and the infection time of his or her infector. Probability density functions for generation intervals have been an important input for epidemic models and epidemic data analysis. In this paper, we specify a general stochastic SIR epidemic model and prove that the mean generation interval decreases when susceptible persons are at risk of infectious contact from multiple sources. The intuition behind this is that when a susceptible person has multiple potential infectors, there is a "race" to infect him or her in which only the first infectious contact leads to infection. In an epidemic, the mean generation interval contracts as the prevalence of infection increases. We call this global competition among potential infectors. When there is rapid transmission within clusters of contacts, generation interval contraction can be caused by a high local prevalence of infection even when the global prevalence is low. We call this local competition among potential infectors. Using simulations, we illustrate both types of competition. Finally, we show that hazards of infectious contact can be used instead of generation intervals to estimate the time course of the effective reproductive number in an epidemic. This approach leads naturally to partial likelihoods for epidemic data that are very similar to those that arise in survival analysis, opening a promising avenue of methodological research in infectious disease epidemiology. Comment: 20 pages, 5 figures; to appear in Mathematical Biosciences
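The "race" intuition can be sketched with a toy simulation. It assumes that k potential infectors become infectious at the same moment and have independent exponential contact intervals, and it ignores infectious periods; these are simplifications made here for illustration, not the general stochastic SIR model of the paper. The function name and parameter values are hypothetical.

```python
import random


def mean_time_to_first_contact(k, rate, n_sims=20000, seed=1):
    """Simulated mean time until the first infectious contact from k sources.

    Each source has an independent exponential contact interval with the
    given rate (an illustrative assumption); only the earliest contact
    infects the susceptible person.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        total += min(rng.expovariate(rate) for _ in range(k))
    return total / n_sims


single_source = mean_time_to_first_contact(1, 1.0)
four_sources = mean_time_to_first_contact(4, 1.0)
```

For exponential contact intervals the minimum of k of them is itself exponential with rate k times as large, so the simulated means should be close to 1 and 1/4 respectively, showing how competition among infectors shortens the time to infection.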

    Rothman diagrams: the geometry of causal inference in epidemiology

    Here, we explain and illustrate a geometric perspective on causal inference in cohort studies that can help epidemiologists understand the role of standardization in causal inference as well as the distinctions between confounding, effect modification, and noncollapsibility. For simplicity, we focus on a binary exposure X, a binary outcome D, and a binary confounder C that is not causally affected by X. Rothman diagrams plot risk in the unexposed on the x-axis and risk in the exposed on the y-axis. The crude risks define one point in the unit square, and the stratum-specific risks define two other points in the unit square. These three points can be used to identify confounding and effect modification, and we show briefly how these concepts generalize to confounders with more than two levels. We propose a simplified but equivalent definition of collapsibility in terms of standardization, and we show that a measure of association is collapsible if and only if all of its contour lines are straight. We illustrate these ideas using data from a study conducted in Newcastle upon Tyne, United Kingdom, where the causal effect of smoking on 20-year mortality was confounded by age. We conclude that causal inference should be taught using geometry before using regression models. Comment: 22 pages, 7 figures

    Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees

    Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. Comment: 28 pages, 11 figures, 3 tables