Contact intervals, survival analysis of epidemic data, and estimation of R_0
We argue that the time from the onset of infectiousness to infectious
contact, which we call the contact interval, is a better basis for inference in
epidemic data than the generation or serial interval. Since contact intervals
can be right-censored, survival analysis is the natural approach to estimation.
Estimates of the contact interval distribution can be used to estimate R_0 in
both mass-action and network-based models.
Comment: 30 pages, 4 figures; submitted to Biostatistics
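As a rough illustration of why right-censoring matters for contact intervals, the sketch below assumes exponentially distributed contact intervals and a fixed infectious period; neither assumption comes from the paper. Contact intervals longer than the infectious period are censored, and the maximum likelihood estimate of the hazard is the number of observed contacts divided by the total time at risk. All function names and parameter values are hypothetical.

```python
import random


def simulate_pairs(rate, infectious_period, n_pairs, seed=0):
    """Simulate right-censored contact intervals (illustrative assumptions only).

    Each contact interval is exponential with the given rate. If it exceeds
    the infectious period, we only learn that no contact occurred by then.
    """
    rng = random.Random(seed)
    times, observed = [], []
    for _ in range(n_pairs):
        t = rng.expovariate(rate)
        if t <= infectious_period:
            times.append(t)                  # infectious contact observed
            observed.append(1)
        else:
            times.append(infectious_period)  # censored at end of infectiousness
            observed.append(0)
    return times, observed


def censored_rate_mle(times, observed):
    """MLE of a constant hazard: observed events / total time at risk."""
    events = sum(observed)
    if events == 0:
        raise ValueError("no observed contacts")
    return events / sum(times)


def naive_rate(times, observed):
    """Estimate that discards censored pairs; it overstates the hazard."""
    event_times = [t for t, d in zip(times, observed) if d == 1]
    return len(event_times) / sum(event_times)
```

Under these assumptions, calling `simulate_pairs(0.5, 2.0, 20000)` and passing the result to both estimators should give a censored-data estimate close to 0.5 and a noticeably larger naive estimate, because the naive version ignores the time at risk contributed by pairs with no observed contact.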
Semiparametric Relative-risk Regression for Infectious Disease Data
This paper introduces semiparametric relative-risk regression models for
infectious disease data based on contact intervals, where the contact interval
from person i to person j is the time between the onset of infectiousness in i
and infectious contact from i to j. The hazard of infectious contact from i to
j is \lambda_0(\tau)r(\beta_0^T X_{ij}), where \lambda_0(\tau) is an
unspecified baseline hazard function, r is a relative risk function, \beta_0 is
an unknown coefficient vector, and X_{ij} is a covariate vector. When
who-infects-whom is observed, the Cox partial likelihood is a profile
likelihood for \beta maximized over all possible \lambda_0(\tau). When
who-infects-whom is not observed, we use an EM algorithm to maximize the
profile likelihood for \beta integrated over all possible combinations of
who-infected-whom. This extends the most important class of regression models
in survival analysis to infectious disease epidemiology.
Comment: 38 pages, 5 figures
Pairwise accelerated failure time models for infectious disease transmission with external sources of infection
Pairwise survival analysis handles dependent happenings in infectious disease
transmission data by analyzing failure times in ordered pairs of individuals.
The contact interval in the ordered pair ij is the time from the onset of
infectiousness in i to infectious contact from i to j, where an
infectious contact is sufficient to infect j if he or she is susceptible. The
contact interval distribution determines transmission probabilities and the
infectiousness profile of infected individuals. Many important questions in
infectious disease epidemiology involve the effects of covariates (e.g., age or
vaccination status) on transmission. Here, we generalize earlier pairwise
methods in two ways: First, we introduce an accelerated failure time model that
allows the contact interval rate parameter to depend on infectiousness
covariates for i, susceptibility covariates for j, and pairwise covariates.
Second, we show how internal infections (caused by individuals under
observation) and external infections (caused by environmental or community
sources) can be handled simultaneously. In simulations, we show that these
methods produce valid point and interval estimates and that accounting for
external infections is critical to consistent estimation. Finally, we use these
methods to analyze household surveillance data from Los Angeles County during
the 2009 influenza A(H1N1) pandemic.
Comment: 24 pages, 4 figures
Estimating and interpreting secondary attack risk: Binomial considered harmful
The household secondary attack risk (SAR), often called the secondary attack
rate or secondary infection risk, is the probability of infectious contact from
an infectious household member A to a given household member B, where we define
infectious contact to be a contact sufficient to infect B if he or she is
susceptible. Estimation of the SAR is an important part of understanding and
controlling the transmission of infectious diseases. In practice, it is most
often estimated using binomial models such as logistic regression, which
implicitly attribute all secondary infections in a household to the primary
case. In the simplest case, the number of secondary infections in a household
with m susceptibles and a single primary case is modeled as a binomial(m, p)
random variable where p is the SAR. Although it has long been understood that
transmission within households is not binomial, it is thought that multiple
generations of transmission can be safely neglected when p is small. We use
probability generating functions and simulations to show that this is a
mistake. The proportion of susceptible household members infected can be
substantially larger than the SAR even when p is small. As a result, binomial
estimates of the SAR are biased upward and their confidence intervals have poor
coverage probabilities even if adjusted for clustering. Accurate point and
interval estimates of the SAR can be obtained using longitudinal chain binomial
models or pairwise survival analysis, which account for multiple generations of
transmission within households, the ongoing risk of infection from outside the
household, and incomplete follow-up. We illustrate the practical implications
of these results in an analysis of household surveillance data collected by the
Los Angeles County Department of Public Health during the 2009 influenza A
(H1N1) pandemic.
Comment: 25 pages, 8 figures
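The gap between the SAR and the proportion of household members infected can be checked with a small calculation. The sketch below assumes a Reed-Frost chain binomial model with a single primary case, no infection from outside the household, and complete follow-up; this is a simplification chosen for illustration, not the paper's own analysis, and the function name is hypothetical. For m = 2 susceptibles the expected proportion infected works out to p + p^2 - p^3, which already exceeds p.

```python
from functools import lru_cache
from math import comb


def expected_attack_proportion(m, p):
    """Expected proportion of m susceptibles eventually infected.

    Assumes a Reed-Frost chain binomial model (an illustrative assumption):
    one primary case, each infectious person is infectious for exactly one
    generation and makes infectious contact with each susceptible
    independently with probability p, which plays the role of the SAR.
    """

    @lru_cache(maxsize=None)
    def expected_infections(s, i):
        # Expected number of further infections given s susceptibles
        # and i infectious household members in the current generation.
        if s == 0 or i == 0:
            return 0.0
        q = 1.0 - (1.0 - p) ** i  # risk to one susceptible this generation
        total = 0.0
        for k in range(s + 1):
            prob = comb(s, k) * q**k * (1.0 - q) ** (s - k)
            total += prob * (k + expected_infections(s - k, k))
        return total

    return expected_infections(m, 1) / m
```

A binomial(m, p) model would put the expected proportion at exactly p for every household size. Under the assumptions above, the function returns p only when m = 1 and returns a larger value once later generations of transmission are possible.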
Network-based analysis of stochastic SIR epidemic models with random and proportionate mixing
In this paper, we outline the theory of epidemic percolation networks and
their use in the analysis of stochastic SIR epidemic models on undirected
contact networks. We then show how the same theory can be used to analyze
stochastic SIR models with random and proportionate mixing. The epidemic
percolation networks for these models are purely directed because undirected
edges disappear in the limit of a large population. In a series of simulations,
we show that epidemic percolation networks accurately predict the mean outbreak
size as well as the probability and final size of an epidemic for a variety of epidemic
models in homogeneous and heterogeneous populations. Finally, we show that
epidemic percolation networks can be used to re-derive classical results from
several different areas of infectious disease epidemiology. In an appendix, we
show that an epidemic percolation network can be defined for any
time-homogeneous stochastic SIR model in a closed population and prove that the
distribution of outbreak sizes given the infection of any given node in the SIR
model is identical to the distribution of its out-component sizes in the
corresponding probability space of epidemic percolation networks. We conclude
that the theory of percolation on semi-directed networks provides a very
general framework for the analysis of stochastic SIR models in closed
populations.
Comment: 40 pages, 9 figures
A potential outcomes approach to selection bias
Along with confounding, selection bias is one of the fundamental threats to
the validity of epidemiologic research. Unlike confounding, it has yet to be
given a standard definition in terms of potential outcomes. Traditionally,
selection bias has been defined as a systematic difference in a measure of the
exposure-disease association in the study population and the population
eligible for inclusion. This definition depends on the parameterization of the
association between exposure and disease. The structural approach to selection
bias defines selection bias as a spurious exposure-disease association within
the study population that occurs when selection is a collider or a descendant
of a collider on a causal path from exposure to disease in the eligible
population. This definition covers only selection bias that can occur under the
null hypothesis. Here, we propose a definition of selection bias in terms of
potential outcomes that identifies selection bias whenever disease risks and
exposure prevalences are distorted by the selection of study participants, not
just a given measure of association (as in the traditional approach) or all
measures of association (as in the structural approach). This definition is
nonparametric, so it can be analyzed using causal graphs both under and away
from the null. It unifies the theoretical frameworks used to understand
selection bias and confounding, explicitly links selection to the estimation of
causal effects, distinguishes clearly between internal and external validity,
and simplifies the analysis of complex study designs.
Comment: 25 pages, 14 figures
Bill Kenah Oral History Interview
https://scholarlycommons.pacific.edu/raymond-college/1153/thumbnail.jp
Generation interval contraction and epidemic data analysis
The generation interval is the time between the infection time of an infected
person and the infection time of his or her infector. Probability density
functions for generation intervals have been an important input for epidemic
models and epidemic data analysis. In this paper, we specify a general
stochastic SIR epidemic model and prove that the mean generation interval
decreases when susceptible persons are at risk of infectious contact from
multiple sources. The intuition behind this is that when a susceptible person
has multiple potential infectors, there is a "race" to infect him or her in
which only the first infectious contact leads to infection. In an epidemic, the
mean generation interval contracts as the prevalence of infection increases. We
call this global competition among potential infectors. When there is rapid
transmission within clusters of contacts, generation interval contraction can
be caused by a high local prevalence of infection even when the global
prevalence is low. We call this local competition among potential infectors.
Using simulations, we illustrate both types of competition.
Finally, we show that hazards of infectious contact can be used instead of
generation intervals to estimate the time course of the effective reproductive
number in an epidemic. This approach leads naturally to partial likelihoods for
epidemic data that are very similar to those that arise in survival analysis,
opening a promising avenue of methodological research in infectious disease
epidemiology.
Comment: 20 pages, 5 figures; to appear in Mathematical Biosciences
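The "race" intuition can be illustrated with a minimal simulation. It assumes that a susceptible person has k potential infectors who all become infectious at time zero and whose contact intervals are independent and exponential with a common rate; these simplifications are ours, whereas the paper proves the result for a general stochastic SIR model. The time to infection is the minimum of the k contact intervals, which has mean 1 / (k * rate). The function name and defaults are hypothetical.

```python
import random


def mean_time_to_infection(k, rate, n_sims=20000, seed=0):
    """Average time until the first infectious contact from k infectors.

    Illustrative assumptions: all k infectors become infectious at time 0
    and have independent exponential contact intervals with the same rate.
    Only the first infectious contact leads to infection.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        total += min(rng.expovariate(rate) for _ in range(k))
    return total / n_sims
```

Under these assumptions, `mean_time_to_infection(1, 1.0)` should be close to 1 and `mean_time_to_infection(4, 1.0)` close to 0.25. The interval between infector and infectee shortens as the number of competing infectors grows, even though no individual infector's contact interval distribution has changed.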
Rothman diagrams: the geometry of causal inference in epidemiology
Here, we explain and illustrate a geometric perspective on causal inference
in cohort studies that can help epidemiologists understand the role of
standardization in causal inference as well as the distinctions between
confounding, effect modification, and noncollapsibility. For simplicity, we
focus on a binary exposure X, a binary outcome D, and a binary confounder C
that is not causally affected by X. Rothman diagrams plot risk in the unexposed
on the x-axis and risk in the exposed on the y-axis. The crude risks define one
point in the unit square, and the stratum-specific risks define two other
points in the unit square. These three points can be used to identify
confounding and effect modification, and we show briefly how these concepts
generalize to confounders with more than two levels. We propose a simplified
but equivalent definition of collapsibility in terms of standardization, and we
show that a measure of association is collapsible if and only if all of its
contour lines are straight. We illustrate these ideas using data from a study
conducted in Newcastle upon Tyne, United Kingdom, where the causal effect of
smoking on 20-year mortality was confounded by age. We conclude that causal
inference should be taught using geometry before using regression models.
Comment: 22 pages, 7 figures
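The link between standardization and collapsibility can be seen in a small numeric sketch. The stratum-specific risks and weights below are invented for illustration and are not the Newcastle upon Tyne data; the function names are also hypothetical. Each stratum is a point (risk in the unexposed, risk in the exposed) in the unit square, and standardization takes a weighted average of those points.

```python
def standardize(strata, weights):
    """Weighted average of (risk_unexposed, risk_exposed) points."""
    r0 = sum(w * s[0] for s, w in zip(strata, weights))
    r1 = sum(w * s[1] for s, w in zip(strata, weights))
    return r0, r1


def risk_difference(r0, r1):
    return r1 - r0


def odds_ratio(r0, r1):
    return (r1 / (1.0 - r1)) / (r0 / (1.0 - r0))


# Hypothetical stratum-specific risks for C = 0 and C = 1, chosen so that
# both strata have risk difference 0.3 and odds ratio 4.
strata = [(0.2, 0.5), (0.5, 0.8)]
weights = [0.5, 0.5]  # hypothetical distribution of C in the standard population

std_r0, std_r1 = standardize(strata, weights)
std_rd = risk_difference(std_r0, std_r1)  # equals the common stratum value
std_or = odds_ratio(std_r0, std_r1)       # falls below the common stratum value
```

The standardized point lies on the line segment joining the two stratum points. Contour lines of the risk difference are straight, so a segment joining two points on the same contour stays on it and the standardized risk difference equals the common stratum value. Contour lines of the odds ratio are curved, so the segment leaves the contour and the standardized odds ratio differs from the stratum-specific value of 4.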
Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees
Recent work has attempted to use whole-genome sequence data from pathogens to
reconstruct the transmission trees linking infectors and infectees in
outbreaks. However, transmission trees from one outbreak do not generalize to
future outbreaks. Reconstruction of transmission trees is most useful to public
health if it leads to generalizable scientific insights about disease
transmission. In a survival analysis framework, estimation of transmission
parameters is based on sums or averages over the possible transmission trees. A
phylogeny can increase the precision of these estimates by providing partial
information about who infected whom. The leaves of the phylogeny represent
sampled pathogens, which have known hosts. The interior nodes represent common
ancestors of sampled pathogens, which have unknown hosts. Starting from
assumptions about disease biology and epidemiologic study design, we prove that
there is a one-to-one correspondence between the possible assignments of
interior node hosts and the transmission trees simultaneously consistent with
the phylogeny and the epidemiologic data on person, place, and time. We develop
algorithms to enumerate these transmission trees and show these can be used to
calculate likelihoods that incorporate both epidemiologic data and a phylogeny.
A simulation study confirms that this leads to more efficient estimates of
hazard ratios for infectiousness and baseline hazards of infectious contact,
and we use these methods to analyze data from a foot-and-mouth disease virus
outbreak in the United Kingdom in 2001. These results demonstrate the
importance of data on individuals who escape infection, which is often
overlooked. The combination of survival analysis and algorithms linking
phylogenies to transmission trees is a rigorous but flexible statistical
foundation for molecular infectious disease epidemiology.
Comment: 28 pages, 11 figures, 3 tables