
206 research outputs found

Contact intervals, survival analysis of epidemic data, and estimation of R_0

Author: Kenah Eben
Publication date: 23/02/2010

We argue that the time from the onset of infectiousness to infectious contact, which we call the contact interval, is a better basis for inference in epidemic data than the generation or serial interval. Since contact intervals can be right-censored, survival analysis is the natural approach to estimation. Estimates of the contact interval distribution can be used to estimate R_0 in both mass-action and network-based models. Comment: 30 pages, 4 figures; submitted to Biostatistics
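As a minimal sketch of why right-censoring leads to survival analysis, assume (for illustration only; the abstract does not fix a distribution) that contact intervals are exponential with a constant rate. The maximum likelihood estimate is then the number of observed infectious contacts divided by the total time at risk, with censored intervals contributing time but no event. The function names and the example data are hypothetical.

```python
import math


def exponential_rate_mle(durations, observed):
    """MLE of a constant hazard from right-censored durations.

    durations: time at risk for each pair (until contact or censoring).
    observed:  True if the interval ended in an infectious contact,
               False if it was right-censored.
    """
    events = sum(1 for was_observed in observed if was_observed)
    time_at_risk = sum(durations)
    if time_at_risk <= 0:
        raise ValueError("total time at risk must be positive")
    return events / time_at_risk


def contact_probability(rate, infectious_period):
    """Probability of infectious contact within a fixed infectious period."""
    return 1.0 - math.exp(-rate * infectious_period)


# Hypothetical data: two observed contact intervals and two censored ones.
durations = [2.0, 5.0, 3.0, 5.0]
observed = [True, False, True, False]
rate = exponential_rate_mle(durations, observed)
```

Dropping the censored pairs instead would overestimate the rate, because the time they spent at risk without infectious contact would be ignored.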

arXiv.org e-Print Archive

PubMed Central

Semiparametric Relative-risk Regression for Infectious Disease Data

Author: Kenah Eben
Publication venue: 'Informa UK Limited'
Publication date: 17/10/2012

This paper introduces semiparametric relative-risk regression models for infectious disease data based on contact intervals, where the contact interval from person i to person j is the time between the onset of infectiousness in i and infectious contact from i to j. The hazard of infectious contact from i to j is \lambda_0(\tau)r(\beta_0^T X_{ij}), where \lambda_0(\tau) is an unspecified baseline hazard function, r is a relative risk function, \beta_0 is an unknown coefficient vector, and X_{ij} is a covariate vector. When who-infects-whom is observed, the Cox partial likelihood is a profile likelihood for \beta maximized over all possible \lambda_0(\tau). When who-infects-whom is not observed, we use an EM algorithm to maximize the profile likelihood for \beta integrated over all possible combinations of who-infected-whom. This extends the most important class of regression models in survival analysis to infectious disease epidemiology. Comment: 38 pages, 5 figures
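For orientation, the generic Cox partial likelihood can be written in the abstract's notation as follows. This is a sketch, not the paper's exact statement: the symbols \mathcal{E} (ordered pairs ij in which i is observed to infect j), \tau_{ij} (the contact interval at which that happens), and \mathcal{R}(\tau) (ordered pairs still at risk of infectious contact at contact interval \tau) are notation assumed here.

```latex
L_p(\beta) = \prod_{ij \in \mathcal{E}}
  \frac{r(\beta^T X_{ij})}
       {\sum_{kl \in \mathcal{R}(\tau_{ij})} r(\beta^T X_{kl})}
```

The baseline hazard \lambda_0(\tau) appears in both numerator and denominator of each factor and cancels, which is why \beta can be estimated without specifying it.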

arXiv.org e-Print Archive

Pairwise accelerated failure time models for infectious disease transmission with external sources of infection

Author: Kenah Eben
Sharker Yushuf
Publication date: 24/04/2019

Pairwise survival analysis handles dependent happenings in infectious disease transmission data by analyzing failure times in ordered pairs of individuals. The contact interval in the pair ij is the time from the onset of infectiousness in i to infectious contact from i to j, where an infectious contact is sufficient to infect j if he or she is susceptible. The contact interval distribution determines transmission probabilities and the infectiousness profile of infected individuals. Many important questions in infectious disease epidemiology involve the effects of covariates (e.g., age or vaccination status) on transmission. Here, we generalize earlier pairwise methods in two ways: First, we introduce an accelerated failure time model that allows the contact interval rate parameter to depend on infectiousness covariates for i, susceptibility covariates for j, and pairwise covariates. Second, we show how internal infections (caused by individuals under observation) and external infections (caused by environmental or community sources) can be handled simultaneously. In simulations, we show that these methods produce valid point and interval estimates and that accounting for external infections is critical to consistent estimation. Finally, we use these methods to analyze household surveillance data from Los Angeles County during the 2009 influenza A(H1N1) pandemic. Comment: 24 pages, 4 figures
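The two generalizations can be illustrated with a small sketch under strong simplifying assumptions that are not taken from the abstract: exponential contact intervals, a log-linear rate in the covariates, and a constant external hazard. All function and parameter names are hypothetical.

```python
import math


def pair_rate(log_baseline_rate, coefficients, covariates):
    """Contact interval rate for an ordered pair under a log-linear model.

    covariates may mix infectiousness covariates for i, susceptibility
    covariates for j, and pairwise covariates; coefficients match by position.
    """
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return math.exp(log_baseline_rate + linear_predictor)


def escape_probability(internal_exposures, external_rate, follow_up_time):
    """Probability that a susceptible escapes infection from every source.

    internal_exposures: list of (rate, exposure_time), one per infectious
                        individual under observation.
    external_rate:      assumed constant hazard from outside sources.
    """
    internal_hazard = sum(rate * time for rate, time in internal_exposures)
    external_hazard = external_rate * follow_up_time
    # Hazards from competing sources add, so cumulative hazards add too.
    return math.exp(-(internal_hazard + external_hazard))


# Hypothetical example: one binary covariate that doubles the rate.
doubled = pair_rate(math.log(0.1), [math.log(2.0)], [1])
escape = escape_probability([(doubled, 5.0)], external_rate=0.01, follow_up_time=10.0)
```

Setting external_rate to zero when external infections actually occur would attribute them to household members, which is the kind of inconsistency the abstract warns about.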

arXiv.org e-Print Archive

Estimating and interpreting secondary attack risk: Binomial considered harmful

Author: Kenah Eben
Sharker Yushuf
Publication date: 17/07/2020

The household secondary attack risk (SAR), often called the secondary attack rate or secondary infection risk, is the probability of infectious contact from an infectious household member A to a given household member B, where we define infectious contact to be a contact sufficient to infect B if he or she is susceptible. Estimation of the SAR is an important part of understanding and controlling the transmission of infectious diseases. In practice, it is most often estimated using binomial models such as logistic regression, which implicitly attribute all secondary infections in a household to the primary case. In the simplest case, the number of secondary infections in a household with m susceptibles and a single primary case is modeled as a binomial(m, p) random variable where p is the SAR. Although it has long been understood that transmission within households is not binomial, it is thought that multiple generations of transmission can be safely neglected when p is small. We use probability generating functions and simulations to show that this is a mistake. The proportion of susceptible household members infected can be substantially larger than the SAR even when p is small. As a result, binomial estimates of the SAR are biased upward and their confidence intervals have poor coverage probabilities even if adjusted for clustering. Accurate point and interval estimates of the SAR can be obtained using longitudinal chain binomial models or pairwise survival analysis, which account for multiple generations of transmission within households, the ongoing risk of infection from outside the household, and incomplete follow-up. We illustrate the practical implications of these results in an analysis of household surveillance data collected by the Los Angeles County Department of Public Health during the 2009 influenza A (H1N1) pandemic. Comment: 25 pages, 8 figures
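The gap between the SAR and the proportion infected can be checked with a short exact calculation. The sketch below assumes a Reed-Frost chain binomial model with no infection from outside the household; this is a simplification chosen for illustration, not the probability generating function argument used in the paper, and the function name is hypothetical.

```python
from functools import lru_cache
from math import comb


def expected_attack_proportion(m, p):
    """Expected proportion of m susceptibles infected after one primary case.

    Each infectious person makes infectious contact with each susceptible
    independently with probability p (the SAR), generation by generation.
    """

    @lru_cache(maxsize=None)
    def expected_infections(susceptible, infectious):
        if susceptible == 0 or infectious == 0:
            return 0.0
        # Probability that a susceptible is infected by at least one of the
        # currently infectious household members.
        q = 1.0 - (1.0 - p) ** infectious
        total = 0.0
        for k in range(susceptible + 1):
            prob_k = comb(susceptible, k) * q**k * (1.0 - q) ** (susceptible - k)
            total += prob_k * (k + expected_infections(susceptible - k, k))
        return total

    return expected_infections(m, 1) / m
```

With m = 2 the recursion reduces to p(1 + p - p^2), which exceeds p for every 0 < p < 1. A binomial(m, p) model equates this proportion with p and therefore overestimates the SAR.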

arXiv.org e-Print Archive

Directory of Open Access Journals

Network-based analysis of stochastic SIR epidemic models with random and proportionate mixing

Author: Kenah Eben
Robins James M.
Publication date: 01/01/2007

In this paper, we outline the theory of epidemic percolation networks and their use in the analysis of stochastic SIR epidemic models on undirected contact networks. We then show how the same theory can be used to analyze stochastic SIR models with random and proportionate mixing. The epidemic percolation networks for these models are purely directed because undirected edges disappear in the limit of a large population. In a series of simulations, we show that epidemic percolation networks accurately predict the mean outbreak size and the probability and final size of an epidemic for a variety of epidemic models in homogeneous and heterogeneous populations. Finally, we show that epidemic percolation networks can be used to re-derive classical results from several different areas of infectious disease epidemiology. In an appendix, we show that an epidemic percolation network can be defined for any time-homogeneous stochastic SIR model in a closed population and prove that the distribution of outbreak sizes given the infection of any given node in the SIR model is identical to the distribution of its out-component sizes in the corresponding probability space of epidemic percolation networks. We conclude that the theory of percolation on semi-directed networks provides a very general framework for the analysis of stochastic SIR models in closed populations. Comment: 40 pages, 9 figures

arXiv.org e-Print Archive

CiteSeerX

PubMed Central

A potential outcomes approach to selection bias

Author: Kenah Eben
Publication date: 14/10/2021

Along with confounding, selection bias is one of the fundamental threats to the validity of epidemiologic research. Unlike confounding, it has yet to be given a standard definition in terms of potential outcomes. Traditionally, selection bias has been defined as a systematic difference in a measure of the exposure-disease association in the study population and the population eligible for inclusion. This definition depends on the parameterization of the association between exposure and disease. The structural approach to selection bias defines selection bias as a spurious exposure-disease association within the study population that occurs when selection is a collider or a descendant of a collider on a causal path from exposure to disease in the eligible population. This definition covers only selection bias that can occur under the null hypothesis. Here, we propose a definition of selection bias in terms of potential outcomes that identifies selection bias whenever disease risks and exposure prevalences are distorted by the selection of study participants, not just a given measure of association (as in the traditional approach) or all measures of association (as in the structural approach). This definition is nonparametric, so it can be analyzed using causal graphs both under and away from the null. It unifies the theoretical frameworks used to understand selection bias and confounding, explicitly links selection to the estimation of causal effects, distinguishes clearly between internal and external validity, and simplifies the analysis of complex study designs. Comment: 25 pages, 14 figures

arXiv.org e-Print Archive

Bill Kenah Oral History Interview

Author: Kenah William
Publication venue: Scholarly Commons
Publication date: 14/07/2023

https://scholarlycommons.pacific.edu/raymond-college/1153/thumbnail.jp

Scholarly Commons

Generation interval contraction and epidemic data analysis

Author: Kenah Eben
Lipsitch Marc
Robins James M.
Publication date: 01/01/2008

The generation interval is the time between the infection time of an infected person and the infection time of his or her infector. Probability density functions for generation intervals have been an important input for epidemic models and epidemic data analysis. In this paper, we specify a general stochastic SIR epidemic model and prove that the mean generation interval decreases when susceptible persons are at risk of infectious contact from multiple sources. The intuition behind this is that when a susceptible person has multiple potential infectors, there is a "race" to infect him or her in which only the first infectious contact leads to infection. In an epidemic, the mean generation interval contracts as the prevalence of infection increases. We call this global competition among potential infectors. When there is rapid transmission within clusters of contacts, generation interval contraction can be caused by a high local prevalence of infection even when the global prevalence is low. We call this local competition among potential infectors. Using simulations, we illustrate both types of competition. Finally, we show that hazards of infectious contact can be used instead of generation intervals to estimate the time course of the effective reproductive number in an epidemic. This approach leads naturally to partial likelihoods for epidemic data that are very similar to those that arise in survival analysis, opening a promising avenue of methodological research in infectious disease epidemiology. Comment: 20 pages, 5 figures; to appear in Mathematical Biosciences
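The "race" intuition can be illustrated with a small simulation. The sketch assumes, purely for illustration, that all potential infectors become infectious at the same moment and have independent exponential contact intervals with a common rate; the paper's model is more general. The function name and defaults are hypothetical.

```python
import random


def mean_time_to_first_contact(n_infectors, rate, n_sims=20000, seed=1):
    """Simulated mean time until the first infectious contact.

    Only the earliest contact among the competing infectors causes infection,
    so the infecting contact time is the minimum of the contact intervals.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        total += min(rng.expovariate(rate) for _ in range(n_infectors))
    return total / n_sims


one_infector = mean_time_to_first_contact(1, rate=1.0)
three_infectors = mean_time_to_first_contact(3, rate=1.0)
```

Under these assumptions the minimum of k exponential times with rate lambda is exponential with rate k*lambda, so the mean time to infection falls from 1/lambda to 1/(k*lambda) as the number of competing infectors grows.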

arXiv.org e-Print Archive

CiteSeerX

PubMed Central

Rothman diagrams: the geometry of causal inference in epidemiology

Author: Kenah Eben
Publication date: 23/10/2023

Here, we explain and illustrate a geometric perspective on causal inference in cohort studies that can help epidemiologists understand the role of standardization in causal inference as well as the distinctions between confounding, effect modification, and noncollapsibility. For simplicity, we focus on a binary exposure X, a binary outcome D, and a binary confounder C that is not causally affected by X. Rothman diagrams plot risk in the unexposed on the x-axis and risk in the exposed on the y-axis. The crude risks define one point in the unit square, and the stratum-specific risks define two other points in the unit square. These three points can be used to identify confounding and effect modification, and we show briefly how these concepts generalize to confounders with more than two levels. We propose a simplified but equivalent definition of collapsibility in terms of standardization, and we show that a measure of association is collapsible if and only if all of its contour lines are straight. We illustrate these ideas using data from a study conducted in Newcastle upon Tyne, United Kingdom, where the causal effect of smoking on 20-year mortality was confounded by age. We conclude that causal inference should be taught using geometry before using regression models. Comment: 22 pages, 7 figures
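The three points of a Rothman diagram, plus a standardized point, can be computed directly from stratified counts. The sketch below uses made-up counts (not the Newcastle data) and standardizes to the distribution of C in the whole cohort; the function names and the choice of standard population are assumptions for illustration.

```python
# Hypothetical (cases, total) counts by confounder level C and exposure X.
COHORT = {
    0: {"exposed": (10, 100), "unexposed": (20, 400)},
    1: {"exposed": (120, 400), "unexposed": (20, 100)},
}


def stratum_points(cohort):
    """(risk in unexposed, risk in exposed) within each level of C."""
    return {
        c: (g["unexposed"][0] / g["unexposed"][1], g["exposed"][0] / g["exposed"][1])
        for c, g in cohort.items()
    }


def crude_point(cohort):
    """(risk in unexposed, risk in exposed) ignoring C."""
    def pooled_risk(group):
        cases = sum(g[group][0] for g in cohort.values())
        total = sum(g[group][1] for g in cohort.values())
        return cases / total

    return pooled_risk("unexposed"), pooled_risk("exposed")


def standardized_point(cohort):
    """Stratum-specific risks averaged over the cohort's distribution of C."""
    sizes = {c: g["exposed"][1] + g["unexposed"][1] for c, g in cohort.items()}
    n = sum(sizes.values())
    points = stratum_points(cohort)
    x = sum(sizes[c] / n * points[c][0] for c in cohort)
    y = sum(sizes[c] / n * points[c][1] for c in cohort)
    return x, y
```

Because both coordinates of the standardized point use the same weights, it always lies on the line segment joining the stratum-specific points. In these made-up counts the crude point is about (0.08, 0.26) while the standardized point is about (0.125, 0.2), so the crude risk difference overstates the standardized one.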

arXiv.org e-Print Archive

Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees

Author: Britton Tom
Halloran M. Elizabeth
Kenah Eben
Longini Jr Ira M.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 04/04/2016

Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. Comment: 28 pages, 11 figures, 3 tables

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute