The Metropolis algorithm: A useful tool for epidemiologists
The Metropolis algorithm is a Markov chain Monte Carlo (MCMC) algorithm used
to simulate from parameter distributions of interest, such as generalized
linear model parameters. The "Metropolis step" is a keystone concept that
underlies classical and modern MCMC methods and facilitates simple analysis of
complex statistical models. Beyond Bayesian analysis, MCMC is useful for
generating uncertainty intervals, even under the common scenario in causal
inference in which the target parameter is not directly estimated by a single,
fitted statistical model. We demonstrate, with a worked example, pseudo-code,
and R code, the basic mechanics of the Metropolis algorithm. We use the
Metropolis algorithm to estimate the odds ratio and risk difference contrasting
the risk of childhood leukemia among those exposed to high versus low level
magnetic fields. This approach can be used for inference from Bayesian and
frequentist paradigms and, in small samples, offers advantages over
large-sample methods like the bootstrap. (26 pages, 3 figures)
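The mechanics described in the abstract can be sketched in a few lines. The example below is a minimal random-walk Metropolis sampler, not the paper's worked example or R code: it targets the posterior of a log-odds under a flat prior, and the counts (12 events among 40 subjects) are hypothetical stand-ins rather than the actual leukemia data.

```python
import math
import random

random.seed(1)

def log_posterior(theta, events, n):
    # Binomial log-likelihood on the log-odds scale, flat prior on theta
    p = 1.0 / (1.0 + math.exp(-theta))
    return events * math.log(p) + (n - events) * math.log(1.0 - p)

def metropolis(events, n, n_iter=20000, step=0.2, theta0=0.0):
    """Random-walk Metropolis: propose theta' ~ Normal(theta, step^2) and
    accept with probability min(1, posterior ratio)."""
    theta = theta0
    current = log_posterior(theta, events, n)
    draws = []
    for _ in range(n_iter):
        proposal = theta + random.gauss(0.0, step)
        candidate = log_posterior(proposal, events, n)
        # The Metropolis step: compare a uniform draw to the density ratio
        if math.log(random.random()) < candidate - current:
            theta, current = proposal, candidate
        draws.append(theta)
    return draws

# Hypothetical data: 12 events among 40 subjects; discard burn-in draws
draws = metropolis(events=12, n=40)[5000:]
post_mean = sum(draws) / len(draws)   # posterior mean log-odds, near log(12/28)
```

Because the acceptance rule only uses a ratio of posterior densities, the normalizing constant never needs to be computed, which is what makes the step so broadly useful.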
All your data are always missing: incorporating bias due to measurement error into the potential outcomes framework
Epidemiologists often use the potential outcomes framework to cast causal inference as a missing data problem. Here, we demonstrate how bias due to measurement error can be described in terms of potential outcomes and considered in concert with bias from other sources. In addition, we illustrate how acknowledging the uncertainty that arises due to measurement error increases the amount of missing information in causal inference. We use a simple example to show that estimating the average treatment effect requires the investigator to perform a series of hidden imputations based on strong assumptions.
Parametric assumptions equate to hidden observations: comparing the efficiency of nonparametric and parametric models for estimating time to AIDS or death in a cohort of HIV-positive women
Abstract
Background
When conducting a survival analysis, researchers might consider two broad classes of models: nonparametric models and parametric models. While nonparametric models are more flexible because they make few assumptions regarding the shape of the data distribution, parametric models are more efficient. Here we sought to make concrete the difference in efficiency between these two model types using effective sample size.
Methods
We compared cumulative risk of AIDS or death estimated using four survival models – nonparametric, generalized gamma, Weibull, and exponential – and data from 1164 HIV patients who were alive and AIDS-free in 1995. We added pseudo-observations to the sample until the spread of the 95% confidence limits for the nonparametric model became less than that for the parametric models.
Results
We found the 3-parameter generalized gamma to be a good fit to the nonparametric risk curve, but the 1-parameter exponential both underestimated and overestimated the risk at different times. Using the two-year risk as an example, we had to add 354, 593, and 3960 observations for the nonparametric model to be as efficient as the generalized gamma, Weibull, and exponential models, respectively.
Conclusions
These added observations represent the hidden observations underlying the efficiency gained through parametric model form assumptions. If the model is correctly specified, the efficiency gain may be justified, as appeared to be the case for the generalized gamma model. Otherwise, precision will be improved, but at the cost of specification bias, as was the case for the exponential model.
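The efficiency gap behind these "hidden observations" can be reproduced by simulation. The sketch below is not the cohort analysis: it assumes uncensored exponential survival times with hypothetical parameters, and compares the sampling variance of the nonparametric risk estimate (an empirical proportion) with that of a correctly specified parametric (exponential) estimate at two years.

```python
import math
import random

random.seed(2)

def risk_estimates(n, rate=0.1, t=2.0, n_sims=2000):
    """Simulate uncensored exponential survival times and estimate the
    risk F(t) = 1 - exp(-rate * t) nonparametrically (empirical proportion)
    and parametrically (exponential MLE plugged into the CDF)."""
    nonpar, par = [], []
    for _ in range(n_sims):
        times = [random.expovariate(rate) for _ in range(n)]
        nonpar.append(sum(x <= t for x in times) / n)   # empirical F(t)
        rate_hat = n / sum(times)                       # MLE of the rate
        par.append(1.0 - math.exp(-rate_hat * t))
    return nonpar, par

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

nonpar, par = risk_estimates(n=200)
# Relative efficiency: roughly how many times fewer observations the
# (correctly specified) parametric model needs for the same precision
rel_eff = variance(nonpar) / variance(par)
```

When the exponential form is correct, as here by construction, the parametric estimate's variance is several times smaller; misspecification would instead trade that precision for bias.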
Remdesivir and COVID-19
The Panel on Antiretroviral Guidelines for Adults and Adolescents with HIV and the American Association for the Study of Liver Diseases guidelines for hepatitis C virus treatment suggest that combination therapy for severe acute respiratory syndrome coronavirus 2 infection will outperform single drugs.
Transportability without positivity: a synthesis of statistical and simulation modeling
When estimating an effect of an action with a randomized or observational
study, that study is often not a random sample of the desired target
population. Instead, estimates from that study can be transported to the target
population. However, transportability methods generally rely on a positivity
assumption, such that all relevant covariate patterns in the target population
are also observed in the study sample. Strict eligibility criteria,
particularly in the context of randomized trials, may lead to violations of
this assumption. Two common approaches to address positivity violations are
restricting the target population and restricting the relevant covariate set.
As neither of these restrictions are ideal, we instead propose a synthesis of
statistical and simulation models to address positivity violations. We propose
corresponding g-computation and inverse probability weighting estimators. The
restriction and synthesis approaches to addressing positivity violations are
contrasted with a simulation experiment and an illustrative example in the
context of sexually transmitted infection testing uptake. In both cases, the
proposed synthesis approach accurately addressed the original research question
when paired with a thoughtfully selected simulation model. Neither restriction approach was able to accurately address the motivating question.
As public health decisions must often be made with imperfect target population
information, model synthesis is a viable approach given a combination of
empirical data and external information based on the best available knowledge.
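For context on the estimators named above, the following is a minimal sketch of ordinary g-computation transport in the simple case where positivity does hold (a single binary covariate W observed at both levels in the study sample). It is not the proposed statistical and simulation model synthesis, and every name and parameter value is hypothetical.

```python
import random

random.seed(3)

# Hypothetical populations: a binary covariate W shifts the outcome, and
# the study sample and target population differ only in Pr(W = 1).
def simulate(n, p_w):
    data = []
    for _ in range(n):
        w = 1 if random.random() < p_w else 0
        y = 1.0 + 2.0 * w + random.gauss(0.0, 1.0)   # outcome model
        data.append((w, y))
    return data

study = simulate(5000, p_w=0.3)    # study over-represents W = 0
target = simulate(5000, p_w=0.7)   # target population of interest

def cond_mean(data, w):
    ys = [y for wi, y in data if wi == w]
    return sum(ys) / len(ys)

# G-computation transport: estimate E[Y | W = w] in the study, then
# standardize over the target population's covariate distribution
mu0, mu1 = cond_mean(study, 0), cond_mean(study, 1)
p1_target = sum(w for w, _ in target) / len(target)
transported = (1 - p1_target) * mu0 + p1_target * mu1
```

The positivity problem the abstract addresses arises exactly when one of the `cond_mean` calls has no study observations to average over; the synthesis approach fills that gap with an external simulation model rather than by restriction.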
An Illustration of Inverse Probability Weighting to Estimate Policy-Relevant Causal Effects
Traditional epidemiologic approaches allow us to compare counterfactual outcomes under 2 exposure distributions, usually 100% exposed and 100% unexposed. However, to estimate the population health effect of a proposed intervention, one may wish to compare factual outcomes under the observed exposure distribution to counterfactual outcomes under the exposure distribution produced by an intervention. Here, we used inverse probability weights to compare the 5-year mortality risk under observed antiretroviral therapy treatment plans to the 5-year mortality risk that would have been observed under an intervention in which all patients initiated therapy immediately upon entry into care among patients positive for human immunodeficiency virus in the US Centers for AIDS Research Network of Integrated Clinical Systems multisite cohort study between 1998 and 2013. Therapy-naïve patients (n = 14,700) were followed from entry into care until death, loss to follow-up, or censoring at 5 years or on December 31, 2013. The 5-year cumulative incidence of mortality was 11.65% under observed treatment plans and 10.10% under the intervention, yielding a risk difference of −1.57% (95% confidence interval: −3.08, −0.06). Comparing outcomes under the intervention with outcomes under observed treatment plans provides meaningful information about the potential consequences of new US guidelines to treat all patients with human immunodeficiency virus regardless of CD4 cell count under actual clinical conditions.
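The weighting idea can be illustrated with simulated data. The toy version below is not the cohort analysis: it uses one binary confounder W, known treatment probabilities as the weights (in practice the propensities would be estimated), and hypothetical risks. It contrasts the factual risk under observed treatment with the weighted estimate of risk under an "everyone treated" plan.

```python
import random

random.seed(4)

# Hypothetical cohort: confounder W raises both treatment uptake and risk,
# while treatment A lowers risk by 5 percentage points.
n = 20000
data = []
for _ in range(n):
    w = 1 if random.random() < 0.5 else 0
    a = 1 if random.random() < (0.8 if w else 0.3) else 0
    y = 1 if random.random() < (0.10 + 0.10 * w - 0.05 * a) else 0
    data.append((w, a, y))

# Factual risk under the observed treatment plan: a plain mean of Y
observed_risk = sum(y for _, _, y in data) / n

# Counterfactual risk if everyone initiated treatment: reweight the
# treated by 1 / Pr(A = 1 | W), using the true propensities for simplicity
def propensity(w):
    return 0.8 if w else 0.3

num = sum(y / propensity(w) for w, a, y in data if a == 1)
den = sum(1.0 / propensity(w) for w, a, y in data if a == 1)
intervention_risk = num / den              # Hajek-style weighted mean

risk_difference = intervention_risk - observed_risk
```

The comparison mirrors the abstract's contrast: the factual risk is taken as observed, and only the counterfactual "treat all" risk requires weighting.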
Occupational Radon Exposure and Lung Cancer Mortality: Estimating Intervention Effects Using the Parametric g-Formula
Traditional regression analysis techniques used to estimate associations between occupational radon exposure and lung cancer focus on estimating the effect of cumulative radon exposure on lung cancer, while public health interventions are typically based on regulating radon concentration rather than workers’ cumulative exposure. Moreover, estimating the direct effect of cumulative occupational exposure on lung cancer may be difficult in situations vulnerable to the healthy worker survivor bias.
Accounting for Misclassified Outcomes in Binary Regression Models Using Multiple Imputation With Internal Validation Data
Outcome misclassification is widespread in epidemiology, but methods to account for it are rarely used. We describe the use of multiple imputation to reduce bias when validation data are available for a subgroup of study participants. This approach is illustrated using data from 308 participants in the multicenter Herpetic Eye Disease Study between 1992 and 1998 (48% female; 85% white; median age, 49 years). The odds ratio comparing the acyclovir group with the placebo group on the gold-standard outcome (physician-diagnosed herpes simplex virus recurrence) was 0.62 (95% confidence interval (CI): 0.35, 1.09). We masked ourselves to physician diagnosis except for a 30% validation subgroup used to compare methods. Multiple imputation (odds ratio (OR) = 0.60; 95% CI: 0.24, 1.51) was compared with naive analysis using self-reported outcomes (OR = 0.90; 95% CI: 0.47, 1.73), analysis restricted to the validation subgroup (OR = 0.57; 95% CI: 0.20, 1.59), and direct maximum likelihood (OR = 0.62; 95% CI: 0.26, 1.53). In simulations, multiple imputation and direct maximum likelihood had greater statistical power than did analysis restricted to the validation subgroup, yet all 3 provided unbiased estimates of the odds ratio. The multiple-imputation approach was extended to estimate risk ratios using log-binomial regression. Multiple imputation has advantages regarding flexibility and ease of implementation for epidemiologists familiar with missing data methods.
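A minimal sketch of the imputation idea follows. All numbers are hypothetical (80% sensitivity, 90% specificity, a 30% validation subgroup, a true OR of 0.5), and a saturated cell-proportion model of Pr(Y = 1 | A, Y*) stands in for a fitted imputation regression; it is not the study's analysis.

```python
import math
import random

random.seed(5)

# Hypothetical trial: treatment A halves the odds of the true outcome Y;
# the self-reported outcome Ystar has 80% sensitivity, 90% specificity.
n = 4000
rows = []
for _ in range(n):
    a = random.random() < 0.5
    p_y = 0.30 if not a else 0.15 / 0.85          # baseline odds halved
    y = random.random() < p_y
    ystar = random.random() < (0.80 if y else 0.10)
    validated = random.random() < 0.30            # 30% validation subgroup
    rows.append((a, y, ystar, validated))

def odds_ratio(pairs):
    # 2x2 odds ratio for (a, y) pairs; 0.5 added per cell to avoid zeros
    cells = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}
    for a, y in pairs:
        cells[(int(a), int(y))] += 1
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])

# Imputation model: Pr(Y = 1 | A, Ystar), estimated in the validation rows
valid_rows = [r for r in rows if r[3]]
p_true = {}
for a in (0, 1):
    for s in (0, 1):
        match = [y for ai, y, si, _ in valid_rows if ai == a and si == s]
        p_true[(a, s)] = (sum(match) + 0.5) / (len(match) + 1.0)

log_ors = []
for _ in range(20):                               # M = 20 imputations
    pairs = []
    for a, y, ystar, validated in rows:
        if validated:
            pairs.append((a, y))                  # keep gold-standard Y
        else:                                     # draw Y from the model
            pairs.append((a, random.random() < p_true[(a, ystar)]))
    log_ors.append(math.log(odds_ratio(pairs)))
mi_log_or = sum(log_ors) / len(log_ors)           # pooled point estimate
```

As in the abstract, the validated rows keep their gold-standard outcome and only the non-validated rows are imputed; Rubin's rules would combine the between- and within-imputation variances for the interval estimate.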