38 research outputs found

    INVERSE PROBABILITY WEIGHTING AND OUTCOME REGRESSION APPROACHES IN CAUSAL INFERENCE AND SURVEY SAMPLING

    Survey sampling and causal inference share much of the same theoretical foundation. Both fields commonly use estimation methods that rely on randomization-based or prediction-based inferential paradigms, and inverse-probability weighting (IPW) and outcome regression methods are common in both fields (Lohr, 2010; Hernan and Robins, 2020). IPW estimators are used in conjunction with marginal structural models (MSMs) to estimate causal effects from observational studies by controlling for confounding. The parametric g-formula is an outcome regression approach utilized to make causal estimates in the presence of confounding by directly modeling the outcome as a function of the exposure and confounding variables and then integrating over the distribution of the confounders. IPW estimators are fundamental in survey sampling, as they appropriately account for each unit's probability of selection within a finite population and can be further adjusted to account for nonresponding units and undercoverage of the target population. Under a prediction-based inferential paradigm, outcome regression is used to impute outcomes for units not selected into the sample based on data from sampled units. We develop and compare methods based on IPW and outcome regression with applications in survey sampling and causal inference. Our first paper develops methods to estimate the number of HIV-positive persons incarcerated in North Carolina jails. Study data are derived from record-linkage techniques and are incomplete. Survey sampling methods are used to adjust estimates from a portion of counties to make state-level estimates that are representative of all counties. An IPW estimator is compared with an estimator based on outcome regression in simulations and with preliminary study data. 
A common technique for sample size determination for complex sample surveys is to make use of the design effect, the ratio of the variance of an estimator under a complex sample design to the variance of the estimator under a simple random sample (Kish, 1965). Design effects allow researchers to calculate sample sizes under the simpler design and then inflate them to account for the use of weights in the analysis. In our second paper, we extend the theory of design effects to causal inference. The design effect approximation can be used to design causal studies that will be analyzed using MSMs with IPW to control for confounding. MSMs, the parametric g-formula, and doubly robust estimators are commonly used to make causal estimates for observational studies when the outcome of interest is continuous, binary, or categorical. In our third paper, we provide a theoretical justification for the use of these methods when the outcome is a count. We consider methods to account for overdispersion, zero-inflation, and data heaping, a common type of measurement error for count data. We present estimators for causal rate ratios along with their properties and compare the three classes of estimators via simulations. We demonstrate these methods using data from the Women's Interagency HIV Study to assess the effect of incarceration on the number of sexual partners in the subsequent six-month period. Doctor of Public Health
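The sample-size inflation described in this abstract can be sketched in a few lines. The abstract does not state the thesis's own design effect approximation for causal studies, so this sketch assumes Kish's well-known approximation for unequal weights, deff ≈ n Σw² / (Σw)². The weights, the function names, and the starting sample size of 200 are all hypothetical.

```python
import math


def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weights.

    deff = n * sum(w^2) / (sum(w))^2, which equals 1 when all weights are equal.
    """
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def inflated_sample_size(n_srs, deff):
    """Inflate a sample size computed under simple random sampling by deff."""
    return math.ceil(n_srs * deff)


# Hypothetical weights, e.g. anticipated inverse probability weights.
weights = [1.0, 1.0, 2.0, 4.0]
deff = kish_design_effect(weights)

# Hypothetical sample size from a standard (unweighted) power calculation.
n_required = inflated_sample_size(200, deff)
print(deff, n_required)
```

More variable weights give a larger design effect and therefore a larger required sample size; equal weights leave the simple-random-sample calculation unchanged.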

    Assessing COVID-19 Vaccine Effectiveness in Observational Studies via Nested Trial Emulation

    Observational data are often used to estimate real-world effectiveness and durability of coronavirus disease 2019 (COVID-19) vaccines. A sequence of nested trials can be emulated to draw inference from such data while minimizing selection bias, immortal time bias, and confounding. Typically, when nested trial emulation (NTE) is employed, effect estimates are pooled across trials to increase statistical efficiency. However, such pooled estimates may lack a clear interpretation when the treatment effect is heterogeneous across trials. In the context of COVID-19, vaccine effectiveness quite plausibly will vary over calendar time due to newly emerging variants of the virus. This manuscript considers an NTE inverse probability weighted estimator of vaccine effectiveness that may vary over calendar time, time since vaccination, or both. Statistical testing of the trial effect homogeneity assumption is considered. Simulation studies are presented examining the finite-sample performance of these methods under a variety of scenarios. The methods are used to estimate vaccine effectiveness against COVID-19 outcomes using observational data on over 120,000 residents of Abruzzo, Italy during 2021. Comment: 27 pages, 2 figures

    A tale of two studies: Study design and our understanding of SARS-CoV-2 seroprevalence

    The COVID-19 pandemic is arguably the most important public health crisis of the last century. To date, infections with the SARS-CoV-2 virus have caused nearly 300,000 deaths in the United States alone [1], while also contributing to substantial excess morbidity and mortality from delayed and deferred care [2]. In addition to the direct and indirect health impacts, policies intended to limit the spread of the disease have resulted in large-scale disruptions to education systems, economic activity, and social networks. Put simply, the COVID-19 pandemic has impacted the daily lives of nearly all Americans in a way that no other health crisis has in our lifetimes.

    Transportability without positivity: a synthesis of statistical and simulation modeling

    When estimating an effect of an action with a randomized or observational study, that study is often not a random sample of the desired target population. Instead, estimates from that study can be transported to the target population. However, transportability methods generally rely on a positivity assumption, such that all relevant covariate patterns in the target population are also observed in the study sample. Strict eligibility criteria, particularly in the context of randomized trials, may lead to violations of this assumption. Two common approaches to address positivity violations are restricting the target population and restricting the relevant covariate set. As neither of these restrictions is ideal, we instead propose a synthesis of statistical and simulation models to address positivity violations. We propose corresponding g-computation and inverse probability weighting estimators. The restriction and synthesis approaches to addressing positivity violations are contrasted with a simulation experiment and an illustrative example in the context of sexually transmitted infection testing uptake. In both cases, the proposed synthesis approach accurately addressed the original research question when paired with a thoughtfully selected simulation model. Neither of the restriction approaches was able to accurately address the motivating question. As public health decisions must often be made with imperfect target population information, model synthesis is a viable approach given a combination of empirical data and external information based on the best available knowledge.
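For orientation, the following sketch shows the ordinary transport problem that this abstract starts from, in the simple case where positivity holds. It is not the authors' synthesis estimator. It assumes a single binary covariate, a known target covariate distribution, and a hypothetical toy data set; the names `study`, `target_px`, `g_computation`, and `ipw` are illustrative only.

```python
from collections import defaultdict

# Hypothetical study records: (covariate x, action a, outcome y).
study = (
    [(0, 1, 1)] * 2 + [(0, 1, 0)] * 2    # x=0, a=1: risk 0.50
    + [(0, 0, 1)] * 1 + [(0, 0, 0)] * 3  # x=0, a=0: risk 0.25
    + [(1, 1, 1)] * 2                    # x=1, a=1: risk 1.00
    + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 1  # x=1, a=0: risk 0.50
)

# Assumed covariate distribution in the target population.
target_px = {0: 0.5, 1: 0.5}


def g_computation(study, target_px, a):
    """Average stratum-specific outcome means over the target distribution of x."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, arm, y in study:
        if arm == a:
            sums[x] += y
            counts[x] += 1
    # Division by zero here means a target stratum has no study data for this
    # action, i.e. positivity is violated.
    return sum(target_px[x] * sums[x] / counts[x] for x in target_px)


def ipw(study, target_px, a):
    """Weighted (Hajek) mean using sampling weights times inverse action probabilities."""
    n = len(study)
    n_x, n_xa = defaultdict(int), defaultdict(int)
    for x, arm, _ in study:
        n_x[x] += 1
        n_xa[(x, arm)] += 1
    num = den = 0.0
    for x, arm, y in study:
        if arm != a:
            continue
        sampling_weight = target_px[x] / (n_x[x] / n)   # target vs. study share of x
        action_prob = n_xa[(x, a)] / n_x[x]             # P(A = a | x) in the study
        w = sampling_weight / action_prob
        num += w * y
        den += w
    return num / den


rd_gcomp = g_computation(study, target_px, 1) - g_computation(study, target_px, 0)
rd_ipw = ipw(study, target_px, 1) - ipw(study, target_px, 0)
print(rd_gcomp, rd_ipw)
```

With one discrete covariate and saturated models the two estimators coincide. If a covariate pattern in `target_px` never appears in `study`, neither function can be evaluated, which is the situation the restriction and synthesis approaches are meant to handle.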

    Estimating SARS-CoV-2 seroprevalence

    Governments and public health authorities use seroprevalence studies to guide responses to the COVID-19 pandemic. Seroprevalence surveys estimate the proportion of individuals who have detectable SARS-CoV-2 antibodies. However, serologic assays are prone to misclassification error, and non-probability sampling may induce selection bias. In this paper, non-parametric and parametric seroprevalence estimators are considered that address both challenges by leveraging validation data and assuming equal probabilities of sample inclusion within covariate-defined strata. Both estimators are shown to be consistent and asymptotically normal, and consistent variance estimators are derived. Simulation studies are presented comparing the estimators over a range of scenarios. The methods are used to estimate severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) seroprevalence in New York City, Belgium, and North Carolina.
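A minimal sketch of how the two corrections named in this abstract can be combined: direct standardization over covariate-defined strata, followed by the standard Rogan-Gladen adjustment for assay misclassification. This is an assumed, simplified form and may differ from the estimators derived in the paper. All counts, population shares, and function names below are hypothetical.

```python
def rogan_gladen(raw_prevalence, sensitivity, specificity):
    """Correct an apparent prevalence for imperfect sensitivity and specificity."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("assay must perform better than chance")
    adjusted = (raw_prevalence + specificity - 1.0) / denom
    # Truncate to the unit interval, since the raw correction can fall outside it.
    return min(1.0, max(0.0, adjusted))


def standardized_seroprevalence(strata, sensitivity, specificity):
    """Standardize stratum-specific positivity to population shares, then correct.

    strata: list of (population_share, n_tested, n_positive) tuples, with
    sample inclusion assumed equally likely within each stratum.
    """
    raw = sum(share * n_pos / n_tested for share, n_tested, n_pos in strata)
    return rogan_gladen(raw, sensitivity, specificity)


# Hypothetical validation data: 90 of 100 known positives test positive,
# 98 of 100 known negatives test negative.
sensitivity = 90 / 100
specificity = 98 / 100

# Hypothetical survey data for two strata.
strata = [(0.4, 200, 30), (0.6, 100, 10)]
estimate = standardized_seroprevalence(strata, sensitivity, specificity)
print(round(estimate, 4))
```

Here the standardized apparent prevalence is 0.12, and the misclassification correction moves it to 0.10 / 0.88, about 0.114.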

    Social isolation and psychological distress among southern U.S. college students in the era of COVID-19

    Background: College students are at heightened risk for negative psychological outcomes due to COVID-19. We examined the prevalence of psychological distress and its association with social isolation among public university students in the southern United States. Methods: A cross-sectional survey was emailed to all University of North Carolina-Chapel Hill students in June 2020 and was open for two weeks. Students self-reported if they were self-isolating none, some, most, or all of the time. Validated screening instruments were used to assess clinically significant symptoms of depression, loneliness, and increased perceived stress. The data were weighted to the complete student population. Results: 7,012 completed surveys were included. Almost two-thirds (64%) of the students reported clinically significant depressive symptoms and 65% were categorized as lonely. An estimated 64% of students reported self-isolating most or all of the time. Compared to those self-isolating none of the time, students self-isolating some of the time were 1.78 (95% CI 1.37, 2.30) times as likely to report clinically significant depressive symptoms, and students self-isolating most or all of the time were 2.12 (95% CI 1.64, 2.74) and 2.27 (95% CI 1.75, 2.94) times as likely to report clinically significant depressive symptoms, respectively. Similar associations between self-isolation and loneliness and perceived stress were observed. Conclusions: The prevalence of adverse mental health indicators among this sample of university students in June 2020 was exceptionally high. University responses to the COVID-19 pandemic should prioritize student mental health and prepare a range of support services to mitigate mental health consequences as the pandemic continues to evolve.

    Malaria prevalence and long-lasting insecticidal net use in rural western Uganda: results of a cross-sectional survey conducted in an area of highly variable malaria transmission intensity.

    BACKGROUND: Long-lasting insecticidal nets (LLINs) remain a cornerstone of malaria control, but strategies to sustain universal coverage and high rates of use are not well-defined. A more complete understanding of context-specific factors, including transmission intensity and access to health facilities, may inform sub-district distribution approaches and tailored messaging campaigns. METHODS: A cross-sectional survey of 2190 households was conducted in a single sub-county of western Uganda that experiences highly variable malaria transmission intensity. The survey was carried out approximately 3 years after the most recent mass distribution campaign. At each household, study staff documented reported LLIN use and source among children 2 to 10 years of age and performed a malaria rapid diagnostic test. Elevation and distance to the nearest health facility were estimated for each household. Associations between parasite prevalence and LLIN use were estimated from log binomial regression models with elevation and distance to clinic being the primary variables of interest. RESULTS: Overall, 6.8% (148 of 2170) of children 2-10 years of age had a positive RDT result, yielding a weighted estimate of 5.8% (95% confidence interval [CI] 5.4-6.2%). There was substantial variability in the positivity rates among villages, with the highest elevation villages having lower prevalence than lowest-elevation villages (p < .001). Only 64.7% (95% CI 64.0-65.5%) of children were reported to have slept under a LLIN the previous night. Compared to those living < 1 km from a health centre, households at ≥ 2 km were less likely to report the child sleeping under a LLIN (RR 0.86, 95% CI 0.83-0.89, p < .001). Households located farther from a health centre received a higher proportion of LLINs from government distributions compared to households living closer to health centres.
CONCLUSIONS: LLIN use and sourcing were correlated with household elevation and estimated distance to the nearest health facility. The findings suggest that current facility-based distribution strategies are limited in their reach. More frequent mass distribution campaigns and complementary approaches are likely required to maintain universal LLIN coverage and high rates of use among children in rural Uganda.

    North Carolina public school teachers’ contact patterns and mask use within and outside of school during the pre-vaccine phase of the COVID-19 pandemic

    Background: Teachers are central to school-associated transmission networks, but little is known about their behavioral patterns during the COVID-19 pandemic. Methods: We conducted a cross-sectional survey of 700 North Carolina public school teachers in four districts open to in-person learning in November-December 2020 (pre-COVID-19 vaccines). We assessed indoor and outdoor time spent, numbers of people encountered at 94%) reported wearing masks inside school, stores, and salons; intermediate percentages (∼50%-85%) inside places of worship, bars/restaurants, and recreational settings; and few (<25%) in their or others’ homes. Approximately half reported daily close contact with students. Conclusions: As schools reopened in the COVID-19 pandemic, potential transmission opportunities arose through close contacts within and outside of school, along with suboptimal mask use by teachers and/or those around them. Our granular estimates underscore the importance of multi-layered mitigation strategies and can inform interventions and mathematical models addressing school-associated transmission.