A Fundamental Equivalence between Randomized Experiments and Observational Studies
A fundamental probabilistic equivalence between randomized experiments and observational studies is presented. Given a detailed scenario, the reader is asked to consider which of two possible study designs provides more information regarding the expected difference in an outcome due to a time-fixed treatment. A general solution is described, and a particular worked example is also provided. A mathematical proof is given in the appendix. The demonstrated equivalence helps to clarify common ground between randomized experiments and observational studies, and to provide a foundation for considering both the design and interpretation of studies
Group testing for severe acute respiratory syndrome-coronavirus 2 to enable rapid scale-up of testing and real-time surveillance of incidence
High-throughput molecular testing for severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) may be enabled by group testing, in which pools of specimens are screened and individual specimens are tested only after a pool tests positive. Several laboratories have recently published examples of pooling strategies applied to SARS-CoV-2 specimens, but overall guidance on efficient pooling strategies is lacking. Therefore, we developed a model of the efficiency and accuracy of specimen pooling algorithms based on available data on SARS-CoV-2 viral dynamics. For a fixed number of tests, we estimate that programs using group testing could screen 2-20 times as many specimens compared with individual testing, increase the total number of true positive infections identified, and improve the positive predictive value of results. We compare outcomes that may be expected in different testing situations and provide general recommendations for group testing implementation. A free, publicly available Web calculator is provided to help inform laboratory decisions on SARS-CoV-2 pooling algorithms
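As a rough illustration of why pooling stretches a fixed testing budget, consider simple two-stage (Dorfman) pooling: with a perfectly sensitive and specific test and independent specimens, the expected number of tests per specimen is 1/n + 1 - (1 - p)^n for pool size n and prevalence p. The following is a minimal sketch under those simplifying assumptions, not the authors' Web calculator.

```python
# Minimal sketch, assuming a perfectly sensitive and specific test and
# independent specimens; this is NOT the authors' Web calculator.

def tests_per_specimen(p: float, n: int) -> float:
    """Expected tests per specimen under two-stage (Dorfman) pooling.

    One pool test is always run; if the pool is positive, which happens with
    probability 1 - (1 - p)**n, all n members are retested individually.
    """
    return 1.0 / n + (1.0 - (1.0 - p) ** n)

if __name__ == "__main__":
    for p in (0.001, 0.01, 0.05, 0.10):
        best_n = min(range(2, 51), key=lambda n: tests_per_specimen(p, n))
        gain = 1.0 / tests_per_specimen(p, best_n)
        print(f"prevalence {p:5.3f}: best pool size {best_n:2d}, "
              f"~{gain:.1f}x specimens screened per test")
```

As the loop shows, the efficiency gain grows as prevalence falls, which is why pooling is most attractive for low-prevalence screening.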
The Authors Respond
We read with keen interest Cinelli and Pearl’s response to our letter. A key difference in our approaches can be appreciated by examining the first line of each of our derivations
A practical example demonstrating the utility of single-world intervention graphs
Causal diagrams have become widespread in epidemiologic research. Recently developed single-world intervention graphs explicitly connect the potential outcomes framework of causal inference with causal diagrams. Here, we provide a practical example demonstrating how single-world intervention graphs can supplement traditional causal diagrams
Comparing Parametric, Nonparametric, and Semiparametric Estimators: The Weibull Trials
We use simple examples to show how the bias and standard error of an estimator depend in part on the type of estimator chosen from among parametric, nonparametric, and semiparametric candidates. We estimated the cumulative distribution function in the presence of missing data with and without an auxiliary variable. Simulation results mirrored theoretical expectations about the bias and precision of candidate estimators. Specifically, parametric maximum likelihood estimators performed best but must be "omnisciently" correctly specified. An augmented inverse probability-weighted (IPW) semiparametric estimator performed best among candidate estimators that were not omnisciently correct. In one setting, the augmented IPW estimator reduced the standard error by nearly 30%, compared with a standard Horvitz-Thompson IPW estimator; such a standard error reduction is equivalent to doubling the sample size. These results highlight the gains and losses that can be incurred when model assumptions are made in any analysis
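The contrast between a Horvitz-Thompson IPW estimator and an augmented IPW estimator of a distribution function can be seen in a small simulation. The following is a minimal sketch under assumed data-generating and model choices (a Weibull-type outcome missing at random given an auxiliary variable X), not the authors' simulation code.

```python
# Minimal simulation sketch, assuming Y is missing at random given an auxiliary
# variable X; parameter values and model forms are illustrative, not the
# authors' "Weibull trials" code.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, t = 2000, 1.0
x = rng.normal(size=n)
y = rng.weibull(1.5, size=n) * np.exp(0.3 * x)      # outcome associated with X
p_obs = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))      # observation prob. depends on X
r = rng.binomial(1, p_obs)                          # r = 1 if Y is observed

# Estimate the observation probabilities with a logistic regression on X.
design = sm.add_constant(x)
pi = sm.GLM(r, design, family=sm.families.Binomial()).fit().predict(design)

ind = (y <= t).astype(float)                        # indicator whose mean is F(t)

# Horvitz-Thompson IPW estimator of F(t).
f_ipw = np.mean(r * ind / pi)

# Augmented IPW: add an outcome model for E[1(Y <= t) | X] fit to the observed.
m_fit = sm.GLM(ind[r == 1], design[r == 1], family=sm.families.Binomial()).fit()
m = m_fit.predict(design)
f_aipw = np.mean(r * ind / pi - (r - pi) / pi * m)

print(f"IPW estimate of F({t}) = {f_ipw:.3f}, AIPW estimate = {f_aipw:.3f}")
```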
Identifying and estimating effects of sustained interventions under parallel trends assumptions
Many research questions in public health and medicine concern sustained interventions in populations defined by substantive priorities. Existing methods to answer such questions typically require a measured covariate set sufficient to control confounding, which can be questionable in observational studies. Difference-in-differences methods rely instead on the parallel trends assumption, allowing for some types of time-invariant unmeasured confounding. However, most existing difference-in-differences implementations are limited to point treatments in restricted subpopulations. We derive identification results for population effects of sustained treatments under parallel trends assumptions. In particular, in settings where all individuals begin follow-up with exposure status consistent with the treatment plan of interest but may deviate at later times, a version of Robins' g-formula identifies the intervention-specific mean under the stable unit treatment value assumption, positivity, and parallel trends. We develop consistent, asymptotically normal estimators based on inverse-probability weighting, outcome regression, and a doubly robust estimator based on targeted maximum likelihood. Simulation studies confirm theoretical results and support the use of the proposed estimators at realistic sample sizes. As an example, the methods are used to estimate the effect of a hypothetical federal stay-at-home order on all-cause mortality during the COVID-19 pandemic in spring 2020 in the United States
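For orientation, the identification logic of parallel trends in the familiar two-period, point-treatment case (a standard special case, not the paper's general sustained-treatment result) can be written as follows; the notation here is an assumption made for this sketch.

```latex
% Notation (assumed for this sketch): A is a point treatment, Y_0 and Y_1 the
% pre- and post-period outcomes, and Y_1^{a} the potential outcome under a.
% Parallel trends assumption:
E[Y_1^{a=0} - Y_0 \mid A=1] = E[Y_1^{a=0} - Y_0 \mid A=0]
% which, together with consistency, identifies the average effect in the treated:
E[Y_1^{a=1} - Y_1^{a=0} \mid A=1]
  = \{E[Y_1 \mid A=1] - E[Y_0 \mid A=1]\} - \{E[Y_1 \mid A=0] - E[Y_0 \mid A=0]\}
```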
Randomization inference with general interference and censoring
Interference occurs between individuals when the treatment (or exposure) of one individual affects the outcome of another individual. Previous work on causal inference methods in the presence of interference has focused on the setting where it is a priori assumed that there is “partial interference,” in the sense that individuals can be partitioned into groups wherein there is no interference between individuals in different groups. Bowers et al. (2012, Political Anal, 21, 97–124) and Bowers et al. (2016, Political Anal, 24, 395–403) consider randomization-based inferential methods that allow for more general interference structures in the context of randomized experiments. In this paper, extensions of Bowers et al. that allow for failure time outcomes subject to right censoring are proposed. Permitting right-censored outcomes is challenging because standard randomization-based tests of the null hypothesis of no treatment effect assume that whether an individual is censored does not depend on treatment. The proposed extension of Bowers et al. to allow for censoring entails adapting the method of Wang et al. (2010, Biostatistics, 11, 676–692) for two-sample survival comparisons in the presence of unequal censoring. The methods are examined via simulation studies and utilized to assess the effects of cholera vaccination in an individually randomized trial of 73 000 children and women in Matlab, Bangladesh
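To fix ideas, randomization-based testing of the sharp null proceeds by re-drawing the treatment assignment vector and recomputing a test statistic. The sketch below shows the basic version without interference or censoring, which the article extends; the toy data and the difference-in-means statistic are illustrative choices, not the article's test statistic.

```python
# Minimal sketch of randomization inference for the sharp null of no treatment
# effect, without interference or censoring (the article's extensions handle
# both); the toy data and difference-in-means statistic are illustrative.
import numpy as np

def randomization_p_value(y, z, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the sharp null of no effect for anyone.

    Under the sharp null the outcomes are fixed, so re-drawing the treatment
    assignment vector generates the exact null distribution of the statistic.
    """
    rng = np.random.default_rng(seed)
    y, z = np.asarray(y, float), np.asarray(z, int)
    observed = y[z == 1].mean() - y[z == 0].mean()
    stats = np.empty(n_perm)
    for b in range(n_perm):
        z_perm = rng.permutation(z)                # re-randomize the labels
        stats[b] = y[z_perm == 1].mean() - y[z_perm == 0].mean()
    return float(np.mean(np.abs(stats) >= np.abs(observed)))

y = np.array([2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 1.5, 2.2])
z = np.array([0, 1, 0, 1, 1, 1, 0, 0])
print(randomization_p_value(y, z))
```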
Assessing exposure effects on gene expression
In observational genomics data sets, there is often confounding of the effect of an exposure on gene expression. To adjust for confounding when estimating the exposure effect, a common approach involves including potential confounders as covariates with the exposure in a regression model of gene expression. However, when the exposure and confounders interact to influence gene expression, the fitted regression model does not necessarily estimate the overall effect of the exposure. In these instances, inverse probability weighting (IPW) and the parametric g-formula are straightforward to apply and yield consistent effect estimates. IPW can readily be integrated into a genomics data analysis pipeline with upstream data processing and normalization, while the g-formula can be implemented by making simple alterations to the regression model. The regression, IPW, and g-formula approaches to exposure effect estimation are compared herein using simulations; advantages and disadvantages of each approach are explored. The methods are applied to a case study estimating the effect of current smoking on gene expression in adipose tissue
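A toy sketch of the contrast (not the authors' analysis code): when exposure A and confounder L interact, the coefficient on A from the interaction model is the effect only at L = 0, whereas the parametric g-formula standardizes the fitted model over L and IPW reweights by the estimated probability of the observed exposure. The data-generating values and model forms below are illustrative assumptions for a single gene.

```python
# Toy sketch, assuming a single gene, a binary exposure A, and one binary
# confounder L that interacts with A; this is not the authors' pipeline.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
L = rng.binomial(1, 0.4, n)
A = rng.binomial(1, 0.3 + 0.4 * L)                       # exposure depends on L
Y = 1.0 + 0.5 * A + 0.8 * L + 0.6 * A * L + rng.normal(0, 1, n)
df = pd.DataFrame({"A": A, "L": L, "Y": Y})

# Parametric g-formula: fit the interaction model, then standardize over L.
fit = smf.ols("Y ~ A * L", data=df).fit()
effect_gformula = (fit.predict(df.assign(A=1)).mean()
                   - fit.predict(df.assign(A=0)).mean())

# IPW: weight each subject by the inverse probability of the observed exposure.
ps = smf.logit("A ~ L", data=df).fit(disp=False).predict(df)
w = np.where(A == 1, 1.0 / ps, 1.0 / (1.0 - ps))
effect_ipw = (np.average(Y[A == 1], weights=w[A == 1])
              - np.average(Y[A == 0], weights=w[A == 0]))

print(f"g-formula: {effect_gformula:.3f}, IPW: {effect_ipw:.3f}")
# The naive coefficient on A (the effect at L = 0 only) is fit.params["A"].
```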
Unique Molecular Identifiers and Multiplexing Amplicons Maximize the Utility of Deep Sequencing To Critically Assess Population Diversity in RNA Viruses
Next-generation sequencing (NGS)/deep sequencing has become an important tool in the study of viruses. The use of unique molecular identifiers (UMI) can overcome the limitations of PCR errors and PCR-mediated recombination and reveal the true sampling depth of a viral population being sequenced in an NGS experiment. This approach to enhancing sequence data makes deep sequencing an ideal tool to study both high- and low-abundance drug resistance mutations and, more generally, to explore the genetic structure of viral populations. Central to the UMI/Primer ID approach is the creation of a template consensus sequence (TCS) for each genome sequenced. Here we describe a series of experiments to validate several aspects of the Multiplexed Primer ID (MPID) sequencing approach using the MiSeq platform. We have evaluated how multiplexing of cDNA synthesis and amplicons affects the sampling depth of the viral population for each individual cDNA and amplicon, to understand the trade-off between broader genome coverage and maximal sequencing depth. We have validated reproducibility of the MPID assay in the detection of minority mutations in viral genomes. We have also examined the determinants that allow sequencing reads of PCR recombinants to contaminate the final TCS data set and show how such contamination can be limited. Finally, we provide several examples where we have applied MPID to analyze features of minority variants and describe limits on their detection in viral populations of HIV-1 and SARS-CoV-2 to demonstrate the generalizable utility of this approach with any RNA virus
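Conceptually, TCS construction groups reads by their UMI/Primer ID and takes a position-wise majority vote, discarding UMIs with too few reads so that PCR and sequencing errors are voted out. The following is a minimal illustrative sketch, not the published Primer ID pipeline; the read-count cutoff and toy reads are assumptions.

```python
# Minimal illustrative sketch of TCS construction, not the published Primer ID
# pipeline; the read-count cutoff and toy reads below are assumptions.
from collections import Counter

def template_consensus(reads_by_umi, min_reads=3):
    """reads_by_umi maps each UMI/Primer ID to a list of equal-length reads."""
    consensus = {}
    for umi, reads in reads_by_umi.items():
        if len(reads) < min_reads:                 # too few reads to call a template
            continue
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0]   # majority base at each position
            for column in zip(*reads)
        )
    return consensus

reads = {
    "AACGT": ["ACGTA", "ACGTA", "ACCTA", "ACGTA"],  # one PCR error is voted out
    "GGTCA": ["ACGTT", "ACGTT"],                    # below cutoff, no TCS called
}
print(template_consensus(reads))                    # {'AACGT': 'ACGTA'}
```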
The Authors Respond
We welcome the discussion by Huitfeldt and Stensrud on our recent article on generalizing study results. One assumption we listed in the set of sufficient conditions for generalizability was exchangeability between the study sample and the target population, perhaps conditional on a set of covariates
- …