
    Design-Based Methods for the Analysis of Modern Randomized Experiments

    Randomized experiments are increasingly prevalent across a variety of fields, particularly in the social sciences and medicine. This is due in part to their reputation as the "gold standard" for establishing causal relationships. The proliferation of randomized experiments has resulted in a variety of challenges at a time when large data sets are becoming more common. For some experiments, a large number of pretreatment covariates are available for each participant. It is common to make adjustments for small imbalances in these baseline covariates when analyzing the results of a randomized experiment. Traditional covariate adjustment methods such as linear regression can perform poorly or fail entirely when the number of covariates is large. This can be addressed by first performing model selection, but doing so may raise concerns about data snooping and the validity of post-selection inferences. Several authors have suggested specifying the statistical analysis in advance to address this issue. However, it may not be clear ahead of time which covariates to use for making adjustments, or whether covariate adjustment will even be helpful. To address this concern, we propose a flexible covariate adjustment method, the LOOP ("Leave-One-Out Potential outcomes") estimator. This method allows for automatic variable selection, so that we do not need to know ahead of time which variables to use. In addition, the method is unbiased under the Neyman-Rubin model and generally performs at least as well as the unadjusted estimator. This alleviates concerns that the adjustment could harm the performance of the treatment effect estimate. Covariate imbalance can also be addressed using study design. In paired experiments, participants are grouped into pairs with similar characteristics, and one observation from each pair is randomly assigned to treatment.
While this study design is often successful in balancing the treatment and control groups, it may still be possible to improve precision using covariate adjustment. We build on the LOOP estimator and propose a design-based covariate adjustment method for paired experiments. This method addresses a unique trade-off that exists for paired experiments, where it can be unclear to what extent to account for the paired structure. By addressing this trade-off, the method has the potential to improve over existing methods. Modern randomized experiments may be accompanied by a large amount of auxiliary data, such as related observational data. Sample sizes of randomized experiments are often limited due to practical constraints. However, sample sizes for the auxiliary data can be large. We propose a covariate adjustment method that allows us to use observational data sets to make adjustments to the experimental data without bias from confounding variables leaking into our analysis. Our method also adjusts for the covariates within the randomized experiment itself, and automatically interpolates between the adjustment made using the experimental covariates and the observational data set. Finally, we propose a method for high-dimensional classification. In this method, we have the predictors in a data set compete in a "tournament" until they have been combined into a single predictor. From a computational perspective, this method is a natural fit to be used within the LOOP estimator when the outcome is binary; however, it can also be used more generally. The method shares several of the features used within the covariate adjustment methods, such as the use of a leave-one-out procedure to improve performance and interpolation between competing predictors.

PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/169869/1/jameswu_1.pd
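
To make the leave-one-out idea behind the LOOP estimator concrete, here is a minimal Python sketch. The abstract does not spell out the estimator, so everything specific here is an assumption: a Bernoulli design with a known assignment probability `p`, imputation of each participant's two potential outcomes from the other participants only, and plain least squares as the imputation model (the abstract mentions automatic variable selection, which this sketch does not attempt). The function name `loop_estimate` is hypothetical.

```python
import numpy as np

def loop_estimate(y, t, X, p=0.5):
    """Leave-one-out covariate-adjusted estimate of the average treatment effect.

    Assumes each unit is independently assigned to treatment with known
    probability p (an assumption of this sketch, not stated in the abstract).
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=int)
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    n = len(y)
    tau = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i
        imputed = {}
        for arm in (0, 1):
            # Fit on the *other* units in this arm, then predict for unit i.
            rows = others & (t == arm)
            beta, *_ = np.linalg.lstsq(X[rows], y[rows], rcond=None)
            imputed[arm] = X[i] @ beta
        # Weighted combination of the imputed treated and control outcomes.
        m_i = (1 - p) * imputed[1] + p * imputed[0]
        # Inverse-probability weight with the sign of the observed arm.
        u_i = 1 / p if t[i] == 1 else -1 / (1 - p)
        tau[i] = (y[i] - m_i) * u_i
    return tau.mean()

# Usage on noiseless synthetic data with a constant effect of 1.5.
x = np.linspace(0.0, 1.0, 40)
t = np.tile([0, 1], 20)
y = 2.0 + 3.0 * x + 1.5 * t
estimate = loop_estimate(y, t, x)
```

Because unit i's own outcome and assignment never enter its imputed value, the imputation is independent of that unit's treatment assignment, which is what keeps the estimate unbiased under the assumed design. On the noiseless linear data above, the sketch recovers the effect of 1.5 up to floating-point error.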