Estimation of treatment effects in observational studies by recovering the assignment probabilities and the population model
In observational studies, units are assigned to treatments with unknown probabilities. Consequently, estimation and comparison of treatment effects based on the empirical distributions of the response under the various treatments can be biased, since units exposed to one treatment could differ in important but unknown characteristics from units exposed to other treatments. In this article we study the plausibility of analyzing observational data by deriving the parametric distribution of the observed response under a given treatment as a function of the distribution that would be obtained under a strongly ignorable assignment, and the assignment process, which is modeled as a function of the observed data (the response and covariate values). The approach is justified by showing that the sample distribution of the observed responses is identifiable under some general conditions. The goodness of fit of this distribution can be tested by using standard test statistics since it refers to the observed data, but we also develop a new test. The proposed approach also allows testing the assumptions underlying the use of methods that employ instrumental variables, or methods that use propensity scores with a given set of covariates. We assess the performance of the proposed approach and compare it to existing approaches using data collected in the year 2000 by OECD for the Programme for International Student Assessment (PISA). In the present application we compare students' scores in mathematics between public and private schools in Ireland and conclude, somewhat surprisingly, that the public schools perform better than the private schools. This finding is supported by one of the existing methods as well.
Estimating spillovers using imprecisely measured networks
In many experimental contexts, whether and how network interactions impact
the outcome of interest for both treated and untreated individuals are key
concerns. Network data are often assumed to perfectly represent these possible
interactions. This paper considers the problem of estimating treatment effects
when measured connections are, instead, a noisy representation of the true
spillover pathways. We show that existing methods, using the potential outcomes
framework, yield biased estimators in the presence of this mismeasurement. We
develop a new method, using a class of mixture models, that can account for
missing connections and discuss its estimation via the Expectation-Maximization
algorithm. We check our method's performance by simulating experiments on real
network data from 43 villages in India. Finally, we use data from a previously
published study to show that estimates using our method are more robust to the
choice of network measure.
Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning
There is growing interest in estimating and analyzing heterogeneous treatment
effects in experimental and observational studies. We describe a number of
meta-algorithms that can take advantage of any supervised learning or
regression method in machine learning and statistics to estimate the
Conditional Average Treatment Effect (CATE) function. Meta-algorithms build on
base algorithms---such as Random Forests (RF), Bayesian Additive Regression
Trees (BART) or neural networks---to estimate the CATE, a function that the
base algorithms are not designed to estimate directly. We introduce a new
meta-algorithm, the X-learner, that is provably efficient when the number of
units in one treatment group is much larger than in the other, and can exploit
structural properties of the CATE function. For example, if the CATE function
is linear and the response functions in treatment and control are Lipschitz
continuous, the X-learner can still achieve the parametric rate under
regularity conditions. We then introduce versions of the X-learner that use RF
and BART as base learners. In extensive simulation studies, the X-learner
performs favorably, although none of the meta-learners is uniformly the best.
In two persuasion field experiments from political science, we demonstrate how
our new X-learner can be used to target treatment regimes and to shed light on
underlying mechanisms. A software package is provided that implements our
methods.
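
The abstract does not spell out the X-learner's steps, so the following is only a minimal sketch of the structure as it is commonly described: fit one outcome model per group, impute individual treatment effects by crossing each group's outcomes with the other group's model, regress the imputed effects on the covariates, and blend the two resulting CATE estimates with a propensity weight. Several things here are assumptions made for illustration: the base learner is plain least squares (not RF or BART), the propensity is a known constant, the data are simulated, and the names `fit_linear` and `x_learner` are hypothetical and not taken from the authors' software package.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns a prediction function.
    Stands in for any base learner (assumption: the paper uses RF or BART)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef

def x_learner(X, y, w, propensity=0.5):
    """Sketch of an X-learner with a constant, known propensity score."""
    X0, y0 = X[w == 0], y[w == 0]   # control units
    X1, y1 = X[w == 1], y[w == 1]   # treated units

    # Stage 1: separate outcome models for control and treatment.
    mu0 = fit_linear(X0, y0)
    mu1 = fit_linear(X1, y1)

    # Stage 2: imputed individual effects, using the other group's model.
    d1 = y1 - mu0(X1)               # treated: observed minus predicted control
    d0 = mu1(X0) - y0               # control: predicted treated minus observed
    tau1 = fit_linear(X1, d1)
    tau0 = fit_linear(X0, d0)

    # Stage 3: propensity-weighted combination of the two CATE estimates.
    return lambda Z: propensity * tau0(Z) + (1.0 - propensity) * tau1(Z)

# Simulated, deliberately unbalanced experiment (about 10% treated).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = (rng.random(n) < 0.1).astype(int)
true_cate = 1.0 + 2.0 * X[:, 0]     # linear CATE, chosen for the example
y = X[:, 1] + w * true_cate + rng.normal(scale=0.1, size=n)

cate_hat = x_learner(X, y, w, propensity=0.1)(X)
print("mean absolute CATE error:", np.mean(np.abs(cate_hat - true_cate)))
```

With a small propensity, the combination puts most of its weight on `tau1`, whose imputed effects rely on the control-group model fitted on the larger sample; this is one informal way to read the abstract's claim about unbalanced treatment groups.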
Integer polyhedra for program analysis
Polyhedra are widely used in model checking and abstract interpretation. Polyhedral analysis is effective when the relationships between variables are linear, but suffers from imprecision when it is necessary to take into account the integrality of the represented space. Imprecision also arises when non-linear constraints occur. Moreover, in terms of tractability, even a space defined by linear constraints can become unmanageable owing to the excessive number of inequalities. Thus it is useful to identify those inequalities whose omission has the least impact on the represented space. This paper shows how these issues can be addressed in a novel way by growing the integer hull of the space and approximating the number of integral points within a bounded polyhedron.
Stratification Trees for Adaptive Randomization in Randomized Controlled Trials
This paper proposes an adaptive randomization procedure for two-stage
randomized controlled trials. The method uses data from a first-wave experiment
in order to determine how to stratify in a second wave of the experiment, where
the objective is to minimize the variance of an estimator for the average
treatment effect (ATE). We consider selection from a class of stratified
randomization procedures which we call stratification trees: these are
procedures whose strata can be represented as decision trees, with differing
treatment assignment probabilities across strata. By using the first wave to
estimate a stratification tree, we simultaneously select which covariates to
use for stratification, how to stratify over these covariates, as well as the
assignment probabilities within these strata. Our main result shows that using
this randomization procedure with an appropriate estimator results in an
asymptotic variance which is minimal in the class of stratification trees.
Moreover, the results we present are able to accommodate a large class of
assignment mechanisms within strata, including stratified block randomization.
In a simulation study, we find that our method, paired with an appropriate
cross-validation procedure, can improve on ad-hoc choices of stratification. We
conclude by applying our method to the study in Karlan and Wood (2017), where
we estimate stratification trees using the first wave of their experiment.