    Estimation of treatment effects in observational studies by recovering the assignment probabilities and the population model

    In observational studies, units are assigned to treatments with unknown probabilities. Consequently, estimation and comparison of treatment effects based on the empirical distributions of the response under the various treatments can be biased, since units exposed to one treatment could differ in important but unknown characteristics from units exposed to other treatments. In this article we study the plausibility of analyzing observational data by deriving the parametric distribution of the observed response under a given treatment as a function of the distribution that would be obtained under a strongly ignorable assignment, and the assignment process, which is modeled as a function of the observed data (the response and covariate values). The use of this approach is justified by showing that the sample distribution of the observed responses is identifiable under some general conditions. The goodness of fit of this distribution can be tested by using standard test statistics since it refers to the observed data, but we also develop a new test. The proposed approach also allows testing the assumptions underlying the use of methods that employ instrumental variables, or methods that use propensity scores with a given set of covariates. We assess the performance of the proposed approach and compare it to existing approaches using data collected in the year 2000 by OECD for the Programme for International Student Assessment (PISA). In the present application we compare students' scores in mathematics between public and private schools in Ireland and conclude, somewhat surprisingly, that the public schools perform better than the private schools. This finding is also supported by one of the existing methods.

    Estimating spillovers using imprecisely measured networks

    In many experimental contexts, whether and how network interactions impact the outcome of interest for both treated and untreated individuals are key concerns. Network data are often assumed to perfectly represent these possible interactions. This paper considers the problem of estimating treatment effects when measured connections are, instead, a noisy representation of the true spillover pathways. We show that existing methods, using the potential outcomes framework, yield biased estimators in the presence of this mismeasurement. We develop a new method, using a class of mixture models, that can account for missing connections and discuss its estimation via the Expectation-Maximization algorithm. We check our method's performance by simulating experiments on real network data from 43 villages in India. Finally, we use data from a previously published study to show that estimates using our method are more robust to the choice of network measure.

    Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning

    There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of meta-algorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the Conditional Average Treatment Effect (CATE) function. Meta-algorithms build on base algorithms, such as Random Forests (RF), Bayesian Additive Regression Trees (BART), or neural networks, to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a new meta-algorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other, and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the meta-learners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our new X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
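
    As a rough illustration of what a meta-algorithm is, the sketch below implements the simplest variant, commonly called the T-learner: fit one response model on the treated units, one on the control units, and take the difference of their predictions as the CATE estimate. This is not the X-learner from the abstract and not the authors' software package. The base learner here is a one-covariate least-squares line chosen only to keep the example dependency-free, and the names `fit_line` and `t_learner` and the toy data are assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Base learner (illustrative): least-squares fit of y = a + b*x.

    Returns a function that predicts y for a new x.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def t_learner(xs, treated, ys, base_fit=fit_line):
    """T-learner: separate response models per arm, CATE = mu1 - mu0.

    Any supervised learner with the same fit interface could be passed
    as `base_fit`; that interchangeability is the point of a meta-algorithm.
    """
    x1 = [x for x, w in zip(xs, treated) if w]
    y1 = [y for y, w in zip(ys, treated) if w]
    x0 = [x for x, w in zip(xs, treated) if not w]
    y0 = [y for y, w in zip(ys, treated) if not w]
    mu1 = base_fit(x1, y1)  # response model under treatment
    mu0 = base_fit(x0, y0)  # response model under control
    return lambda x: mu1(x) - mu0(x)


# Toy noiseless data (hypothetical): control response is 1 + 2x,
# and treatment adds 3 + x, so the true CATE is 3 + x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
treated = [0, 1, 0, 1, 0, 1]
ys = [1 + 2 * x + (3 + x) * w for x, w in zip(xs, treated)]

cate = t_learner(xs, treated, ys)
print(cate(2.0))
```

    Because the toy data are noiseless and linear in each arm, the estimate recovers the true effect 3 + x up to floating-point error. With unbalanced arms or noisy data, this simple difference of two models is the setting where the abstract claims the X-learner has an advantage.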

    Integer polyhedra for program analysis

    Polyhedra are widely used in model checking and abstract interpretation. Polyhedral analysis is effective when the relationships between variables are linear, but suffers from imprecision when it is necessary to take into account the integrality of the represented space. Imprecision also arises when non-linear constraints occur. Moreover, in terms of tractability, even a space defined by linear constraints can become unmanageable owing to the excessive number of inequalities. Thus it is useful to identify those inequalities whose omission has the least impact on the represented space. This paper shows how these issues can be addressed in a novel way by growing the integer hull of the space and approximating the number of integral points within a bounded polyhedron.
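
    To make the integrality issue concrete, the sketch below counts the integer points of a small bounded polyhedron by brute-force enumeration. It is an exact count over a bounding box, not the approximation technique the paper proposes, and it only scales to tiny examples. The function name `count_integer_points`, the constraint encoding, and the example inequalities are assumptions made for illustration.

```python
from itertools import product


def count_integer_points(constraints, bounds):
    """Count integer points satisfying every linear inequality.

    constraints: list of (coeffs, rhs), each meaning sum(c_i * x_i) <= rhs.
    bounds: list of inclusive (lo, hi) integer ranges, one per variable,
            giving a box that contains the polyhedron.
    """
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    count = 0
    for point in product(*ranges):
        if all(
            sum(c * v for c, v in zip(coeffs, point)) <= rhs
            for coeffs, rhs in constraints
        ):
            count += 1
    return count


# Hypothetical example: 2x + 2y <= 3 with x >= 0 and y >= 0.
# Over the rationals the first inequality allows x + y up to 1.5.
loose = [((2, 2), 3), ((-1, 0), 0), ((0, -1), 0)]

# Over the integers, x + y <= 1 describes the same points more tightly.
tight = [((1, 1), 1), ((-1, 0), 0), ((0, -1), 0)]

box = [(0, 2), (0, 2)]
print(count_integer_points(loose, box), count_integer_points(tight, box))
```

    Both systems contain the same three integer points, (0, 0), (1, 0) and (0, 1), although the second describes a strictly smaller rational region. That gap between the linear relaxation and its integer points is the imprecision the abstract refers to.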

    Stratification Trees for Adaptive Randomization in Randomized Controlled Trials

    This paper proposes an adaptive randomization procedure for two-stage randomized controlled trials. The method uses data from a first-wave experiment in order to determine how to stratify in a second wave of the experiment, where the objective is to minimize the variance of an estimator for the average treatment effect (ATE). We consider selection from a class of stratified randomization procedures which we call stratification trees: these are procedures whose strata can be represented as decision trees, with differing treatment assignment probabilities across strata. By using the first wave to estimate a stratification tree, we simultaneously select which covariates to use for stratification, how to stratify over these covariates, as well as the assignment probabilities within these strata. Our main result shows that using this randomization procedure with an appropriate estimator results in an asymptotic variance which is minimal in the class of stratification trees. Moreover, the results we present are able to accommodate a large class of assignment mechanisms within strata, including stratified block randomization. In a simulation study, we find that our method, paired with an appropriate cross-validation procedure, can improve on ad-hoc choices of stratification. We conclude by applying our method to the study in Karlan and Wood (2017), where we estimate stratification trees using the first wave of their experiment.
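
    The data structure described above can be pictured with a small sketch: a decision tree maps each unit's covariates to a stratum, and each stratum carries its own treatment probability. Everything specific here is hypothetical, including the covariates `age` and `income`, the split points, and the probabilities. The tree is hard-coded rather than estimated from a first wave, and independent coin flips stand in for the within-stratum assignment mechanisms, such as stratified block randomization, that the paper covers.

```python
import random


def stratum(unit):
    """Hypothetical depth-2 stratification tree.

    Returns (stratum label, treatment assignment probability).
    """
    if unit["age"] < 40:
        return ("young", 0.5)
    if unit["income"] < 30000:
        return ("older_low_income", 0.3)
    return ("older_high_income", 0.7)


def assign(units, seed=0):
    """Assign treatment with the probability attached to each unit's stratum.

    Uses independent Bernoulli draws for simplicity (an assumption).
    """
    rng = random.Random(seed)
    assignments = []
    for unit in units:
        label, prob = stratum(unit)
        assignments.append((label, rng.random() < prob))
    return assignments


units = [
    {"age": 25, "income": 50000},
    {"age": 50, "income": 20000},
    {"age": 61, "income": 80000},
]
for label, is_treated in assign(units, seed=1):
    print(label, is_treated)
```

    In the procedure the abstract describes, the splits and the per-stratum probabilities would be chosen from first-wave data to minimize the variance of the ATE estimator, rather than fixed by hand as here.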