Causal inference taking into account unobserved confounding
Causal inference with observational data can be performed under an assumption
of no unobserved confounders (unconfoundedness assumption). There is, however,
seldom clear subject-matter or empirical evidence for such an assumption. We
therefore develop uncertainty intervals for average causal effects based on
outcome regression estimators and doubly robust estimators, which provide
inference taking into account both sampling variability and uncertainty due to
unobserved confounders. In contrast with sampling variation, uncertainty due to
unobserved confounding does not decrease with increasing sample size. The
intervals introduced are obtained by deriving the bias of the estimators due to
unobserved confounders. We are thus also able to contrast the size of the bias
due to violation of the unconfoundedness assumption, with bias due to
misspecification of the models used to explain potential outcomes. This is
illustrated through numerical experiments where bias due to moderate unobserved
confounding dominates misspecification bias for typical situations in terms of
sample size and modeling assumptions. We also study the empirical coverage of
the uncertainty intervals introduced and apply the results to a study of the
effect of regular food intake on health. An R-package implementing the
inference proposed is available.
Comment: Biometrics 201
Bootstrap inference for K-nearest neighbour matching estimators
Abadie and Imbens (2008, Econometrica) showed that classical bootstrap schemes fail to provide correct inference for K-nearest neighbour (KNN) matching estimators of average causal effects. This is an interesting result showing that bootstrap should not be applied without theoretical justification. In this paper, we present two resampling schemes, which we show provide valid inference for KNN matching estimators. We resample "estimated individual causal effects" (EICE), i.e. the difference in outcome between matched pairs, instead of the original data. Moreover, by taking differences in EICEs ordered with respect to the matching covariate, we obtain a bootstrap scheme valid also with heterogeneous causal effects where mild assumptions on the heterogeneity are imposed. We provide proofs of the validity of the proposed resampling based inferences. A simulation study illustrates finite sample properties.
Keywords: Block bootstrap; subsampling; average causal/treatment effect
Bootstrap Inference for K-Nearest Neighbour Matching Estimators
Abadie and Imbens (2008, Econometrica) showed that classical bootstrap schemes fail to provide correct inference for K-nearest neighbour (KNN) matching estimators of average causal effects. This is an interesting result showing that bootstrap should not be applied without theoretical justification. In this paper, we present two resampling schemes, which we show provide valid inference for KNN matching estimators. We resample "estimated individual causal effects" (EICE), i.e. the difference in outcome between matched pairs, instead of the original data. Moreover, by taking differences in EICEs ordered with respect to the matching covariate, we obtain a bootstrap scheme valid also with heterogeneous causal effects where mild assumptions on the heterogeneity are imposed. We provide proofs of the validity of the proposed resampling based inferences. A simulation study illustrates finite sample properties.
Keywords: block bootstrap, subsampling, average causal/treatment effect
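The idea of resampling estimated individual causal effects rather than the original observations can be illustrated with a small Python sketch. This is only a simplified illustration under stated assumptions, not the authors' schemes: it uses 1-nearest-neighbour matching on a single covariate for the treated units, and a plain i.i.d. percentile bootstrap of the EICEs, which ignores the dependence that matching with replacement can induce (the abstract's two schemes are designed to handle such issues). The function names `match_eice` and `bootstrap_interval` are hypothetical.

```python
import random


def match_eice(x, y, treated):
    """Estimated individual causal effects (EICE) for treated units.

    Each treated unit is matched to the control unit with the closest
    covariate value (1-nearest neighbour, with replacement); the EICE is
    the treated outcome minus the matched control outcome.
    """
    controls = [(xi, yi) for xi, yi, t in zip(x, y, treated) if not t]
    eice = []
    for xi, yi, t in zip(x, y, treated):
        if t:
            _, y_match = min(controls, key=lambda c: abs(c[0] - xi))
            eice.append(yi - y_match)
    return eice


def bootstrap_interval(eice, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for the mean EICE from an i.i.d. bootstrap.

    Assumption: the EICEs are resampled as if independent, which is a
    simplification and not one of the schemes proposed in the paper.
    """
    rng = random.Random(seed)
    n = len(eice)
    means = sorted(sum(rng.choices(eice, k=n)) / n for _ in range(n_boot))
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.gauss(0, 1) for _ in range(200)]
    treated = [rng.random() < 0.5 for _ in x]
    # Simulated outcome with a constant effect of 1.0 (illustrative only).
    y = [xi + 1.0 * t + rng.gauss(0, 1) for xi, t in zip(x, treated)]
    effects = match_eice(x, y, treated)
    print(sum(effects) / len(effects), bootstrap_interval(effects))
```

The point of the design is that the resampling unit is the matched difference, so the matching step is carried out once on the original data and is not repeated within each bootstrap replicate.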
Non-Parametric Inference for the Effect of a Treatment on Survival Times with Application in the Health and Social Sciences
In this paper we perform inference on the effect of a treatment on survival times in studies where the treatment assignment is not randomized and the assignment time is not known in advance. Two such studies are discussed: a heart transplant program and a study of Swedish unemployed eligible for employment subsidy. We estimate survival functions on a treated and a control group which are made comparable through matching on observed covariates. The inference is performed by conditioning on waiting time to treatment, that is, the time between entry into the study and treatment. This can be done only when sufficient data is available. In other cases, averaging over waiting times is a possibility, although the classical interpretation of the estimated survival functions is lost unless hazards are not functions of waiting time. To show unbiasedness and to obtain an estimator of the variance, we build on the potential outcome framework, which was introduced by J. Neyman in the context of randomized experiments, and adapted to observational studies by D. B. Rubin. Our approach does not make parametric or distributional assumptions. In particular, we do not assume proportionality of the hazards compared. Small sample performance of the estimator and a derived test of no treatment effect are studied in a Monte Carlo study.
Keywords: potential outcome, observational study, matching estimator, heart transplant, employment subsidy, survival function
Matching estimators for the effect of a treatment on survival times
We perform inference on the effect of a treatment on survival times in studies where the treatment assignment is not randomized and the assignment time is not known in advance. We estimate survival functions on a treated and a control group which are made comparable through matching on observed covariates. The inference is performed by conditioning on waiting time to treatment, that is, the time between entry into the study and treatment. This can be done only when sufficient data is available. In other cases, averaging over waiting times is a possibility, although the classical interpretation of the estimated survival functions is lost unless hazards are not functions of the waiting times. To show unbiasedness and to obtain an estimator of the variance, we build on the potential outcome framework, which was introduced by J. Neyman in the context of randomized experiments, and adapted to observational studies by D. B. Rubin. Our approach does not make parametric or distributional assumptions. In particular, we do not assume proportionality of the hazards compared. Small sample performance of the estimator and a derived test of no treatment effect are studied in a Monte Carlo study.
Keywords: Effect of a treatment; treatment
Sensitivity analysis of the unconfoundedness assumption in observational studies
In observational studies, the estimation of a treatment effect on an outcome of interest is often done by controlling for a set of pre-treatment characteristics (covariates). This yields an unbiased estimator of the treatment effect when the assumption of unconfoundedness holds, that is, there are no unobserved covariates affecting both the treatment assignment and the outcome. This is in general not realistically testable. It is, therefore, important to conduct an analysis of how sensitive the inference is with respect to the unconfoundedness assumption. In this paper we propose a procedure to conduct such a Bayesian sensitivity analysis, where the usual parameter uncertainty and the uncertainty due to the unconfoundedness assumption can be compared. To measure departures from the assumption we use a correlation coefficient which is intuitively comprehensible and ensures that the results of sensitivity analyses made on different evaluation studies are comparable. Our procedure is applied to the Lalonde data and to a study of the effect of college choice on income in Sweden.
Keywords: Causal inference; effects of college choice; propensity score; register data
Non-parametric adjustment for covariates when estimating a treatment effect
We consider a non-parametric model for estimating the effect of a binary treatment on an outcome variable while adjusting for an observed covariate. A naive procedure consists in performing two separate non-parametric regressions of the response on the covariate: one with the treated individuals and the other with the untreated. The treatment effect is then obtained by taking the difference between the two fitted regression functions. This paper proposes a backfitting algorithm which uses all the data for the two above-mentioned non-parametric regressions. We give theoretical results showing that the resulting estimator of the treatment effect can have lower finite sample variance. This improvement may be achieved at the cost of a larger bias. However, in a simulation study we observe that mean squared error is lowest for the proposed backfitting estimator. When more than one covariate is observed our backfitting estimator can still be applied by using the propensity score (probability of being treated for a given setup of the covariates). We illustrate the use of the backfitting estimator in a several-covariate situation with data on a training program for individuals having faced social and economic problems.
Keywords: Analysis of covariance; backfitting algorithm; linear smoothers; propensity score
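The naive procedure described in this abstract (two separate non-parametric regressions, then a difference of the fitted functions) might be sketched in Python as follows. This covers only the naive estimator, not the proposed backfitting algorithm, whose details the abstract does not give. The choice of a Nadaraya-Watson smoother with a Gaussian kernel, the default bandwidth, and the function names are assumptions made for illustration.

```python
import math


def kernel_smooth(x_train, y_train, x0, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x_train]
    total = sum(weights)
    return sum(w * yi for w, yi in zip(weights, y_train)) / total


def naive_treatment_effect(x, y, treated, x0, bandwidth=0.5):
    """Difference of two separately fitted regression functions at x0.

    One smoother uses only the treated individuals, the other only the
    untreated; the estimated effect at x0 is the difference of the fits.
    """
    x_t = [xi for xi, t in zip(x, treated) if t]
    y_t = [yi for yi, t in zip(y, treated) if t]
    x_c = [xi for xi, t in zip(x, treated) if not t]
    y_c = [yi for yi, t in zip(y, treated) if not t]
    fit_treated = kernel_smooth(x_t, y_t, x0, bandwidth)
    fit_control = kernel_smooth(x_c, y_c, x0, bandwidth)
    return fit_treated - fit_control


if __name__ == "__main__":
    # Illustrative data: a smooth baseline plus a constant effect of 2.
    x = [i / 10 for i in range(100)]
    treated = [i % 2 == 1 for i in range(100)]
    y = [math.sin(xi) + 2.0 * t for xi, t in zip(x, treated)]
    print(naive_treatment_effect(x, y, treated, x0=5.0))
```

Each smoother here sees only its own group's data, which is the feature the abstract's backfitting proposal is meant to improve on by using all the data for both regressions.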
Testing exogeneity under distributional misspecification
We propose a general test for exogeneity that is robust against distributional misspecification. The test can also be used to identify other types of misspecifications, such as the presence of a random coefficient. The idea is to sort the data with respect to a variable (a sorting score) and then split the sample into two parts. Using a Chow test, it can then be tested whether estimated parameters in the two sub-samples are different. We give conditions under which it is possible to test for exogeneity by using the (supposedly) endogenous variable as a sorting score. The resulting test does not need instrumental variables. Evidence from a Monte Carlo study and an empirical application suggests that the test can be useful for practitioners.
Keywords: Absenteeism at work; endogeneity; linear exponential family; random effect; random coefficient; selectivity
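The sort-and-split idea in this abstract can be sketched in Python for the simplest case. The sketch assumes a linear regression with an intercept and a single regressor, a split at the median of the sorting score, and the classical Chow F statistic; these choices and the function names are illustrative assumptions, and the conditions under which the paper's test is valid are not reproduced here.

```python
def ols_rss(x, y):
    """Residual sum of squares from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def sorted_split_chow(x, y, score):
    """Chow F statistic after sorting by `score` and splitting in half.

    Returns (f_stat, k, df), where k is the number of regression
    parameters and df the denominator degrees of freedom.
    """
    order = sorted(range(len(x)), key=lambda i: score[i])
    half = len(order) // 2
    low, high = order[:half], order[half:]
    rss_pooled = ols_rss(x, y)
    rss_split = (ols_rss([x[i] for i in low], [y[i] for i in low])
                 + ols_rss([x[i] for i in high], [y[i] for i in high]))
    k = 2  # intercept and slope
    df = len(x) - 2 * k
    f_stat = ((rss_pooled - rss_split) / k) / (rss_split / df)
    return f_stat, k, df


if __name__ == "__main__":
    import random
    rng = random.Random(0)
    x = [rng.gauss(0, 1) for _ in range(200)]
    y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]
    # Use the regressor itself as the sorting score, as in the abstract.
    print(sorted_split_chow(x, y, score=x))
```

Under the standard Chow test assumptions the statistic is compared with an F distribution with k and df degrees of freedom; a large value indicates that the estimated parameters differ between the two sub-samples.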