
Entropy balancing is doubly robust

Authors: Percival Daniel; Zhao Qingyuan
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/11/2016

Covariate balance is a conventional key diagnostic for methods that estimate causal effects from observational studies. Recently, there has been emerging interest in directly incorporating covariate balance into the estimation. We study a recently proposed entropy maximization method called Entropy Balancing (EB), which exactly matches the covariate moments for the different experimental groups in its optimization problem. We show that EB is doubly robust with respect to linear outcome regression and logistic propensity score regression, and that it reaches the asymptotic semiparametric variance bound when both regressions are correctly specified. This is surprising to us because there is no attempt to model the outcome or the treatment assignment in the original proposal of EB. Our theoretical results and simulations suggest that EB is a very appealing alternative to the conventional weighting estimators that estimate the propensity score by maximum likelihood. Comment: 23 pages, 6 figures, Journal of Causal Inference 201
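As a rough illustration of the moment-matching idea described in this abstract, the following minimal Python sketch reweights control units so that their weighted mean of a single covariate equals a target value (for example, the treated-group mean), choosing the maximum-entropy weights that do so. The function name `entropy_balance_1d`, the restriction to one covariate and one moment, and the Newton solver are assumptions made for illustration; they are not taken from the paper.

```python
import math

def entropy_balance_1d(x_control, target_mean, tol=1e-10, max_iter=100):
    """Maximum-entropy weights on control units whose weighted mean of x
    equals target_mean (hypothetical helper, one covariate only).

    Assumes min(x_control) < target_mean < max(x_control); otherwise no
    strictly positive weights can satisfy the constraint.
    """
    centered = [x - target_mean for x in x_control]
    lam = 0.0  # dual variable; weights are proportional to exp(lam * centered)
    for _ in range(max_iter):
        # Subtract the maximum exponent for numerical stability.
        shift = max(lam * c for c in centered)
        unnorm = [math.exp(lam * c - shift) for c in centered]
        total = sum(unnorm)
        w = [u / total for u in unnorm]
        # Gradient of the dual: the remaining imbalance in the weighted mean.
        grad = sum(wi * c for wi, c in zip(w, centered))
        if abs(grad) < tol:
            break
        # Second derivative of the dual: the weighted variance of x.
        hess = sum(wi * c * c for wi, c in zip(w, centered)) - grad ** 2
        lam -= grad / hess  # plain Newton step, no line search
    return w

weights = entropy_balance_1d([0.0, 1.0, 2.0, 3.0], target_mean=2.0)
balanced_mean = sum(w * x for w, x in zip(weights, [0.0, 1.0, 2.0, 3.0]))
```

With several covariates the same exponential-tilting form applies, but the scalar Newton step would need to be replaced by a multivariate solver.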

arXiv.org e-Print Archive

Directory of Open Access Journals

Cross-screening in observational studies that test many hypotheses

Authors: Rosenbaum Paul R.; Small Dylan S.; Zhao Qingyuan
Publication date: 06/03/2017

We discuss observational studies that test many causal hypotheses, either hypotheses about many outcomes or many treatments. To be credible, an observational study that tests many causal hypotheses must demonstrate that its conclusions are neither artifacts of multiple testing nor of small biases from nonrandom treatment assignment. In a sense that needs to be defined carefully, hidden within a sensitivity analysis for nonrandom assignment is an enormous correction for multiple testing: in the absence of bias, it is extremely improbable that multiple testing alone would create an association insensitive to moderate biases. We propose a new strategy called "cross-screening", different from but motivated by recent work of Bogomolov and Heller on replicability. Cross-screening splits the data in half at random, uses the first half to plan a study carried out on the second half, then uses the second half to plan a study carried out on the first half, and reports the more favorable conclusions of the two studies, correcting with the Bonferroni inequality for having done two studies. If the two studies happen to concur, then they achieve Bogomolov-Heller replicability; however, importantly, replicability is not required for strong control of the family-wise error rate, and either study alone suffices for firm conclusions. In randomized studies with a few hypotheses, cross-split screening is not an attractive method when compared with conventional methods of multiplicity control, but it can become attractive when hundreds or thousands of hypotheses are subjected to sensitivity analyses in an observational study. We illustrate the technique by comparing 46 biomarkers in individuals who consume large quantities of fish versus little or no fish. Comment: 33 pages, 2 figures, 5 table
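The split, plan, test, and Bonferroni steps described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: the paper works with sensitivity analyses, whereas this sketch swaps in a plain normal-approximation test of a zero mean on matched-pair differences. The names `z_pvalue` and `cross_screen`, the `screen_level` threshold, and the rule of splitting the per-half level evenly over the selected hypotheses are all hypothetical choices.

```python
import math
import random

def z_pvalue(values):
    """Two-sided p-value for a zero mean (normal approximation; a stand-in
    for the sensitivity analyses used in the paper)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    z = mean / math.sqrt(var / n)
    return math.erfc(abs(z) / math.sqrt(2))

def cross_screen(diffs, alpha=0.05, screen_level=0.1, seed=0):
    """diffs[i][k] is the matched-pair difference for pair i, hypothesis k.
    Returns the sorted indices of hypotheses rejected by either half."""
    rng = random.Random(seed)
    idx = list(range(len(diffs)))
    rng.shuffle(idx)                      # split the pairs in half at random
    half = len(idx) // 2
    halves = (idx[:half], idx[half:])
    n_hyp = len(diffs[0])
    rejected = set()
    # Each half plans the study that is then carried out on the other half.
    for plan, test in (halves, halves[::-1]):
        selected = [k for k in range(n_hyp)
                    if z_pvalue([diffs[i][k] for i in plan]) <= screen_level]
        for k in selected:
            # alpha / 2 is the Bonferroni correction for running two studies;
            # dividing by len(selected) controls multiplicity within a study.
            if z_pvalue([diffs[i][k] for i in test]) <= (alpha / 2) / len(selected):
                rejected.add(k)
    return sorted(rejected)
```

Because only the hypotheses that survive screening on one half are tested on the other, the within-study correction divides by the number selected rather than by the total number of hypotheses, which is where the potential gain over a single Bonferroni correction would come from.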

arXiv.org e-Print Archive

FigShare

Multiple conditional randomization tests

Authors: Zhang Yao; Zhao Qingyuan
Publication date: 07/10/2022

We establish a general sufficient condition for constructing multiple "nearly independent" conditional randomization tests, in the sense that the joint distribution of their p-values is almost uniform under the global null. This property implies that the tests are jointly valid and can be combined using standard methods. Our theory generalizes existing techniques in the literature that use independent treatments, sequential treatments, or post-randomization to construct multiple randomization tests. In particular, it places no condition on the experimental design, allowing for arbitrary treatment variables, assignment mechanisms, and unit interference. The flexibility of this framework is illustrated by developing conditional randomization tests for lagged treatment effects in stepped-wedge randomized controlled trials. A weighted Z-score test is further proposed to maximize the power when the tests are combined. We compare the efficiency and robustness of the commonly used mixed-effect models and the proposed conditional randomization tests using simulated experiments and real trial data. Comment: 34 pages; Part of the original version of this paper can be found at arXiv:2203.1098
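As a hedged illustration of combining p-values from several (nearly) independent tests with a weighted Z-score, the sketch below uses the standard weighted Stouffer combination. The function name `weighted_z_combine` is hypothetical, the weights are taken as given, and the paper's power-maximizing choice of weights is not reproduced here.

```python
import math
from statistics import NormalDist

def weighted_z_combine(p_values, weights):
    """Combine one-sided p-values with a weighted Z-score (Stouffer) test.

    Assumes every p-value lies strictly between 0 and 1 and that the tests
    are independent, or nearly so, under the global null.
    """
    std_normal = NormalDist()
    # Map each p-value to a z-score; small p-values give large positive z.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    # The weighted sum, rescaled to have unit variance under the null.
    combined = (sum(w * z for w, z in zip(weights, z_scores))
                / math.sqrt(sum(w * w for w in weights)))
    return 1.0 - std_normal.cdf(combined)

# Two tests with modest evidence each, the first weighted twice as heavily.
combined_p = weighted_z_combine([0.04, 0.10], weights=[2.0, 1.0])
```

Tests expected to be more informative would receive larger weights; with equal weights this reduces to the unweighted Stouffer method.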

arXiv.org e-Print Archive