Causal Inference Through Potential Outcomes and Principal Stratification: Application to Studies with "Censoring" Due to Death
Causal inference is best understood using potential outcomes. This use is
particularly important in more complex settings, that is, observational studies
or randomized experiments with complications such as noncompliance. The topic
of this lecture, the issue of estimating the causal effect of a treatment on a
primary outcome that is "censored" by death, is another such complication.
For example, suppose that we wish to estimate the effect of a new drug on
Quality of Life (QOL) in a randomized experiment, where some of the patients
die before the time designated for their QOL to be assessed. Another example
with the same structure occurs with the evaluation of an educational program
designed to increase final test scores, which are not defined for those who
drop out of school before taking the test. A further application is to studies
of the effect of job-training programs on wages, where wages are only defined
for those who are employed. The analysis of examples like these is greatly
clarified using potential outcomes to define causal effects, followed by
principal stratification on the intermediate outcomes (e.g., survival).
Comment: This paper is commented on in [math.ST/0612785], [math.ST/0612786],
[math.ST/0612788]. Rejoinder in [math.ST/0612789]. Published at
http://dx.doi.org/10.1214/088342306000000114 in Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org).
Rerandomization to improve covariate balance in experiments
Randomized experiments are the "gold standard" for estimating causal effects,
yet often in practice, chance imbalances exist in covariate distributions
between treatment groups. If covariate data are available before units are
exposed to treatments, these chance imbalances can be mitigated by first
checking covariate balance before the physical experiment takes place. Provided
a precise definition of imbalance has been specified in advance, unbalanced
randomizations can be discarded, followed by a rerandomization, and this
process can continue until a randomization yielding balance according to the
definition is achieved. By improving covariate balance, rerandomization
provides more precise and trustworthy estimates of treatment effects.
Comment: Published at http://dx.doi.org/10.1214/12-AOS1008 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
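The accept/reject loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the balance criterion used here (largest absolute standardized difference in covariate means, compared with a threshold fixed in advance) is a simple stand-in, and the names `max_abs_std_diff`, `rerandomize`, `threshold`, and `max_draws` are hypothetical.

```python
import random
import statistics


def max_abs_std_diff(covariates, assignment):
    """Largest absolute standardized mean difference across covariates.

    Assumed stand-in for the balance criterion; the abstract only requires
    that some precise definition of imbalance be fixed in advance.
    """
    worst = 0.0
    for j in range(len(covariates[0])):
        column = [row[j] for row in covariates]
        treated = [x for x, w in zip(column, assignment) if w == 1]
        control = [x for x, w in zip(column, assignment) if w == 0]
        sd = statistics.pstdev(column)
        if sd == 0:
            continue  # a constant covariate cannot be imbalanced
        diff = abs(statistics.fmean(treated) - statistics.fmean(control)) / sd
        worst = max(worst, diff)
    return worst


def rerandomize(covariates, n_treated, threshold, rng, max_draws=10_000):
    """Draw complete randomizations until one satisfies the balance criterion."""
    n = len(covariates)
    base = [1] * n_treated + [0] * (n - n_treated)
    for draw in range(1, max_draws + 1):
        assignment = base[:]
        rng.shuffle(assignment)  # one complete randomization
        if max_abs_std_diff(covariates, assignment) <= threshold:
            return assignment, draw  # balanced: accept this randomization
        # otherwise discard the draw and rerandomize
    raise RuntimeError("no acceptable randomization found; loosen the threshold")


# Usage with simulated covariates (illustrative data only).
rng = random.Random(0)
covariates = [[rng.gauss(0, 1), rng.gauss(50, 10)] for _ in range(40)]
assignment, draws = rerandomize(covariates, n_treated=20, threshold=0.1, rng=rng)
```

Only covariates enter the loop, never outcomes, which matches the abstract's requirement that balance be checked before the physical experiment takes place.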
Estimating the Causal Effects of Marketing Interventions Using Propensity Score Methodology
Propensity score methods were proposed by Rosenbaum and Rubin [Biometrika 70
(1983) 41--55] as central tools to help assess the causal effects of
interventions. Since their introduction more than two decades ago, they have
found wide application in a variety of areas, including medical research,
economics, epidemiology and education, especially in those situations where
randomized experiments are either difficult to perform, or raise ethical
questions, or would require extensive delays before answers could be obtained.
In the past few years, the number of published applications using propensity
score methods to evaluate medical and epidemiological interventions has
increased dramatically. Nevertheless, thus far, we believe that there have been
few applications of propensity score methods to evaluate marketing
interventions (e.g., advertising, promotions), where the tradition is to use
generally inappropriate techniques, which focus on the prediction of an outcome
from background characteristics and an indicator for the intervention using
statistical tools such as least-squares regression, data mining, and so on.
With these techniques, an estimated parameter in the model is used to estimate
some global "causal" effect. This practice can generate grossly incorrect
answers that can be self-perpetuating: polishing the Ferraris rather than the
Jeeps "causes" them to continue to win more races than the Jeeps, just as
visiting the high-prescribing doctors rather than the
low-prescribing doctors "causes" them to continue to write more
prescriptions. This presentation will take "causality" seriously, not just as
a casual concept implying some predictive association in a data set, and will
illustrate why propensity score methods are generally superior in practice to
the standard predictive approaches for estimating causal effects.
Comment: Published at http://dx.doi.org/10.1214/088342306000000259 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Affinely invariant matching methods with discriminant mixtures of proportional ellipsoidally symmetric distributions
In observational studies designed to estimate the effects of interventions or
exposures, such as cigarette smoking, it is desirable to try to control
background differences between the treated group (e.g., current smokers) and
the control group (e.g., never smokers) on covariates (e.g., age,
education). Matched sampling attempts to effect this control by selecting
subsets of the treated and control groups with similar distributions of such
covariates. This paper examines the consequences of matching using affinely
invariant methods when the covariate distributions are "discriminant mixtures
of proportional ellipsoidally symmetric" (DMPES) distributions, a class herein
defined, which generalizes the ellipsoidal symmetry class of Rubin and Thomas
[Ann. Statist. 20 (1992) 1079--1093]. The resulting generalized results help
indicate why earlier results hold quite well even when the simple assumption of
ellipsoidal symmetry is not met [e.g., Biometrics 52 (1996) 249--264].
Extensions to conditionally affinely invariant matching with conditionally
DMPES distributions are also discussed.
Comment: Published at http://dx.doi.org/10.1214/009053606000000407 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
Asymptotic Theory of Rerandomization in Treatment-Control Experiments
Although complete randomization ensures covariate balance on average, the
chance for observing significant differences between treatment and control
covariate distributions increases with many covariates. Rerandomization
discards randomizations that do not satisfy a predetermined covariate balance
criterion, generally resulting in better covariate balance and more precise
estimates of causal effects. Previous work has derived finite-sample theory
for rerandomization under the assumptions of equal treatment group sizes,
Gaussian covariate and outcome distributions, or additive causal effects, but
not for the general sampling distribution of the difference-in-means estimator
for the average causal effect. To supplement existing results, we develop
asymptotic theory for rerandomization without these assumptions, which reveals
a non-Gaussian asymptotic distribution for this estimator, specifically a
linear combination of a Gaussian random variable and a truncated Gaussian
random variable. This distribution follows because rerandomization affects only
the projection of potential outcomes onto the covariate space but does not
affect the corresponding orthogonal residuals. We also demonstrate that,
compared to complete randomization, rerandomization reduces the asymptotic
sampling variances and quantile ranges of the difference-in-means estimator.
Moreover, our work allows the construction of accurate large-sample confidence
intervals for the average causal effect, thereby revealing further advantages
of rerandomization over complete randomization.
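The shape of the limiting distribution described in this abstract can be written schematically as below. The notation is introduced here for illustration and is not taken from the abstract: $\hat{\tau}$ is the difference-in-means estimator, $\tau$ the average causal effect, $\varepsilon$ a standard Gaussian variable, $L$ a truncated Gaussian variable, and $a, b \ge 0$ unspecified constants. The $\sqrt{n}$ scaling and the independence of $\varepsilon$ and $L$ are assumptions of this sketch.

```latex
\sqrt{n}\,(\hat{\tau} - \tau) \;\Big|\; \text{balance criterion met}
\;\;\dot{\sim}\;\; a\,\varepsilon + b\,L
```

Read against the abstract, the $b\,L$ term corresponds to the projection of the potential outcomes onto the covariate space, which rerandomization constrains, and the $a\,\varepsilon$ term to the orthogonal residuals, which it leaves unaffected.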
For objective causal inference, design trumps analysis
For obtaining causal inferences that are objective, and therefore have the
best chance of revealing scientific truths, carefully designed and executed
randomized experiments are generally considered to be the gold standard.
Observational studies, in contrast, are generally fraught with problems that
compromise any claim for objectivity of the resulting causal inferences. The
thesis here is that observational studies have to be carefully designed to
approximate randomized experiments, in particular, without examining any final
outcome data. Often a candidate data set will have to be rejected as inadequate
because of lack of data on key covariates, or because of lack of overlap in the
distributions of key covariates between treatment and control groups, often
revealed by careful propensity score analyses. Sometimes the template for the
approximating randomized experiment will have to be altered, and the use of
principal stratification can be helpful in doing this. These issues are
discussed and illustrated using the framework of potential outcomes to define
causal effects, which greatly clarifies critical issues.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS187 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).