Estimating individual treatment effect: generalization bounds and algorithms
There is intense interest in applying machine learning to problems of causal
inference in fields such as healthcare, economics and education. In particular,
individual-level causal inference has important applications such as precision
medicine. We give a new theoretical analysis and family of algorithms for
predicting individual treatment effect (ITE) from observational data, under the
assumption known as strong ignorability. The algorithms learn a "balanced"
representation such that the induced treated and control distributions look
similar. We give a novel, simple and intuitive generalization-error bound
showing that the expected ITE estimation error of a representation is bounded
by a sum of the standard generalization-error of that representation and the
distance between the treated and control distributions induced by the
representation. We use Integral Probability Metrics to measure distances
between distributions, deriving explicit bounds for the Wasserstein and Maximum
Mean Discrepancy (MMD) distances. Experiments on real and simulated data show
the new algorithms match or outperform the state-of-the-art.
Comment: Added name "TARNet" to refer to version with alpha = 0. Removed sup
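The IPM term in the bound can be estimated from finite samples of the learned representation. Below is a minimal sketch of the (biased) squared-MMD estimate with an RBF kernel, using NumPy and synthetic 2-D representations; the function names and data are illustrative and not taken from the paper:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix between rows of X and rows of Y.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimate of squared MMD between the samples X and Y.
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2.0 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
treated = rng.normal(0.0, 1.0, size=(200, 2))  # representations of treated units
control = rng.normal(1.0, 1.0, size=(200, 2))  # control units, mean-shifted

print(mmd2(treated, control))  # large when the two distributions differ
print(mmd2(treated, treated))  # exactly zero for identical samples
```

In the algorithms described above, a quantity of this form acts as the penalty that pushes the induced treated and control distributions to look similar in representation space.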
Residual Weighted Learning for Estimating Individualized Treatment Rules
Personalized medicine has received increasing attention among statisticians,
computer scientists, and clinical practitioners. A major component of
personalized medicine is the estimation of individualized treatment rules
(ITRs). Recently, Zhao et al. (2012) proposed outcome weighted learning (OWL)
to construct ITRs that directly optimize the clinical outcome. Although OWL
opens the door to introducing machine learning techniques to optimal treatment
regimes, its finite-sample performance can be poor. In this article, we propose
a general framework, called Residual Weighted Learning (RWL), to improve finite
sample performance. Unlike OWL which weights misclassification errors by
clinical outcomes, RWL weights these errors by residuals of the outcome from a
regression fit on clinical covariates excluding treatment assignment. We
utilize the smoothed ramp loss function in RWL, and provide a difference of
convex (d.c.) algorithm to solve the corresponding non-convex optimization
problem. By estimating residuals with linear models or generalized linear
models, RWL can effectively deal with different types of outcomes, such as
continuous, binary and count outcomes. We also propose variable selection
methods for linear and nonlinear rules, respectively, to further improve the
performance. We show that the resulting estimator of the treatment rule is
consistent. We further obtain a rate of convergence for the difference between
the expected outcome using the estimated ITR and that of the optimal treatment
rule. The performance of the proposed RWL methods is illustrated in simulation
studies and in an analysis of cystic fibrosis clinical trial data.
Comment: 48 pages, 3 figures
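The residual-weighting step can be sketched in a few lines: regress the outcome on clinical covariates excluding treatment, then use the residuals, sign and magnitude, in place of raw outcomes as classification weights. This is a minimal illustration on synthetic data with ordinary least squares; the paper's full method additionally uses the smoothed ramp loss and a d.c. algorithm, both omitted here:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 3))            # clinical covariates
A = rng.choice([-1, 1], size=n)        # randomized treatment assignment
# Outcome: a main effect of the covariates plus a treatment interaction.
Y = X[:, 0] + 0.5 * A * X[:, 1] + rng.normal(scale=0.1, size=n)

# Step 1: fit a regression of Y on covariates only (treatment excluded).
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, Y, rcond=None)
residuals = Y - Xd @ beta

# Step 2: the residuals replace raw outcomes as weights. A negative weight
# with label A is equivalent to a positive weight with the flipped label -A,
# so the weighted classification problem keeps nonnegative weights.
weights = np.abs(residuals)
labels = np.sign(residuals) * A

print(weights[:5], labels[:5])
```

Because the least-squares residuals are centered, roughly half are negative; flipping the label whenever the residual is negative keeps all weights nonnegative while leaving the weighted misclassification objective unchanged.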