Wasserstein Random Forests and Applications in Heterogeneous Treatment Effects
We present new insights into causal inference in the context of Heterogeneous
Treatment Effects by proposing natural variants of Random Forests to estimate
the key conditional distributions. To achieve this, we recast Breiman's
original splitting criterion in terms of Wasserstein distances between
empirical measures. This reformulation indicates that Random Forests are well
adapted to estimate conditional distributions and provides a natural extension
of the algorithm to multivariate outputs. Following the philosophy of Breiman's
construction, we propose some variants of the splitting rule that are
well-suited to the conditional distribution estimation problem. Some
preliminary theoretical connections are established along with various
numerical experiments, which show how our approach may help to conduct more
transparent causal inference in complex situations.
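As a rough illustration of a distance-based splitting rule, the sketch below scores each candidate split by the 1-D Wasserstein-1 distance between the outcome samples of the two children, weighted by n_L * n_R / n^2. That weight mirrors the standard variance-reduction identity, in which the impurity decrease equals (n_L * n_R / n^2) times the squared difference of child means. The function names, the choice of W1, and the weighting are assumptions made for illustration; this is not the paper's exact criterion.

```python
# Sketch only: an illustrative Wasserstein-style splitting score for one
# feature. Names and the exact score are assumptions, not the paper's rule.

def wasserstein_1d(a, b):
    """W1 distance between the empirical measures of samples a and b."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    dist = 0.0
    ia = ib = 0
    for left, right in zip(points, points[1:]):
        # After these loops, ia/len(a) and ib/len(b) are the two empirical
        # CDFs evaluated on the interval [left, right).
        while ia < len(a) and a[ia] <= left:
            ia += 1
        while ib < len(b) and b[ib] <= left:
            ib += 1
        dist += abs(ia / len(a) - ib / len(b)) * (right - left)
    return dist


def best_split(x, y, min_leaf=2):
    """Return (threshold, score) of the best split of y along feature x."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    best = (None, -1.0)
    for k in range(min_leaf, n - min_leaf + 1):
        lo, hi = order[:k], order[k:]
        if x[lo[-1]] == x[hi[0]]:
            continue  # tied feature values cannot be separated
        w1 = wasserstein_1d([y[i] for i in lo], [y[i] for i in hi])
        score = (len(lo) * len(hi) / n ** 2) * w1
        if score > best[1]:
            best = ((x[lo[-1]] + x[hi[0]]) / 2, score)
    return best
```

Because the score compares whole empirical distributions rather than means alone, it can in principle separate children that differ in spread or shape, which is the property the abstract highlights for conditional distribution estimation.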
Estimating individual treatment effect: generalization bounds and algorithms
There is intense interest in applying machine learning to problems of causal
inference in fields such as healthcare, economics and education. In particular,
individual-level causal inference has important applications such as precision
medicine. We give a new theoretical analysis and family of algorithms for
predicting individual treatment effect (ITE) from observational data, under the
assumption known as strong ignorability. The algorithms learn a "balanced"
representation such that the induced treated and control distributions look
similar. We give a novel, simple and intuitive generalization-error bound
showing that the expected ITE estimation error of a representation is bounded
by a sum of the standard generalization-error of that representation and the
distance between the treated and control distributions induced by the
representation. We use Integral Probability Metrics to measure distances
between distributions, deriving explicit bounds for the Wasserstein and Maximum
Mean Discrepancy (MMD) distances. Experiments on real and simulated data show
the new algorithms match or outperform the state-of-the-art.
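The bound described above suggests an objective that adds a distributional-distance penalty to the usual factual prediction error. The sketch below is a minimal version of that idea, assuming a squared linear-kernel MMD (the squared distance between the mean treated and mean control representations) as the distance and a hypothetical weight `alpha`. All function names are illustrative and the constants of the actual bound are not reproduced.

```python
# Sketch only: factual error plus alpha times an imbalance penalty.
# The linear-kernel MMD is one simple choice of distance; names are
# hypothetical and not taken from the authors' code.

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def linear_mmd_sq(treated, control):
    """Squared MMD with a linear kernel: squared distance of group means."""
    mt, mc = mean_vector(treated), mean_vector(control)
    return sum((a - b) ** 2 for a, b in zip(mt, mc))


def balanced_objective(factual_errors, treated, control, alpha=1.0):
    """Mean squared factual error plus alpha times the imbalance penalty."""
    mse = sum(e * e for e in factual_errors) / len(factual_errors)
    return mse + alpha * linear_mmd_sq(treated, control)
```

With `alpha=0.0` the penalty vanishes and only the factual error remains, so the representation is no longer pushed toward balance.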
Learning Counterfactual Representations for Estimating Individual Dose-Response Curves
Estimating what would be an individual's potential response to varying levels
of exposure to a treatment is of high practical relevance for several important
fields, such as healthcare, economics and public policy. However, existing
methods for learning to estimate counterfactual outcomes from observational
data are either focused on estimating average dose-response curves, or limited
to settings with only two treatments that do not have an associated dosage
parameter. Here, we present a novel machine-learning approach towards learning
counterfactual representations for estimating individual dose-response curves
for any number of treatments with continuous dosage parameters with neural
networks. Building on the established potential outcomes framework, we
introduce performance metrics, model selection criteria, model architectures,
and open benchmarks for estimating individual dose-response curves. Our
experiments show that the methods developed in this work set a new
state-of-the-art in estimating individual dose-response.
A Survey of Contextual Optimization Methods for Decision Making under Uncertainty
Recently there has been a surge of interest in the operations research (OR) and
machine learning (ML) communities in combining prediction algorithms and
optimization techniques to solve decision-making problems in the face of
uncertainty. This gave rise to the field of contextual optimization, under
which data-driven procedures are developed to prescribe actions to the
decision-maker that make the best use of the most recently updated information.
A large variety of models and methods have been presented in both OR and ML
literature under a variety of names, including data-driven optimization,
prescriptive optimization, predictive stochastic programming, policy
optimization, (smart) predict/estimate-then-optimize, decision-focused
learning, (task-based) end-to-end learning/forecasting/optimization, etc.
Focusing on single and two-stage stochastic programming problems, this review
article identifies three main frameworks for learning policies from data and
discusses their strengths and limitations. We present the existing models and
methods under a uniform notation and terminology and classify them according to
the three main frameworks identified. Our objective with this survey is to both
strengthen the general understanding of this active field of research and
stimulate further theoretical and algorithmic advancements in integrating ML
and stochastic programming.
Distributionally Robust Machine Learning with Multi-source Data
Classical machine learning methods may lead to poor prediction performance
when the target distribution differs from the source populations. This paper
utilizes data from multiple sources and introduces a group distributionally
robust prediction model defined to optimize an adversarial reward about
explained variance with respect to a class of target distributions. Compared to
classical empirical risk minimization, the proposed robust prediction model
improves the prediction accuracy for target populations with distribution
shifts. We show that our group distributionally robust prediction model is a
weighted average of the source populations' conditional outcome models. We
leverage this key identification result to robustify arbitrary machine learning
algorithms, including, for example, random forests and neural networks. We
devise a novel bias-corrected estimator to estimate the optimal aggregation
weight for general machine-learning algorithms and demonstrate its improvement
in the convergence rate. Our proposal can be seen as a distributionally robust
federated learning approach that is computationally efficient and easy to
implement using arbitrary machine learning base algorithms, satisfies some
privacy constraints, and has a nice interpretation of different sources'
importance for predicting a given target covariate distribution. We demonstrate
the performance of our proposed group distributionally robust method on
simulated and real data with random forests and neural networks as
base-learning algorithms.
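The identification result above says the robust predictor is a weighted average of per-source conditional outcome models. The sketch below shows only that aggregation step, assuming the weights are already given and lie on the probability simplex. The function `aggregate` and its arguments are hypothetical, and the paper's bias-corrected weight estimator is not reproduced here.

```python
# Sketch only: combine fitted per-source models with fixed simplex weights.
# How the weights are estimated is outside the scope of this illustration.

def aggregate(models, weights, tol=1e-9):
    """Return a predictor that is the weighted average of the given models."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per source model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")

    def predict(x):
        # Each model maps a covariate value to a predicted outcome.
        return sum(w * m(x) for w, m in zip(weights, models))

    return predict
```

Any base learner can stand behind each entry of `models`, since only its prediction function is used. The weights themselves can be read as the relative importance of each source for the target population, as the abstract notes.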
Deep Learning of Potential Outcomes
This review systematizes the emerging literature for causal inference using
deep neural networks under the potential outcomes framework. It provides an
intuitive introduction to how deep learning can be used to estimate/predict
heterogeneous treatment effects and extend causal inference to settings where
confounding is non-linear, time varying, or encoded in text, networks, and
images. To maximize accessibility, we also introduce prerequisite concepts from
causal inference and deep learning. The survey differs from other treatments of
deep learning and causal inference in its sharp focus on observational causal
estimation, its extended exposition of key algorithms, and its detailed
tutorials for implementing, training, and selecting among deep estimators in
TensorFlow 2, available at github.com/kochbj/Deep-Learning-for-Causal-Inference.