8 research outputs found
Counterfactual Generation Under Confounding
A machine learning model trained on data with observed or unobserved
confounders can learn spurious correlations and fail to generalize when
deployed. For image classifiers, augmenting a training dataset
using counterfactual examples has been empirically shown to break spurious
correlations. However, the counterfactual generation task itself becomes more
difficult as the level of confounding increases. Existing methods for
counterfactual generation under confounding consider a fixed set of
interventions (e.g., texture, rotation) and are not flexible enough to capture
diverse data-generating processes. Given a causal generative process, we
formally characterize the adverse effects of confounding on any downstream
task and show that the correlation between generative factors (attributes)
quantitatively measures the confounding between them. To
minimize such correlation, we propose a counterfactual generation method that
learns to modify the value of any attribute in an image and generate new images
given a set of observed attributes, even when the dataset is highly confounded.
These counterfactual images are then used to regularize the downstream
classifier so that the learned representations are invariant across
generative factors conditioned on the class label. Our method is
computationally efficient, simple to implement, and works well for any number
of generative factors and confounding variables. Our experimental results on
both synthetic (MNIST variants) and real-world (CelebA) datasets show the
usefulness of our approach.
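
As an illustrative sketch (not the authors' code), the regularization
described above can be thought of as a consistency penalty that ties the
representation of each image to that of its counterfactual with the same
class label. The PyTorch snippet below assumes hypothetical encoder and
classifier modules and a batch in which x_cf[i] is a counterfactual of x[i]:

import torch
import torch.nn.functional as F

def counterfactual_consistency_loss(encoder, classifier, x, x_cf, y, lam=1.0):
    # Task loss on the original images plus a penalty that pulls each
    # representation toward that of its counterfactual, encouraging
    # invariance to the intervened generative factor.
    z, z_cf = encoder(x), encoder(x_cf)
    ce = F.cross_entropy(classifier(z), y)
    inv = F.mse_loss(z, z_cf)
    return ce + lam * inv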
Rethinking Counterfactual Data Augmentation Under Confounding
Counterfactual data augmentation has recently emerged as a method to mitigate
confounding biases in the training data for a machine learning model. These
biases, such as spurious correlations, arise due to various observed and
unobserved confounding variables in the data generation process. In this paper,
we formally analyze how confounding biases impact downstream classifiers and
present a causal viewpoint on the solutions based on counterfactual data
augmentation. We explore how removing confounding biases serves as a means to
learn invariant features, ultimately aiding in generalization beyond the
observed data distribution. Additionally, we present a straightforward yet
powerful algorithm for generating counterfactual images, which effectively
mitigates the influence of confounding effects on downstream classifiers.
Through experiments on MNIST variants and the CelebA datasets, we demonstrate
the effectiveness and practicality of our approach.
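
A minimal sketch of counterfactual data augmentation in this spirit, assuming
a hypothetical generator that returns label-preserving counterfactual images
(an illustration of the general recipe, not the paper's algorithm):

import torch
import torch.nn.functional as F

def train_step(model, optimizer, x, y, generator):
    # Augment the batch with counterfactual images that keep the class
    # label but change a confounded attribute, then train as usual.
    x_cf = generator(x)
    inputs = torch.cat([x, x_cf], dim=0)
    targets = torch.cat([y, y], dim=0)
    loss = F.cross_entropy(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()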
Reliable Off-Policy Learning for Dosage Combinations
Decision-making in personalized medicine, such as cancer therapy or critical
care, often requires choosing dosage combinations, i.e., multiple continuous
treatments. Existing work for this task has modeled the effects of multiple
treatments independently, while estimating the joint effect has received little
attention but comes with non-trivial challenges. In this paper, we propose a
novel method for reliable off-policy learning for dosage combinations. Our
method proceeds in three steps: (1) We develop a tailored neural network
that estimates the individualized dose-response function while accounting for
the joint effect of multiple dependent dosages. (2) We estimate the generalized
propensity score using conditional normalizing flows in order to detect regions
with limited overlap in the shared covariate-treatment space. (3) We present a
gradient-based learning algorithm to find the optimal, individualized dosage
combinations. Here, we ensure reliable estimation of the policy value by
avoiding regions with limited overlap. We finally perform an extensive
evaluation of our method to show its effectiveness. To the best of our
knowledge, ours is the first work to provide a method for reliable off-policy
learning for optimal dosage combinations.
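
To make step (3) concrete, a rough sketch of gradient-based dosage
optimization with an overlap check is given below; dose_response and
gps_density are hypothetical stand-ins for the fitted dose-response network
of step (1) and the conditional-normalizing-flow GPS of step (2):

import torch

def optimal_dosages(dose_response, gps_density, x, d_init,
                    steps=200, lr=0.05, min_density=1e-3):
    # Gradient ascent over a continuous dosage vector d for covariates x.
    d = d_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([d], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        (-dose_response(x, d).sum()).backward()  # maximize predicted outcome
        opt.step()
        with torch.no_grad():
            d.clamp_(0.0, 1.0)                   # keep dosages in valid range
    # Flag solutions that fall in limited-overlap regions as unreliable.
    reliable = gps_density(x, d.detach()) >= min_density
    return d.detach(), reliable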
Causal Effect Inference for Structured Treatments
We address the estimation of conditional average treatment effects (CATEs)
for structured treatments (e.g., graphs, images, texts). Given a weak condition
on the effect, we propose the generalized Robinson decomposition, which (i)
isolates the causal estimand (reducing regularization bias), (ii) allows one to
plug in arbitrary models for learning, and (iii) possesses a quasi-oracle
convergence guarantee under mild assumptions. In experiments with small-world
and molecular graphs we demonstrate that our approach outperforms prior work in
CATE estimation.
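
For context, the classical Robinson decomposition for a single binary
treatment T, which the generalized version extends to structured treatments,
can be written as

Y - m(X) = (T - e(X)) \, \tau(X) + \varepsilon,
\qquad m(X) = \mathbb{E}[Y \mid X], \quad e(X) = \mathbb{E}[T \mid X],

which isolates the causal estimand \tau and yields the quasi-oracle
(R-learner style) objective

\hat{\tau} = \arg\min_{\tau} \sum_i \big[ (Y_i - \hat{m}(X_i))
- (T_i - \hat{e}(X_i)) \, \tau(X_i) \big]^2 .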
Deep Learning of Potential Outcomes
This review systematizes the emerging literature on causal inference using
deep neural networks under the potential outcomes framework. It provides an
intuitive introduction on how deep learning can be used to estimate/predict
heterogeneous treatment effects and extend causal inference to settings where
confounding is non-linear, time varying, or encoded in text, networks, and
images. To maximize accessibility, we also introduce prerequisite concepts from
causal inference and deep learning. The survey differs from other treatments of
deep learning and causal inference in its sharp focus on observational causal
estimation, its extended exposition of key algorithms, and its detailed
tutorials for implementing, training, and selecting among deep estimators in
TensorFlow 2, available at github.com/kochbj/Deep-Learning-for-Causal-Inference.
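
As a flavor of the estimators such a review covers, a two-headed
potential-outcomes network (in the style of TARNet) fits a shared
representation with a separate outcome head per treatment arm. The PyTorch
sketch below is illustrative and is not taken from the linked TensorFlow 2
tutorials:

import torch
import torch.nn as nn

class TwoHeadPONet(nn.Module):
    # Shared representation with separate heads for the potential
    # outcomes under control (t=0) and treatment (t=1).
    def __init__(self, d_in, d_hidden=64):
        super().__init__()
        self.repr = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.head0 = nn.Linear(d_hidden, 1)  # predicts E[Y | X, T=0]
        self.head1 = nn.Linear(d_hidden, 1)  # predicts E[Y | X, T=1]

    def forward(self, x, t):
        z = self.repr(x)
        return torch.where(t.bool().unsqueeze(-1), self.head1(z), self.head0(z))

    def cate(self, x):
        # Estimated conditional average treatment effect.
        z = self.repr(x)
        return (self.head1(z) - self.head0(z)).squeeze(-1)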
Sharp Bounds for Generalized Causal Sensitivity Analysis
Causal inference from observational data is crucial for many disciplines such
as medicine and economics. However, sharp bounds for causal effects under
relaxations of the unconfoundedness assumption (causal sensitivity analysis)
are subject to ongoing research. So far, sharp bounds are restricted
to fairly simple settings (e.g., a single binary treatment). In this paper, we
propose a unified framework for causal sensitivity analysis under unobserved
confounding in various settings. For this, we propose a flexible generalization
of the marginal sensitivity model (MSM) and then derive sharp bounds for a
large class of causal effects. This includes (conditional) average treatment
effects, effects for mediation analysis and path analysis, and distributional
effects. Furthermore, our sensitivity model is applicable to discrete,
continuous, and time-varying treatments. It allows us to interpret the partial
identification problem under unobserved confounding as a distribution shift in
the latent confounders while evaluating the causal effect of interest. In the
special case of a single binary treatment, our bounds for (conditional) average
treatment effects coincide with recent optimality results for causal
sensitivity analysis. Finally, we propose a scalable algorithm to estimate our
sharp bounds from observational data.
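
For a single binary treatment, the standard marginal sensitivity model that
this framework generalizes bounds the odds-ratio distortion that a latent
confounder U may induce:

\Gamma^{-1} \le \frac{e(x) \,/\, (1 - e(x))}{e(x, u) \,/\, (1 - e(x, u))}
\le \Gamma,

where e(x) = P(T = 1 \mid X = x) is the observed propensity score,
e(x, u) = P(T = 1 \mid X = x, U = u) additionally conditions on the latent
confounder, and the sensitivity parameter \Gamma \ge 1 controls how strongly
unconfoundedness may be violated.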
Deep Causal Learning: Representation, Discovery and Inference
Causal learning has attracted much attention in recent years because
causality reveals the essential relationships among variables and indicates
how the world evolves. However, traditional causal learning methods face many
problems and bottlenecks, such as high-dimensional unstructured
variables, combinatorial optimization problems, unknown interventions,
unobserved confounders, selection bias and estimation bias. Deep causal
learning, that is, causal learning based on deep neural networks, brings new
insights for addressing these problems. While many deep learning-based causal
discovery and causal inference methods have been proposed, there is a lack of
reviews exploring the internal mechanisms by which deep learning improves causal
learning. In this article, we comprehensively review how deep learning can
contribute to causal learning by addressing conventional challenges from three
aspects: representation, discovery, and inference. We point out that deep
causal learning is important for extending the theory and broadening the
applications of causal science, and is an indispensable part of general
artificial intelligence. We conclude the article with a summary of open issues
and potential directions for future work.
Data synthesis and adversarial networks: A review and meta-analysis in cancer imaging
Despite technological and medical advances, the detection, interpretation, and
treatment of cancer based on imaging data continue to pose significant
challenges. These include inter-observer variability, class imbalance, dataset
shifts, inter- and intra-tumour heterogeneity, malignancy determination, and
treatment effect uncertainty. Given the recent advancements in image synthesis,
Generative Adversarial Networks (GANs), and adversarial training, we assess the
potential of these technologies to address a number of key challenges of cancer
imaging. We categorise these challenges into (a) data scarcity and imbalance,
(b) data access and privacy, (c) data annotation and segmentation, (d) cancer
detection and diagnosis, and (e) tumour profiling, treatment planning and
monitoring. Based on our analysis of 164 publications that apply adversarial
training techniques in the context of cancer imaging, we highlight multiple
underexplored solutions with research potential. We further contribute the
Synthesis Study Trustworthiness Test (SynTRUST), a meta-analysis framework for
assessing the validation rigour of medical image synthesis studies. SynTRUST is
based on 26 concrete measures of thoroughness, reproducibility, usefulness,
scalability, and tenability. Based on SynTRUST, we analyse 16 of the most
promising cancer imaging challenge solutions and observe high validation rigour
in general, but also identify several desirable improvements. With this work,
we strive to bridge the gap between the needs of the clinical cancer imaging
community and the current and prospective research on data synthesis and
adversarial networks in the artificial intelligence community.