79 research outputs found
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Understanding predictions made by deep neural networks is notoriously
difficult, but also crucial to their dissemination. Like all ML-based methods,
they are only as good as their training data, and can also capture unwanted biases.
While there are tools that can help understand whether such biases exist, they
do not distinguish between correlation and causation, and might be ill-suited
for text-based models and for reasoning about high-level language concepts. A
key problem in estimating the causal effect of a concept of interest on a given
model is that it requires generating counterfactual examples, which is
challenging with existing generation technology. To bridge
that gap, we propose CausaLM, a framework for producing causal model
explanations using counterfactual language representation models. Our approach
is based on fine-tuning deep contextualized embedding models with auxiliary
adversarial tasks derived from the causal graph of the problem. Concretely, we
show that by carefully choosing auxiliary adversarial pre-training tasks,
language representation models such as BERT can effectively learn a
counterfactual representation for a given concept of interest, and be used to
estimate its true causal effect on model performance. A byproduct of our method
is a language representation model that is unaffected by the tested concept,
which can be useful in mitigating unwanted bias ingrained in the data.Comment: Our code and data are available at:
https://amirfeder.github.io/CausaLM/ Under review for the Computational
Linguistics journa
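The adversarial mechanism this abstract describes can be illustrated with a gradient-reversal layer. Below is a minimal, hypothetical PyTorch sketch of removing a concept from a representation; CausaLM itself fine-tunes BERT with auxiliary pre-training tasks derived from the causal graph, so the toy encoder, names, and dimensions here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class CounterfactualEncoder(nn.Module):
    """Toy stand-in for a BERT-style encoder fine-tuned so that a probe
    cannot recover the treated concept from its representation."""
    def __init__(self, d_in=128, d_rep=64, n_concept=2, lam=1.0):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU())
        self.task_head = nn.Linear(d_rep, 1)             # main task (e.g. sentiment)
        self.concept_head = nn.Linear(d_rep, n_concept)  # adversarial concept probe
        self.lam = lam

    def forward(self, x):
        h = self.encoder(x)
        task_logit = self.task_head(h)
        # Gradient reversal: the probe learns to predict the concept, while the
        # reversed gradient pushes the encoder to make the concept unpredictable.
        concept_logit = self.concept_head(GradReverse.apply(h, self.lam))
        return task_logit, concept_logit
```

Minimizing the sum of the task loss and the concept loss then trains the probe to chase the concept while the encoder erases it, approximating a representation in which the concept has been intervened away.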
Adversarial De-confounding in Individualised Treatment Effects Estimation
Observational studies have recently received significant attention from the
machine learning community due to the increasingly available non-experimental
observational data and the limitations of experimental studies, such as
considerable cost, impracticality, and small, less representative samples. In
observational studies, de-confounding is a fundamental problem in
individualised treatment effects (ITE) estimation. This paper proposes
disentangled representations with adversarial training to selectively balance
the confounders in the binary treatment setting for the ITE estimation. The
adversarial training of treatment policy selectively encourages
treatment-agnostic balanced representations for the confounders and helps to
estimate the ITE in the observational studies via counterfactual inference.
Empirical results on synthetic and real-world datasets with varying degrees of
confounding show that our approach outperforms state-of-the-art methods,
achieving lower error in ITE estimation.
Comment: accepted to AISTATS 202
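As a rough illustration of the recipe in this abstract, here is a hypothetical PyTorch sketch: a shared representation, per-treatment outcome heads, and a treatment discriminator trained adversarially so the representation carries no treatment signal. Architecture, losses, and hyperparameters are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdvBalancedITE(nn.Module):
    def __init__(self, d_x=25, d_rep=64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(d_x, d_rep), nn.ELU())  # representation
        self.mu0 = nn.Linear(d_rep, 1)   # outcome head under control
        self.mu1 = nn.Linear(d_rep, 1)   # outcome head under treatment
        self.disc = nn.Linear(d_rep, 1)  # treatment discriminator

    def ite(self, x):
        h = self.phi(x)
        return (self.mu1(h) - self.mu0(h)).squeeze(-1)  # counterfactual contrast

def training_step(model, x, t, y, opt_main, opt_disc, alpha=1.0):
    # opt_disc covers model.disc; opt_main covers phi, mu0, mu1 only.
    # 1) Discriminator: predict treatment from the (frozen) representation.
    d_loss = F.binary_cross_entropy_with_logits(
        model.disc(model.phi(x).detach()).squeeze(-1), t)
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()
    # 2) Encoder + heads: fit factual outcomes while fooling the discriminator,
    #    which selectively balances the confounders across treatment groups.
    h = model.phi(x)
    y_hat = torch.where(t.bool(), model.mu1(h).squeeze(-1),
                        model.mu0(h).squeeze(-1))
    adv = F.binary_cross_entropy_with_logits(model.disc(h).squeeze(-1), t)
    loss = F.mse_loss(y_hat, y) - alpha * adv
    opt_main.zero_grad(); loss.backward(); opt_main.step()
```

The min-max structure is the point: the discriminator gets better at spotting treatment assignment from the representation, and the encoder is rewarded for defeating it, yielding treatment-agnostic balanced representations for counterfactual inference.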
Deep Causal Learning for Robotic Intelligence
This invited review discusses causal learning in the context of robotic
intelligence. The paper introduces psychological findings on causal learning in
human cognition, then surveys traditional statistical approaches to causal
discovery and causal inference. It reviews recent deep causal learning
algorithms with a focus on their architectures and the benefits of using deep
networks, and discusses the gap between deep causal learning and the needs of
robotic intelligence.
A Foundational Framework and Methodology for Personalized Early and Timely Diagnosis
Early diagnosis of diseases holds the potential for deep transformation in
healthcare by enabling better treatment options, improving long-term survival
and quality of life, and reducing overall cost. With the advent of medical big
data, advances in diagnostic tests as well as in machine learning and
statistics, early or timely diagnosis seems within reach. Early diagnosis
research often neglects the potential for optimizing individual diagnostic
paths. To enable personalized early diagnosis, a foundational framework is
needed that delineates the diagnosis process and systematically identifies the
time-dependent value of various diagnostic tests for an individual patient
given their unique characteristics. Here, we propose the first foundational
framework for early and timely diagnosis. It builds on decision-theoretic
approaches to outline the diagnosis process and integrates machine learning and
statistical methodology for estimating the optimal personalized diagnostic
path. To describe the proposed framework as well as possibly other frameworks,
we provide essential definitions.
The development of a foundational framework is necessary for several reasons:
1) formalism provides clarity for the development of decision support tools;
2) observed information can be complemented with estimates of the future patient trajectory;
3) the net benefit of counterfactual diagnostic paths and associated uncertainties can be modeled for individuals;
4) 'early' and 'timely' diagnosis can be clearly defined;
5) a mechanism emerges for assessing the value of technologies in terms of their impact on personalized early diagnosis, resulting health outcomes, and incurred costs.
Finally, we hope that this foundational framework will unlock the
long-awaited potential of timely diagnosis and intervention, leading to
improved outcomes for patients and higher cost-effectiveness for healthcare
systems.
Comment: 10 pages, 2 figures
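To make the decision-theoretic core concrete, below is a minimal worked sketch of the time-dependent value of a single diagnostic test: the expected utility of acting after observing the result, minus the expected utility of acting on the prior alone, minus the test's cost. Every probability, utility, and cost is a made-up illustration, not a number from the paper.

```python
# Hypothetical inputs: prior disease probability, test characteristics, and
# utilities of each action under each disease state (all values made up).
p_disease = 0.10
sensitivity, specificity = 0.85, 0.90
U = {("treat", True): 0.9, ("treat", False): 0.6,
     ("wait", True): 0.2, ("wait", False): 1.0}
test_cost = 0.01  # utility-equivalent cost of performing the test now

def best_expected_utility(p):
    """Expected utility of the best action at disease probability p."""
    return max(p * U[(a, True)] + (1 - p) * U[(a, False)]
               for a in ("treat", "wait"))

def posterior(p, positive):
    """Bayes update of the disease probability given the test result."""
    p_pos = sensitivity * p + (1 - specificity) * (1 - p)
    return sensitivity * p / p_pos if positive \
        else (1 - sensitivity) * p / (1 - p_pos)

# Net value of testing = utility after the test, averaged over results,
# minus acting on the prior alone, minus the cost of the test itself.
p_pos = sensitivity * p_disease + (1 - specificity) * (1 - p_disease)
value_with_test = (p_pos * best_expected_utility(posterior(p_disease, True))
                   + (1 - p_pos) * best_expected_utility(posterior(p_disease, False)))
net_value = value_with_test - best_expected_utility(p_disease) - test_cost
print(f"net value of testing now: {net_value:.4f}")  # > 0 here, so test
```

Repeating this computation over candidate test times and test sequences, with the probabilities supplied by learned models of the patient trajectory, is the kind of personalized diagnostic-path optimization the framework formalizes.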
Bayesian Neural Controlled Differential Equations for Treatment Effect Estimation
Treatment effect estimation in continuous time is crucial for personalized
medicine. However, existing methods for this task are limited to point
estimates of the potential outcomes and ignore uncertainty estimates. Yet
uncertainty quantification is crucial for reliable decision-making in medical
applications. To fill this gap, we propose a novel
Bayesian neural controlled differential equation (BNCDE) for treatment effect
estimation in continuous time. In our BNCDE, the time dimension is modeled
through a coupled system of neural controlled differential equations and neural
stochastic differential equations, where the neural stochastic differential
equations allow for tractable variational Bayesian inference. Thereby, for an
assigned sequence of treatments, our BNCDE provides meaningful posterior
predictive distributions of the potential outcomes. To the best of our
knowledge, ours is the first tailored neural method to provide uncertainty
estimates of treatment effects in continuous time. As such, our method is of
direct practical value for promoting reliable decision-making in medicine.
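A compressed sketch of the mechanism described above: a hidden state driven by the treatment/covariate path (a controlled differential equation), with a distribution over the vector-field weights so that repeated integration yields posterior predictive samples of the outcome. This NumPy toy uses a mean-field Gaussian in place of the paper's neural-SDE posterior; all dimensions and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: hidden state d_z, control path d_x
# (treatment + covariates), T time steps.
d_z, d_x, T = 8, 3, 20

# Toy control path X_t: one binary treatment channel plus two noisy covariates.
X = np.cumsum(rng.normal(0.0, 0.1, size=(T, d_x)), axis=0)
X[:, 0] = (np.arange(T) >= 10).astype(float)  # treatment switched on at t = 10

# Mean-field Gaussian variational posterior over the CDE vector-field weights
# (a stand-in for the neural-SDE posterior used in the paper).
W_mu = rng.normal(0.0, 0.1, size=(d_z * d_x, d_z))
W_logvar = np.full((d_z * d_x, d_z), -4.0)
W0 = rng.normal(0.0, 0.5, size=(d_z, d_x))  # initial embedding z_0 = tanh(W0 X_0)
w_out = rng.normal(0.0, 0.3, size=d_z)      # linear outcome head (kept fixed)

def integrate_cde(W):
    """Euler scheme for dz = f_W(z) dX, with f_W(z) = tanh(W z) reshaped."""
    z = np.tanh(W0 @ X[0])
    for t in range(T - 1):
        f = np.tanh(W @ z).reshape(d_z, d_x)  # vector field value at z
        z = z + f @ (X[t + 1] - X[t])         # update driven by the control path
    return z

# Posterior predictive over the outcome: sample weights, integrate, read out.
samples = []
for _ in range(200):
    W = W_mu + np.exp(0.5 * W_logvar) * rng.normal(size=W_mu.shape)
    samples.append(w_out @ integrate_cde(W))
samples = np.array(samples)

print(f"posterior predictive mean: {samples.mean():.3f}")
print(f"95% credible interval: [{np.quantile(samples, 0.025):.3f}, "
      f"{np.quantile(samples, 0.975):.3f}]")
```

Because the weights, not just the outputs, are uncertain, the spread of the samples reflects model uncertainty about how the treatment path moves the hidden state, which is exactly what point-estimate methods discard.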
Deep Causal Learning: Representation, Discovery and Inference
Causal learning has attracted much attention in recent years because
causality reveals the essential relationship between things and indicates how
the world progresses. However, there are many problems and bottlenecks in
traditional causal learning methods, such as high-dimensional unstructured
variables, combinatorial optimization problems, unknown intervention,
unobserved confounders, selection bias and estimation bias. Deep causal
learning, that is, causal learning based on deep neural networks, brings new
insights for addressing these problems. While many deep learning-based causal
discovery and causal inference methods have been proposed, there is a lack of
reviews exploring the internal mechanism of deep learning to improve causal
learning. In this article, we comprehensively review how deep learning can
contribute to causal learning by addressing conventional challenges from three
aspects: representation, discovery, and inference. We point out that deep
causal learning is important for the theoretical extension and application
expansion of causal science and is also an indispensable part of general
artificial intelligence. We conclude the article with a summary of open issues
and potential directions for future work
Estimating average causal effects from patient trajectories
In medical practice, treatments are selected based on the expected causal
effects on patient outcomes. Here, the gold standard for estimating causal
effects is the randomized controlled trial; however, such trials are costly and
sometimes even unethical. Instead, medical practice is increasingly interested
in estimating causal effects among patient (sub)groups from electronic health
records, that is, observational data. In this paper, we aim to estimate the
average causal effect (ACE) from observational data (patient trajectories) that
are collected over time. For this, we propose DeepACE: an end-to-end deep
learning model. DeepACE leverages the iterative G-computation formula to adjust
for the bias induced by time-varying confounders. Moreover, we develop a novel
sequential targeting procedure which ensures that DeepACE has favorable
theoretical properties, i.e., is doubly robust and asymptotically efficient. To
the best of our knowledge, this is the first work that proposes an end-to-end
deep learning model tailored for estimating time-varying ACEs. We evaluate
DeepACE in extensive experiments, confirming that it achieves state-of-the-art
performance. We further provide a case study of patients suffering from low
back pain to demonstrate that DeepACE generates important and meaningful
findings for clinical practice. Our work enables practitioners to develop
effective treatment recommendations based on population effects.
Comment: Accepted at AAAI 202
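The iterative G-computation step that DeepACE builds on can be sketched with plain regressions standing in for the paper's neural networks. The estimator below follows the standard iterated-regression form of the G-formula under a static treatment regime; the data layout and the use of linear models are hypothetical assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def g_computation_ace(L, A, Y, a_treat=1.0, a_ctrl=0.0):
    """Iterated-regression G-computation for the ACE under static regimes.

    L: time-varying covariates, shape (n, T, d); A: treatments, shape (n, T);
    Y: outcomes, shape (n,). Linear regressions stand in for the neural
    networks that DeepACE fits end to end.
    """
    n, T, _ = L.shape

    def mean_potential_outcome(a):
        Q = Y.copy()  # Q_{T+1} := Y
        for t in range(T - 1, -1, -1):
            # Regress Q_{t+1} on the observed history (L_1..L_t, A_1..A_t) ...
            H = np.concatenate(
                [L[:, : t + 1].reshape(n, -1), A[:, : t + 1]], axis=1)
            model = LinearRegression().fit(H, Q)
            # ... then predict with the treatment at time t set to the regime.
            H_do = H.copy()
            H_do[:, -1] = a  # last column is A_t
            Q = model.predict(H_do)
        return Q.mean()  # E[Y^a] via the iterated G-formula

    return mean_potential_outcome(a_treat) - mean_potential_outcome(a_ctrl)
```

DeepACE replaces each regression with a shared recurrent network and adds a sequential targeting step so the final estimate is doubly robust and asymptotically efficient; the sketch keeps only the recursive structure that adjusts for time-varying confounding.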