19 research outputs found
Towards Trustworthy Explanation: On Causal Rationalization
With recent advances in natural language processing, rationalization has become
an essential self-explaining paradigm for opening up the black box by selecting
a subset of input texts to account for the major variation in prediction. Yet
existing association-based approaches to rationalization cannot identify the
true rationales when two or more snippets are highly inter-correlated and
therefore contribute similarly to prediction accuracy, a phenomenon known as
spuriousness. To address this limitation, we incorporate two causal desiderata,
non-spuriousness and efficiency, into rationalization from a causal inference
perspective.
perspective. We formally define a series of probabilities of causation based on
a newly proposed structural causal model of rationalization, with its
theoretical identification established as the main component of learning
necessary and sufficient rationales. The superior performance of the proposed
causal rationalization is demonstrated on real-world review and medical
datasets with extensive experiments compared to state-of-the-art methods.Comment: In Proceedings of the 40th International Conference on Machine
Learning (ICML) GitHub Repository:
https://github.com/onepounchman/Causal-Retionalizatio
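The spuriousness the abstract describes can be made concrete with a toy structural causal model: a snippet that merely co-occurs with the true rationale scores almost as highly under association-based selection, while an interventional contrast cleanly separates the two. The Python sketch below is purely illustrative; the snippets `z0`, `z1` and the toy SCM are invented for the example and are not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000

# Toy SCM: snippet z0 causes the label y; snippet z1 merely agrees with
# z0 90% of the time and has no causal effect on y.
z0 = rng.integers(0, 2, n)
z1 = np.where(rng.random(n) < 0.9, z0, 1 - z0)
y = z0  # the label is determined by z0 alone

def corr(a, b):
    return abs(np.corrcoef(a, b)[0, 1])

# Association-based selection scores both snippets highly ...
print(round(corr(z0, y), 2), round(corr(z1, y), 2))  # 1.0 and roughly 0.8

# ... but an interventional contrast exposes the spurious snippet:
# y is unchanged under do(z1 = v) and flips under do(z0 = v).
def y_under_do(snippet, val):
    z0_do = np.full(n, val) if snippet == "z0" else z0
    return z0_do  # y = z0 in the toy SCM, regardless of z1

for s in ("z0", "z1"):
    effect = y_under_do(s, 1).mean() - y_under_do(s, 0).mean()
    print(s, effect)  # 1.0 for the true rationale, 0.0 for the spurious one
```

A causal rationalizer should prefer `z0` even though both snippets are nearly interchangeable for prediction accuracy.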
On Efficient Inference of Causal Effects with Multiple Mediators
This paper provides robust estimators and efficient inference for causal
effects involving multiple interacting mediators. Most existing works either
impose a linear model assumption among the mediators or are restricted to
handling conditionally independent mediators given the exposure. To overcome
these limitations, we define causal and individual mediation effects in a
general setting, and employ a semiparametric framework to develop quadruply
robust estimators for these causal effects. We further establish the asymptotic
normality of the proposed estimators and prove their local semiparametric
efficiency. The proposed method is empirically validated on simulated and
real datasets concerning psychiatric disorders in trauma survivors.
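Under the linear-model restriction that the abstract says most existing works impose (and which this paper relaxes), path-specific mediation effects with two interacting mediators reduce to products of regression coefficients. The sketch below, with invented coefficients and variable names, recovers them by ordinary least squares; it illustrates the restrictive baseline, not the paper's quadruply robust estimators.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Linear SCM with exposure A, interacting mediators M1 -> M2, outcome Y.
A = rng.integers(0, 2, n).astype(float)
M1 = 0.8 * A + rng.normal(size=n)
M2 = 0.5 * A + 0.6 * M1 + rng.normal(size=n)
Y = 0.3 * A + 0.7 * M1 + 0.4 * M2 + rng.normal(size=n)

def ols(y, *cols):
    """Least-squares slope estimates (intercept dropped)."""
    X = np.column_stack([np.ones(n), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

(a1,) = ols(M1, A)
a2, b12 = ols(M2, A, M1)
c, g1, g2 = ols(Y, A, M1, M2)

# Product-of-coefficients decomposition of the total effect of A on Y.
effects = {
    "direct A->Y": c,
    "A->M1->Y": a1 * g1,
    "A->M1->M2->Y": a1 * b12 * g2,
    "A->M2->Y": a2 * g2,
}
total = sum(effects.values())
print(f"total effect ~ {total:.2f}")  # true value: 1.252
```

This decomposition breaks down once the mediator or outcome models are nonlinear or the mediators interact non-additively, which is the setting the semiparametric estimators target.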
Deep jump learning for off-policy evaluation in continuous treatment settings
We consider off-policy evaluation (OPE) in continuous treatment settings, such as personalized dose-finding. In OPE, one aims to estimate the mean outcome under a new treatment decision rule using historical data generated by a different decision rule. Most existing works on OPE focus on discrete treatment settings. To handle continuous treatments, we develop a novel estimation method for OPE using deep jump learning. The key ingredient of our method is to adaptively discretize the treatment space using deep discretization, leveraging deep learning and multi-scale change point detection. This allows us to apply existing OPE methods for discrete treatments to continuous treatments. Our method is further justified by theoretical results, simulations, and a real application to warfarin dosing.
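To illustrate the core idea, the sketch below assumes a piecewise-constant dose-outcome curve and substitutes a crude bin-merging heuristic for the paper's deep discretization and multi-scale change point detection; the data and thresholds are invented. After discretization, the continuous-dose OPE problem reduces to a direct-method estimate over intervals.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

# Historical data: doses A ~ Uniform(0, 1); the true mean outcome is
# piecewise constant in the dose, so discretization is lossless here.
def true_mean(a):
    return np.where(a < 0.3, 0.2, np.where(a < 0.7, 1.0, 0.5))

A = rng.random(n)
Y = true_mean(A) + 0.1 * rng.normal(size=n)

# Crude stand-in for adaptive discretization: start from a fine grid and
# keep an edge only where adjacent bin means jump.
edges = np.linspace(0, 1, 21)
means = [Y[(A >= lo) & (A < hi)].mean() for lo, hi in zip(edges[:-1], edges[1:])]
merged = [0.0]
for k in range(1, 20):
    if abs(means[k] - means[k - 1]) > 0.2:  # jump detected
        merged.append(edges[k])
merged.append(1.0)

def interval_value(a):
    """Direct-method estimate of the mean outcome at dose a."""
    k = np.searchsorted(merged, a, side="right") - 1
    lo, hi = merged[k], merged[k + 1]
    return Y[(A >= lo) & (A < hi)].mean()

# Off-policy value of a new deterministic dosing rule a* = 0.5.
print(round(interval_value(0.5), 2))  # close to the true level 1.0
```

Once the treatment space is partitioned, any discrete-treatment OPE estimator can be applied interval by interval, which is the reduction the abstract describes.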
CAPITAL: Optimal Subgroup Identification via Constrained Policy Tree Search
Personalized medicine, a paradigm of medicine tailored to a patient's
characteristics, is an increasingly attractive field in health care. An
important goal of personalized medicine is to identify a subgroup of patients,
based on baseline covariates, that benefits more from the targeted treatment
than other comparative treatments. Most of the current subgroup identification
methods only focus on obtaining a subgroup with an enhanced treatment effect
without paying attention to subgroup size. Yet, a clinically meaningful
subgroup learning approach should identify the maximum number of patients who
can benefit from the better treatment. In this paper, we present an optimal
subgroup selection rule (SSR) that maximizes the number of selected patients
while achieving a pre-specified, clinically meaningful mean outcome, such as
the average treatment effect. We derive two equivalent
theoretical forms of the optimal SSR based on the contrast function that
describes the treatment-covariates interaction in the outcome. We further
propose a ConstrAined PolIcy Tree seArch aLgorithm (CAPITAL) to find the
optimal SSR within the interpretable decision tree class. The proposed method
is flexible enough to handle multiple constraints that penalize the inclusion
of patients with negative treatment effects, and to address time-to-event data
using the restricted mean survival time as the clinically meaningful mean
outcome. Extensive simulations, comparison studies, and real data applications
are conducted to demonstrate the validity and utility of our method.
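As a rough illustration of the constrained search, the sketch below scans depth-1 rules of the form X <= t on a simulated randomized trial and keeps the largest subgroup whose estimated average treatment effect still meets a pre-specified threshold. The data-generating process, covariate, and threshold are invented, and the actual CAPITAL algorithm searches a richer interpretable policy tree class.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000

# Simulated randomized trial: the treatment effect tau(X) = 1 - 2X is
# positive for X < 0.5 and declines with the covariate X.
X = rng.random(n)
T = rng.integers(0, 2, n)
tau = 1.0 - 2.0 * X
Y = X + T * tau + 0.2 * rng.normal(size=n)

delta = 0.28  # required mean benefit within the selected subgroup

def subgroup_ate(mask):
    """Difference-in-means ATE estimate within the subgroup."""
    return Y[mask & (T == 1)].mean() - Y[mask & (T == 0)].mean()

# Keep the largest subgroup "X <= t" whose estimated ATE meets delta.
best_t, best_size = None, 0.0
for t in np.linspace(0.05, 0.95, 19):
    mask = X <= t
    if subgroup_ate(mask) >= delta and mask.mean() > best_size:
        best_t, best_size = t, mask.mean()

print(best_t)  # close to 0.7 under this simulation
```

The search maximizes subgroup size subject to the effect constraint, which is the trade-off the abstract contrasts with methods that maximize the effect alone.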
Boundary-aware Contrastive Learning for Semi-supervised Nuclei Instance Segmentation
Semi-supervised segmentation methods have demonstrated promising results in
natural scenarios, providing a solution to reduce dependency on manual
annotation. However, these methods face significant challenges when directly
applied to pathological images due to the subtle color differences between
nuclei and tissues, as well as the significant morphological variations among
nuclei. Consequently, the generated pseudo-labels often contain considerable noise,
especially at the nuclei boundaries. To address the above problem, this paper
proposes a boundary-aware contrastive learning network to denoise the boundary
noise in a semi-supervised nuclei segmentation task. The model has two key
designs: a low-resolution denoising (LRD) module and a cross-RoI contrastive
learning (CRC) module. The LRD improves the smoothness of the nuclei boundary
by pseudo-labels denoising, and the CRC enhances the discrimination between
foreground and background by boundary feature contrastive learning. We conduct
extensive experiments to demonstrate the superiority of our proposed method
over existing semi-supervised instance segmentation methods.

Comment: 12 pages, 3 figures, 6 tables
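A boundary feature contrastive objective of the general kind the CRC module uses can be sketched as a supervised InfoNCE loss over boundary-pixel embeddings: features sharing a foreground/background label are pulled together and pushed away from the rest. The implementation below uses synthetic features and is a generic illustration, not the paper's exact loss.

```python
import numpy as np

def boundary_contrastive_loss(feats, labels, tau=0.1):
    """Supervised InfoNCE over boundary-pixel embeddings: pull together
    features with the same foreground/background label, push apart the rest."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T / tau
    np.fill_diagonal(sim, -np.inf)  # exclude self-pairs
    logits = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    losses = []
    for i in range(len(f)):
        pos = labels == labels[i]
        pos[i] = False  # an anchor is not its own positive
        if pos.any():
            losses.append(-log_prob[i, pos].mean())
    return float(np.mean(losses))

# Well-separated foreground/background embeddings yield a lower loss
# than randomly mixed ones.
rng = np.random.default_rng(4)
labels = np.array([0] * 16 + [1] * 16)
sep = np.where(labels[:, None] == 0, -1.0, 1.0) + 0.1 * rng.normal(size=(32, 8))
mixed = rng.normal(size=(32, 8))
assert boundary_contrastive_loss(sep, labels) < boundary_contrastive_loss(mixed, labels)
```

Minimizing such a loss sharpens the feature separation at nuclei boundaries, which is where the abstract says the pseudo-label noise concentrates.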