Rational Shapley Values
Explaining the predictions of opaque machine learning algorithms is an
important and challenging task, especially as complex models are increasingly
used to assist in high-stakes decisions such as those arising in healthcare and
finance. Most popular tools for post-hoc explainable artificial intelligence
(XAI) are either insensitive to context (e.g., feature attributions) or
difficult to summarize (e.g., counterfactuals). In this paper, I introduce
\emph{rational Shapley values}, a novel XAI method that synthesizes and extends
these seemingly incompatible approaches in a rigorous, flexible manner. I
leverage tools from decision theory and causal modeling to formalize and
implement a pragmatic approach that resolves a number of known challenges in
XAI. By pairing the distribution of random variables with the appropriate
reference class for a given explanation task, I illustrate through theory and
experiments how user goals and knowledge can inform and constrain the solution
set in an iterative fashion. The method compares favorably to
state-of-the-art XAI tools in a range of quantitative and qualitative
comparisons.
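
The core computational move here, pairing feature attributions with a
user-chosen reference class, can be sketched compactly. Below is a minimal
Monte Carlo permutation estimator of Shapley values in which off-coalition
features are imputed from a caller-supplied reference sample rather than the
full data distribution; the name shapley_reference and its signature are
illustrative assumptions, not the paper's implementation.

import numpy as np

def shapley_reference(f, x, reference, n_samples=200, seed=0):
    # Monte Carlo permutation estimate of Shapley values for one
    # instance x. Off-coalition features are imputed from `reference`,
    # the user-chosen reference class, rather than the full data
    # distribution -- the contextual choice the paper emphasizes.
    rng = np.random.default_rng(seed)
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)                  # random feature ordering
        z = reference[rng.integers(len(reference))].copy()
        prev = f(z[None, :])[0]
        for j in perm:                             # add features one by one
            z[j] = x[j]
            cur = f(z[None, :])[0]
            phi[j] += cur - prev                   # marginal contribution
            prev = cur
    return phi / n_samples

# usage (hypothetical): phi = shapley_reference(model.predict, x, X_similar)
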
Intrinsically Motivated Reinforcement Learning based Recommendation with Counterfactual Data Augmentation
Deep reinforcement learning (DRL) has proven effective at capturing users'
dynamic interests in recent literature. However, training a DRL agent is
challenging: because the environment in recommender systems (RS) is sparse,
agents must divide their effort between exploring informative user-item
interaction trajectories and exploiting existing trajectories for policy
learning. This exploration-exploitation trade-off significantly affects
recommendation performance when the environment is sparse, and it is
especially hard to balance in DRL-based RS, where the agent must explore
informative trajectories deeply and exploit them efficiently. As a step
toward addressing this issue, we design a novel intrinsically motivated
reinforcement learning method that increases the agent's capability to
explore informative interaction trajectories in the sparse environment;
these trajectories are further enriched via a counterfactual augmentation
strategy for more efficient exploitation. Extensive experiments on six
offline datasets and three online simulation platforms demonstrate the
superiority of our model over a set of existing state-of-the-art methods.
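
As a rough illustration of the intrinsic-motivation component, the sketch
below grants the agent a curiosity bonus equal to the prediction error of a
learned forward dynamics model, so that poorly predicted (informative)
user-item transitions earn extra reward. The class and function names are
hypothetical, and the counterfactual augmentation step is omitted; this is a
sketch of the general technique, not the authors' code.

import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    # Predicts the next user state from the current state and the
    # embedding of the recommended item (the action).
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def intrinsic_reward(fm, state, action, next_state, scale=0.1):
    # Curiosity bonus: transitions the forward model predicts poorly
    # are treated as informative, steering exploration toward them.
    with torch.no_grad():
        err = (fm(state, action) - next_state).pow(2).mean(dim=-1)
    return scale * err  # added to the extrinsic (click/rating) reward
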
Does Misclassifying Non-confounding Covariates as Confounders Affect the Causal Inference within the Potential Outcomes Framework?
The Potential Outcome Framework (POF) plays a prominent role in the field of
causal inference. Most causal inference models based on the POF (CIMs-POF) are
designed for eliminating confounding bias and default to an underlying
assumption of Confounding Covariates. This assumption posits that the
covariates consist solely of confounders. However, the assumption of
Confounding Covariates is challenging to maintain in practice, particularly
when dealing with high-dimensional covariates. While certain methods have been
proposed to differentiate the distinct components of covariates prior to
conducting causal inference, the consequences of treating non-confounding
covariates as confounders remain unclear. This ambiguity poses a potential risk
when conducting causal inference in practical scenarios. In this paper, we
present a unified graphical framework for the CIMs-POF, which greatly enhances
the comprehension of these models' underlying principles. Using this graphical
framework, we quantitatively analyze the extent to which the inference
performance of CIMs-POF is influenced when incorporating various types of
non-confounding covariates, such as instrumental variables, mediators,
colliders, and adjustment variables. The key findings are: in the task of
eliminating confounding bias, the optimal scenario is for the covariates to
exclusively encompass confounders; in the subsequent task of inferring
counterfactual outcomes, the adjustment variables contribute to more accurate
inferences. Furthermore, extensive experiments conducted on synthetic datasets
consistently validate these theoretical conclusions.
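
These findings are easy to probe on a toy structural model. The sketch below
(coefficients and variable names are invented for illustration) estimates
the effect of T on Y while adjusting for a confounder alone, a confounder
plus an instrument, and a confounder plus a collider, recovering the
familiar pattern consistent with the paper's analysis: the instrument only
adds variance, while the collider biases the estimate.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200_000
c = rng.normal(size=n)                       # confounder: causes T and Y
z = rng.normal(size=n)                       # instrument: causes T only
t = 0.8 * c + 0.5 * z + rng.normal(size=n)
y = 2.0 * t + 1.5 * c + rng.normal(size=n)   # true effect of T on Y is 2.0
m = t + y + rng.normal(size=n)               # collider: caused by T and Y

def effect(adjust):
    # OLS coefficient of T on Y after adjusting for the given covariates
    X = np.column_stack([t] + adjust)
    return LinearRegression().fit(X, y).coef_[0]

print(effect([c]))      # confounder only: ~2.0, the clean baseline
print(effect([c, z]))   # + instrument: still ~2.0 but higher variance
print(effect([c, m]))   # + collider: substantially biased away from 2.0
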
Less is Better: Recovering Intended-Feature Subspace to Robustify NLU Models
Datasets with significant proportions of bias pose a threat to training
trustworthy models on NLU tasks. Despite yielding great progress, current
debiasing methods rely heavily on prior knowledge of bias attributes, yet
defining those attributes is elusive and varies across datasets.
Furthermore, leveraging these attributes at the input level for bias
mitigation may leave a gap between intrinsic properties and the underlying
decision rule. To narrow this gap and free debiasing from bias supervision,
we suggest extending bias mitigation into the feature space and develop a
novel model, Recovering Intended-Feature Subspace with Knowledge-Free
(RISK). Assuming that shortcut features caused by various biases are
unintended for prediction, RISK treats them as redundant. By delving into a
lower-dimensional manifold to remove redundancies, RISK reveals that an
extremely low-dimensional subspace of intended features can robustly
represent the highly biased dataset. Empirical results demonstrate that our
model consistently improves generalization to out-of-distribution sets and
achieves new state-of-the-art performance.
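
One simple way to realize the "extremely low-dimensional subspace" idea is a
truncated SVD of the encoder's features: keep the top-k principal directions
as the intended subspace and discard the rest as redundant. The sketch below
assumes this PCA-style reading; intended_subspace and the choice k=16 are
illustrative, not RISK's actual algorithm.

import torch

def intended_subspace(features, k=16):
    # Center the encoder features and keep only their top-k principal
    # directions; the discarded directions are treated as redundant
    # (potential shortcut/bias features).
    mu = features.mean(dim=0, keepdim=True)
    centered = features - mu
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    basis = vh[:k].T                    # (feature_dim, k) orthonormal basis
    return centered @ basis, basis      # projected features, subspace basis

# usage (hypothetical): z, basis = intended_subspace(cls_embeddings, k=16)
# A classifier head trained on z sees only the low-dimensional subspace.
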
De-confounding Representation Learning for Counterfactual Inference on Continuous Treatment via Generative Adversarial Network
Counterfactual inference for continuous rather than binary treatment
variables is more common in real-world causal inference tasks. While there are
already some sample reweighting methods based on the Marginal Structural
Model for eliminating confounding bias, they generally focus on removing the
treatment's linear dependence on confounders and rely on the accuracy of the
assumed parametric models, which are usually unverifiable. In this paper, we
propose a de-confounding representation learning (DRL) framework for
counterfactual outcome estimation of continuous treatment by generating the
representations of covariates disentangled with the treatment variables. The
DRL is a non-parametric model that eliminates both linear and nonlinear
dependence between treatment and covariates. Specifically, we adversarially
train the correlation between the de-confounded representations and the
treatment variables against the correlation between the raw covariate
representations and the treatment variables, so as to eliminate confounding
bias. Further, a
counterfactual inference network is embedded into the framework to make the
learned representations serve both de-confounding and trusted inference.
Extensive experiments on synthetic datasets show that the DRL model excels
at learning de-confounded representations and outperforms state-of-the-art
counterfactual inference models for continuous treatment variables. In
addition, we apply the DRL model to the real-world medical dataset MIMIC and
demonstrate a detailed causal relationship between red cell distribution
width and mortality.
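
The adversarial training the abstract alludes to can be sketched as a
min-max game: a critic learns to recover the continuous treatment from the
learned representation, and the encoder is updated to defeat it, leaving the
representation de-confounded. Dimensions, architectures, and the name step
below are placeholders, and the counterfactual inference head is omitted.

import torch
import torch.nn as nn

x_dim, z_dim = 25, 16                       # placeholder dimensions
enc = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
critic = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, 1))
opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-3)
opt_cri = torch.optim.Adam(critic.parameters(), lr=1e-3)

def step(x, t):
    # Critic: learn to recover the continuous treatment t from z.
    z = enc(x).detach()
    loss_cri = (critic(z).squeeze(-1) - t).pow(2).mean()
    opt_cri.zero_grad(); loss_cri.backward(); opt_cri.step()
    # Encoder: update z so the critic fails, i.e. z carries no
    # (linear or nonlinear) information about the treatment.
    loss_enc = -(critic(enc(x)).squeeze(-1) - t).pow(2).mean()
    opt_enc.zero_grad(); loss_enc.backward(); opt_enc.step()
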
Advancing Counterfactual Inference through Quantile Regression
The capacity to address counterfactual "what if" inquiries is crucial for
understanding and making use of causal influences. Traditional counterfactual
inference usually assumes a structural causal model is available. However, in
practice, such a causal model is often unknown and may not be identifiable.
This paper aims to perform reliable counterfactual inference based on the
(learned) qualitative causal structure and observational data, without a given
causal model or even directly estimating conditional distributions. We re-cast
counterfactual reasoning as an extended quantile regression problem using
neural networks. The approach is statistically more efficient than existing
ones, and it further makes it possible to generalize the estimated
counterfactual outcomes to unseen data and to provide an upper bound on the
generalization error. Experimental results on multiple datasets strongly
support our theoretical claims.
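
A minimal sketch of the quantile-based recipe: fit a network Q(tau | x, t)
with the pinball loss, then, for each unit, find the quantile level its
factual outcome occupies under the observed treatment and read off the same
quantile under the counterfactual treatment. Names such as QuantileNet and
counterfactual are illustrative assumptions, not the authors' code.

import torch
import torch.nn as nn

class QuantileNet(nn.Module):
    # Q(tau | x, t): conditional tau-quantile of the outcome given
    # covariates x and a scalar treatment t.
    def __init__(self, x_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + 2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x, t, tau):
        # x: (n, d); t, tau: (n, 1) -> returns (n,)
        return self.net(torch.cat([x, t, tau], dim=-1)).squeeze(-1)

def pinball_loss(pred, y, tau):
    # Standard quantile (pinball) loss; train with tau ~ Uniform(0, 1).
    e = y - pred
    return torch.maximum(tau * e, (tau - 1) * e).mean()

def counterfactual(qnet, x, t_obs, y_obs, t_cf):
    grid = torch.linspace(0.01, 0.99, 99)
    with torch.no_grad():
        # 1) which quantile level does the factual outcome occupy?
        preds = torch.stack([qnet(x, t_obs, g.expand(len(x), 1))
                             for g in grid])          # shape (99, n)
        tau_star = grid[(preds - y_obs).abs().argmin(dim=0)]
        # 2) read off the same quantile under the new treatment
        return qnet(x, t_cf, tau_star.unsqueeze(-1))
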