A step towards the applicability of algorithms based on invariant causal learning on observational data
Machine learning can benefit from causal discovery for interpretation and
from causal inference for generalization. In this line of research, a few
invariant learning algorithms for out-of-distribution (OOD) generalization have
been proposed by using multiple training environments to find invariant
relationships. Some of them, such as Invariant Causal Prediction (ICP), focus on
causal discovery by finding the causal parents of a variable of interest; others,
such as Invariant Risk Minimization (IRM), directly provide a causally optimal
predictor that generalizes well in OOD environments. This group of algorithms
works under the assumption of multiple environments that represent different
interventions in the causal inference context. Those environments are not
normally available when working with observational data and real-world
applications. Here we propose a method to generate such environments efficiently.
We assess the performance of this unsupervised learning problem by applying ICP
to simulated data. We also show how to integrate our method efficiently with ICP
for causal discovery. Finally, we propose an improved version of our method, in
combination with ICP, for datasets with many covariates, where ICP and other
causal discovery methods typically degrade in performance.
The Hierarchy of Stable Distributions and Operators to Trade Off Stability and Performance
Recent work addressing model reliability and generalization has resulted in a
variety of methods that seek to proactively address differences between the
training and unknown target environments. While most methods achieve this by
finding distributions that will be invariant across environments, we show that
they do not necessarily find the same distributions, which has implications for
performance. In this paper we unify existing work on prediction using stable
distributions by relating environmental shifts to edges in the graph underlying
a prediction problem, and characterize stable distributions as those which
effectively remove these edges. We then quantify the effect of edge deletion on
performance in the linear case, and corroborate the findings in simulated and
real-data experiments.
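The stability/performance trade-off described above can be sketched in a toy linear case. This is not the paper's hierarchy or operators: it assumes a hypothetical linear SCM x_c → y → x_e in which the environment shifts the edge into x_e, and compares a predictor that uses the unstable feature against one that effectively deletes that edge by dropping x_e.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

def simulate(shift):
    # Linear SCM: x_c -> y -> x_e; the mechanism of x_e depends on the
    # environment through `shift`, so the edge y -> x_e is unstable.
    x_c = rng.normal(size=n)
    y = x_c + rng.normal(size=n)
    x_e = y + shift + rng.normal(size=n)
    return np.column_stack([np.ones(n), x_c, x_e]), y

def fit(A, y):
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def mse(A, y, beta):
    return np.mean((y - A @ beta) ** 2)

A_tr, y_tr = simulate(0.0)  # training environment
A_te, y_te = simulate(3.0)  # shifted target environment

full = fit(A_tr, y_tr)            # uses the unstable feature x_e
stable = fit(A_tr[:, :2], y_tr)   # drops x_e: "deletes" the unstable edge

full_tr = mse(A_tr, y_tr, full)
stable_tr = mse(A_tr[:, :2], y_tr, stable)
full_te = mse(A_te, y_te, full)
stable_te = mse(A_te[:, :2], y_te, stable)
print(full_tr, stable_tr)  # full model wins in the training environment
print(full_te, stable_te)  # stable model wins under the shift
```

With these parameters the full model attains roughly half the training error of the stable one, but its error roughly triples under the shift, while the stable predictor's error is unchanged: removing the unstable edge costs in-distribution performance and buys stability, which is exactly the trade-off being quantified.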