Domain Generalization -- A Causal Perspective
Machine learning models rely on various assumptions to attain high accuracy.
One of the fundamental assumptions of these models is that data are independent and identically distributed (i.i.d.), i.e., that the train and test data are sampled from the same distribution. However, this assumption seldom holds in the real world due to distribution shifts. As a result, models that rely on this assumption exhibit poor generalization capabilities. In recent years, dedicated efforts have been made to improve the generalization capabilities of these models, collectively known as domain generalization methods.
The primary idea behind these methods is to identify stable features or
mechanisms that remain invariant across different distributions. Many generalization approaches employ causal theories to describe invariance, since causality and invariance are inextricably intertwined. However, existing surveys treat causality-aware domain generalization methods at a very high level. Furthermore, we argue that these methods can be categorized based on how causality is leveraged in each method and in which part of the model pipeline it is used. To this end, we categorize the causal domain
generalization methods into three categories: (i) invariance via causal data augmentation, applied during the data pre-processing stage; (ii) invariance via causal representation learning, applied during the representation learning stage; and (iii) invariance via transferring causal mechanisms, applied during the classification stage of the pipeline. In addition, this survey includes
in-depth insights into benchmark datasets and code repositories for domain
generalization methods. We conclude the survey with a discussion of future directions.
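To make the first category concrete, the following is a minimal, self-contained sketch of invariance via causal data augmentation on toy data. The setup (a causal "shape" feature, a spurious "color" feature, and all function names) is illustrative and not taken from the survey; the augmentation simulates an intervention on the spurious factor by resampling it independently of the label.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(n, color_bias):
    """Toy domain: the label drives a causal 'shape' feature; a spurious
    'color' feature agrees with the label with probability color_bias."""
    y = rng.integers(0, 2, size=n)
    shape = y + 0.1 * rng.standard_normal(n)                          # stable, causal
    agree = rng.random(n) < color_bias
    color = np.where(agree, y, 1 - y) + 0.1 * rng.standard_normal(n)  # spurious
    return np.stack([shape, color], axis=1), y

def causal_augment(X, y):
    """Simulate do(color): resample the spurious column independently of y,
    so the augmented copies break the color-label correlation."""
    X_aug = X.copy()
    X_aug[:, 1] = rng.permutation(X[:, 1])
    return np.vstack([X, X_aug]), np.concatenate([y, y])

def ls_weights(X, y):
    # Plain least squares, just to inspect how much weight each feature gets.
    return np.linalg.lstsq(X, y.astype(float), rcond=None)[0]

X, y = make_domain(1000, color_bias=0.9)
X_aug, y_aug = causal_augment(X, y)
print("before augmentation:", ls_weights(X, y))
print("after augmentation: ", ls_weights(X_aug, y_aug))
```

After augmentation, a plain least-squares fit places far less weight on the spurious feature, which is precisely the kind of invariance the first category of methods targets.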
Regularizing towards Causal Invariance: Linear Models with Proxies
We propose a method for learning linear models whose predictive performance
is robust to causal interventions on unobserved variables, when noisy proxies
of those variables are available. Our approach takes the form of a
regularization term that trades off between in-distribution performance and
robustness to interventions. Under the assumption of a linear structural causal
model, we show that a single proxy can be used to create estimators that are
prediction optimal under interventions of bounded strength. This strength
depends on the magnitude of the measurement noise in the proxy, which is, in
general, not identifiable. In the case of two proxy variables, we propose a
modified estimator that is prediction optimal under interventions up to a known
strength. We further show how to extend these estimators to scenarios where
additional information about the "test time" intervention is available during
training. We evaluate our theoretical findings in synthetic experiments and
using real data on hourly pollution levels across several cities in China.

Comment: ICML 2021 (to appear).
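As an illustration of the regularization idea, below is a minimal sketch of an anchor-regression-style estimator that trades off in-distribution fit against robustness to interventions via a projection onto the proxy. The abstract does not spell out the exact penalty, so the objective, the variable names, and the toy data here are assumptions standing in for the paper's estimator, not a reproduction of it.

```python
import numpy as np

def proxy_regularized_fit(X, y, W, gamma):
    """Anchor-regression-style fit (an assumed stand-in for the paper's penalty):
        min_b ||(I - P)(y - Xb)||^2 + gamma * ||P(y - Xb)||^2,
    where P projects onto the column space of the proxy matrix W.
    gamma = 1 recovers OLS; larger gamma trades in-distribution fit for
    robustness to stronger interventions."""
    P = W @ np.linalg.pinv(W)                          # projection onto span(W)
    T = np.eye(len(y)) + (np.sqrt(gamma) - 1.0) * P    # T = (I - P) + sqrt(gamma) * P
    return np.linalg.lstsq(T @ X, T @ y, rcond=None)[0]

# Toy check: an unobserved variable U confounds X and y; W is a noisy proxy of U.
rng = np.random.default_rng(0)
n = 500
U = rng.standard_normal(n)                                         # unobserved
W = (U + 0.5 * rng.standard_normal(n)).reshape(-1, 1)              # noisy proxy
X = np.column_stack([U + rng.standard_normal(n), rng.standard_normal(n)])
y = X[:, 0] + 2.0 * U + 0.1 * rng.standard_normal(n)

for gamma in (1.0, 10.0, 100.0):
    print(gamma, proxy_regularized_fit(X, y, W, gamma))
```

Because (I - P) and P are orthogonal projections, the transformed least-squares problem is exactly the penalized objective above; increasing gamma shrinks reliance on directions of variation explained by the proxy, mirroring the trade-off described in the abstract.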