Domain Generalization -- A Causal Perspective
Machine learning models rely on various assumptions to attain high accuracy.
One of the basic assumptions of these models is that the data are independent
and identically distributed (i.i.d.), which implies that the train and test data
are sampled from the same distribution. However, this assumption seldom holds in
the real world due to distribution shifts. As a result, models that rely on this
assumption exhibit poor generalization capabilities. In recent years,
dedicated efforts have been made to improve the generalization capabilities of
these models, collectively known as domain generalization methods.
The primary idea behind these methods is to identify stable features or
mechanisms that remain invariant across the different distributions. Many
generalization approaches employ causal theories to describe invariance since
causality and invariance are inextricably intertwined. However, current surveys
deal with causality-aware domain generalization methods only at a very
high level. Furthermore, we argue that it is possible to categorize the methods
based on how causality is leveraged in each method and in which part of the
model pipeline it is used. To this end, we categorize the causal domain
generalization methods into three categories, namely, (i) Invariance via Causal
Data Augmentation methods, which are applied during the data pre-processing
stage, (ii) Invariance via Causal Representation Learning methods, which are
utilized during the representation learning stage, and (iii) Invariance via
Transferring Causal Mechanisms methods, which are applied during the
classification stage of the pipeline. Furthermore, this survey includes
in-depth insights into benchmark datasets and code repositories for domain
generalization methods. We conclude the survey with insights and discussions on
future directions.
NOTES2: Networks-of-Traces for Epidemic Spread Simulations
Decision making and intervention against infectious diseases require analysis of large volumes of data, including demographic data, contact networks, age-specific contact rates, mobility networks, and healthcare and control intervention data and models. In this paper, we present our Networks-Of-Traces for Epidemic Spread Simulations (NOTES2) model and system, which aim to assist experts in exploring existing simulation trace data sets. NOTES2 supports analysis and indexing of simulation data sets as well as parameter and feature analysis, including identification of unknown dependencies across the input parameters and output variables spanning the different layers of the observation and simulation data
- …