Causal Confusion in Imitation Learning
Behavioral cloning reduces policy learning to supervised learning by training
a discriminative model to predict expert actions given observations. Such
discriminative models are non-causal: the training procedure is unaware of the
causal structure of the interaction between the expert and the environment. We
point out that ignoring causality is particularly damaging because of the
distributional shift in imitation learning. In particular, it leads to a
counter-intuitive "causal misidentification" phenomenon: access to more
information can yield worse performance. We investigate how this problem
arises, and propose a solution to combat it through targeted
interventions---either environment interaction or expert queries---to determine
the correct causal model. We show that causal misidentification occurs in
several benchmark control domains as well as realistic driving settings, and
validate our solution against DAgger and other baselines and ablations.
Comment: Published at NeurIPS 2019; 9 pages, plus references and appendices.
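For intuition, a minimal sketch of the failure mode, not the paper's implementation: the AR(1) signal, noise level, and feature names below are invented for illustration. When the expert's previous action is appended to the observation, a behavioral-cloning learner can shortcut through it, then degrade once execution-time shift breaks the correlation.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Sticky latent signal that the expert reacts to (hypothetical AR(1) process).
phi, s = 0.95, np.empty(n)
eps = rng.normal(size=n)
s[0] = eps[0]
for t in range(1, n):
    s[t] = phi * s[t - 1] + eps[t]

action = (s > 0).astype(int)           # expert's policy: threshold the signal
obs = s + 2.0 * rng.normal(size=n)     # learner sees a noisy view of the true cause
prev_action = np.roll(action, 1)       # nuisance feature: correlates with action

# "More information": observation augmented with the previous expert action.
clf_aug = LogisticRegression().fit(np.column_stack([obs, prev_action]), action)
clf_obs = LogisticRegression().fit(obs[:, None], action)

# Distributional shift at execution time: the previous action now comes from
# the learner itself rather than the expert; here we simply randomize it.
X_shift = np.column_stack([obs, rng.integers(0, 2, size=n)])
print("augmented policy under shift:", (clf_aug.predict(X_shift) == action).mean())
print("cause-only policy           :", (clf_obs.predict(obs[:, None]) == action).mean())

The printed gap is causal misidentification in miniature: the extra feature helps on the training distribution and hurts once the expert no longer supplies it.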
Observational-Interventional Priors for Dose-Response Learning
Controlled interventions provide the most direct source of information for
learning causal effects. In particular, a dose-response curve can be learned by
varying the treatment level and observing the corresponding outcomes. However,
interventions can be expensive and time-consuming. Observational data, where
the treatment is not controlled by a known mechanism, is sometimes available.
Under some strong assumptions, observational data allows for the estimation of
dose-response curves. Estimating such curves nonparametrically is hard: sample
sizes for controlled interventions may be small, while in the observational
case a large number of measured confounders may need to be marginalized. In
this paper, we introduce a hierarchical Gaussian process prior that constructs
a distribution over the dose-response curve by learning from observational
data, and reshapes the distribution with a nonparametric affine transform
learned from controlled interventions. This function composition from different
sources is shown to speed up learning, which we demonstrate with a thorough
sensitivity analysis and an application to modeling the effect of therapy on
the cognitive skills of premature infants.
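A rough sketch of the compositional idea, with a single GP plus a global affine correction standing in for the paper's hierarchical prior and nonparametric transform; all data and parameter values below are synthetic assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
true_curve = lambda d: np.sin(d) + 0.5 * d       # hypothetical ground truth

# Plentiful observational data, biased by unobserved confounding.
d_obs = rng.uniform(0, 5, size=200)
y_obs = 0.7 * true_curve(d_obs) + 1.0 + 0.1 * rng.normal(size=200)

# Scarce but unbiased controlled interventions.
d_int = np.linspace(0, 5, 8)
y_int = true_curve(d_int) + 0.1 * rng.normal(size=8)

# Step 1: GP fit to the (biased) observational dose-response curve.
gp = GaussianProcessRegressor(RBF(1.0) + WhiteKernel(0.01)).fit(d_obs[:, None], y_obs)
m_obs = lambda d: gp.predict(np.asarray(d, dtype=float)[:, None])

# Step 2: reshape with an affine map a * m_obs(d) + b fitted on the
# interventions (the paper learns a nonparametric transform instead).
A = np.column_stack([m_obs(d_int), np.ones_like(d_int)])
(a, b), *_ = np.linalg.lstsq(A, y_int, rcond=None)

dose_response = lambda d: a * m_obs(d) + b
print(dose_response(np.array([0.0, 2.5, 5.0])))

The design point is that the eight interventional observations only have to pin down two affine parameters, not the whole curve, because the observational GP already carries its shape.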
Causally Disentangled Generative Variational AutoEncoder
We present a new supervised learning technique for the Variational
AutoEncoder (VAE) that allows it to learn a causally disentangled
representation and generate causally disentangled outcomes simultaneously. We
call this approach Causally Disentangled Generation (CDG). CDG is a generative
model that accurately decodes an output based on a causally disentangled
representation. Our research demonstrates that adding supervised regularization
to the encoder alone is insufficient for achieving a generative model with CDG,
even for a simple task. Therefore, we explore the necessary and sufficient
conditions for achieving CDG within a specific model. Additionally, we
introduce a universal metric for evaluating the causal disentanglement of a
generative model. Empirical results from both image and tabular datasets
support our findings.
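For concreteness, a minimal sketch of the baseline the paper shows to be insufficient on its own, supervised regularization on the encoder alone; it assumes PyTorch, and the toy dimensions, loss weights, and labeled factors are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SupervisedVAE(nn.Module):
    def __init__(self, x_dim=64, z_dim=4):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs mean and log-variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def loss(model, x, factors, beta=1.0, gamma=1.0):
    x_hat, mu, logvar = model(x)
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Supervised regularization: align each latent mean with a labeled factor.
    sup = F.mse_loss(mu, factors, reduction="sum")
    return recon + beta * kl + gamma * sup

model = SupervisedVAE()
x = torch.randn(32, 64)
factors = torch.randn(32, 4)   # hypothetical labeled causal factors
print(loss(model, x, factors).item())

Per the abstract, constraining the encoder this way does not by itself yield causally disentangled generation; the decoder must also satisfy the conditions the paper derives.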