Perturbations and Causality in Gaussian Latent Variable Models

Abstract

Causal inference from observational data alone is a challenging problem. The task becomes easier when data from perturbations of the underlying system are available, even if those perturbations are not randomized: this is the setting we consider, and it also encompasses latent confounding variables. To identify causal relations among a collection of covariates and a response variable, existing procedures rely on at least one of the following assumptions: i) the response variable remains unperturbed, ii) the latent variables remain unperturbed, and iii) the latent effects are dense. In this paper, we examine a perturbation model for interventional data, which can be viewed as a mixed-effects linear structural causal model, over a collection of Gaussian variables that does not satisfy any of these conditions. We propose a maximum-likelihood estimator, dubbed DirectLikelihood, that exploits system-wide invariances to uniquely identify the population causal structure from unspecific perturbation data. Our results carry over to linear structural causal models without requiring Gaussianity. We illustrate the utility of our framework on synthetic data as well as real data involving California reservoirs and protein expressions.
