We study the class of location-scale or heteroscedastic noise models (LSNMs),
in which the effect Y can be written as a function of the cause X and a
noise source N independent of X, which may be scaled by a positive function
g over the cause, i.e., Y = f(X) + g(X)N. Despite the generality of the
model class, we show the causal direction is identifiable up to some
pathological cases. To empirically validate these theoretical findings, we
propose two estimators for LSNMs: an estimator based on (non-linear) feature
maps, and one based on neural networks. Both model the conditional distribution
of Y given X as a Gaussian parameterized by its natural parameters. When
the feature maps are correctly specified, we prove that our estimator's
objective is jointly concave and that the estimator is consistent for the
cause-effect identification
task. Although the neural network does not inherit those guarantees, it can
fit functions of arbitrary complexity, and reaches state-of-the-art performance
across benchmarks.
Comment: ICML 202
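
To make the model and the natural-parameter view concrete, the following is a minimal sketch, not the estimators proposed in the work. It assumes standard Gaussian noise N, so that Y given X = x is Gaussian with mean f(x) and standard deviation g(x), and it uses hypothetical choices f = sin and g(x) = 0.5 + |x|. The function names (`to_natural`, `log_density_natural`, `sample_lsnm`) are illustrative only. It samples from Y = f(X) + g(X)N and evaluates the conditional log-density through the natural parameters eta1 = mu / sigma^2 and eta2 = -1 / (2 sigma^2).

```python
import math
import random


def to_natural(mu, sigma):
    """Map mean and standard deviation to Gaussian natural parameters.

    Returns (eta1, eta2) with eta1 = mu / sigma^2 and eta2 = -1 / (2 sigma^2),
    so eta2 is always negative.
    """
    var = sigma * sigma
    return mu / var, -0.5 / var


def log_density_natural(y, eta1, eta2):
    """Gaussian log-density of y written in natural parameters."""
    # Log-partition function A(eta) = -eta1^2 / (4 eta2) - 0.5 * log(-2 eta2).
    log_partition = -eta1 ** 2 / (4.0 * eta2) - 0.5 * math.log(-2.0 * eta2)
    return eta1 * y + eta2 * y * y - log_partition - 0.5 * math.log(2.0 * math.pi)


def sample_lsnm(n, f, g, seed=0):
    """Draw n pairs (x, y) with y = f(x) + g(x) * noise, noise independent of x."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # cause (assumed standard normal here)
        noise = rng.gauss(0.0, 1.0)  # noise source, independent of x
        pairs.append((x, f(x) + g(x) * noise))
    return pairs


# Hypothetical location and scale functions; g must be strictly positive.
f = math.sin
g = lambda x: 0.5 + abs(x)

data = sample_lsnm(200, f, g)
avg_log_lik = sum(
    log_density_natural(y, *to_natural(f(x), g(x))) for x, y in data
) / len(data)
print(avg_log_lik)
```

In this sketch the true f and g are plugged in directly. An estimator in the spirit of the abstract would instead model eta1 and eta2 as functions of x (for example through feature maps or a neural network) and fit them by maximizing this log-likelihood.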