We consider the estimation of average and counterfactual treatment effects
under two settings: back-door adjustment and front-door adjustment. The goal in
both cases is to recover the treatment effect without access to a
hidden confounder. This objective is attained by first estimating the
conditional mean of the desired outcome variable given relevant covariates (the
"first stage" regression), and then taking the (conditional) expectation of
this function as a "second stage" procedure. We propose to compute these
conditional expectations directly using a regression function whose targets are
the learned input features of the first stage, thus avoiding the need for sampling or
density estimation. All functions and features (and in particular, the output
features in the second stage) are neural networks learned adaptively from data,
with the sole requirement that the final layer of the first stage should be
linear. The proposed method is shown to converge to the true causal parameter
and outperforms recent state-of-the-art methods on challenging causal
benchmarks, including settings involving high-dimensional image data.
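
To illustrate why a linear final layer in the first stage removes the need for sampling, the following is a minimal sketch of the back-door case on synthetic data. It is not the proposed implementation: fixed polynomial features (the hypothetical helpers `phi_a` and `phi_x`) stand in for the learned neural network features, the data-generating process and the ridge penalty `lam` are assumptions chosen for illustration, and the second stage here is a plain empirical mean of features, whereas a conditional expectation would require regressing those features on the conditioning variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic back-door data (assumed for illustration):
# X confounds both the treatment A and the outcome Y.
n = 2000
X = rng.normal(size=n)
A = 0.5 * X + rng.normal(size=n)
Y = 2.0 * A + X + 0.1 * rng.normal(size=n)

# Fixed polynomial features stand in for learned networks (assumption).
def phi_a(a):
    a = np.atleast_1d(a)
    return np.stack([np.ones_like(a), a], axis=1)

def phi_x(x):
    x = np.atleast_1d(x)
    return np.stack([np.ones_like(x), x], axis=1)

def joint(a, x):
    # Row-wise tensor product of treatment and covariate features.
    return np.einsum("ni,nj->nij", phi_a(a), phi_x(x)).reshape(len(a), -1)

# First stage: ridge regression of Y on the joint features,
# so the final layer is linear with weights w.
F = joint(A, X)
lam = 1e-3
w = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ Y)

# Second stage: average the covariate features once.
mean_phi_x = phi_x(X).mean(axis=0)

def ate(a):
    # By linearity, the mean of w . (phi_a(a) x phi_x(X)) over X
    # equals w . (phi_a(a) x mean of phi_x(X)).
    return float(np.kron(phi_a(a)[0], mean_phi_x) @ w)

# Same quantity computed by averaging predictions over all samples.
brute_force = float((joint(np.full(n, 1.0), X) @ w).mean())
```

In this sketch `ate(1.0)` agrees with `brute_force` up to floating-point error, which is the point of the linear final layer: the expectation passes through the weights and acts only on the features.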