DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning
Causal mediation analysis can unpack the black box of causality and is
therefore a powerful tool for disentangling causal pathways in biomedical and
social sciences, and also for evaluating machine learning fairness. To reduce
bias for estimating Natural Direct and Indirect Effects in mediation analysis,
we propose a new method called DeepMed that uses deep neural networks (DNNs) to
cross-fit the infinite-dimensional nuisance functions in the efficient
influence functions. We obtain novel theoretical results that our DeepMed
method (1) can achieve the semiparametric efficiency bound without imposing
sparsity constraints on the DNN architecture and (2) can adapt to certain low
dimensional structures of the nuisance functions, significantly advancing the
existing literature on DNN-based semiparametric causal inference. Extensive
synthetic experiments are conducted to support our findings and also expose the
gap between theory and practice. As a proof of concept, we apply DeepMed to
analyze two real datasets on machine learning fairness and reach conclusions
consistent with previous findings.
Comment: Accepted by NeurIPS 202
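For reference, the natural direct and indirect effects named in this abstract are usually defined through nested potential outcomes. The notation below is an assumption, since the listing does not fix one: a binary treatment d, a potential mediator M(d), and a potential outcome Y(d, m), following the common convention in the mediation literature.

```latex
\begin{align*}
\tau            &= \mathbb{E}\bigl[Y(1, M(1)) - Y(0, M(0))\bigr]
                && \text{(total effect)} \\
\mathrm{NDE}(d) &= \mathbb{E}\bigl[Y(1, M(d)) - Y(0, M(d))\bigr]
                && \text{(natural direct effect)} \\
\mathrm{NIE}(d) &= \mathbb{E}\bigl[Y(d, M(1)) - Y(d, M(0))\bigr]
                && \text{(natural indirect effect)} \\
\tau            &= \mathrm{NDE}(0) + \mathrm{NIE}(1)
                 = \mathrm{NDE}(1) + \mathrm{NIE}(0)
\end{align*}
```

The last line follows by adding and subtracting the cross-world term, for example E[Y(1, M(0))] in the first decomposition.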
DESCN: Deep Entire Space Cross Networks for Individual Treatment Effect Estimation
Causal Inference has wide applications in various areas such as E-commerce
and precision medicine, and its performance heavily relies on the accurate
estimation of the Individual Treatment Effect (ITE). Conventionally, ITE is
predicted by modeling the treated and control response functions separately in
their individual sample spaces. However, such an approach usually encounters
two issues in practice: divergent distributions between the treated and control
groups due to treatment bias, and a significant imbalance in their
sample sizes. This paper proposes Deep Entire Space Cross Networks (DESCN)
to model treatment effects from an end-to-end perspective. DESCN captures the
integrated information of the treatment propensity, the response, and the
hidden treatment effect through a cross network in a multi-task learning
manner. Our method jointly learns the treatment and response functions in the
entire sample space to avoid treatment bias and employs an intermediate pseudo
treatment effect prediction network to relieve sample imbalance. Extensive
experiments are conducted on a synthetic dataset and a large-scale production
dataset from the E-commerce voucher distribution business. The results indicate
that DESCN can successfully enhance the accuracy of ITE estimation and improve
the uplift ranking performance. A sample of the production dataset and the
source code are released to facilitate future research in the community, which
is, to the best of our knowledge, the first large-scale public biased treatment
dataset for causal inference.
Comment: Accepted by SIGKDD 2022 Applied Data Science Trac
Bayesian inference from photometric redshift surveys
We show how to enhance the redshift accuracy of surveys consisting of tracers
with highly uncertain positions along the line of sight. Photometric surveys
with redshift uncertainty delta_z ~ 0.03 can yield final redshift uncertainties
of delta_z_f ~ 0.003 in high density regions. This increased redshift precision
is achieved by imposing an isotropy and 2-point correlation prior in a Bayesian
analysis and is completely independent of the process that estimates the
photometric redshift. As a byproduct, the method also infers the three
dimensional density field, essentially super-resolving high density regions in
redshift space. Our method fully takes into account the survey mask and
selection function. It uses a simplified Poissonian picture of galaxy
formation, relating preferred locations of galaxies to regions of higher
density in the matter field. The method quantifies the remaining uncertainties
in the three dimensional density field and the true radial locations of
galaxies by generating samples that are constrained by the survey data. The
exploration of this high dimensional, non-Gaussian joint posterior is made
feasible using multiple-block Metropolis-Hastings sampling. We demonstrate the
performance of our implementation on a simulation containing 2.0 x 10^7
galaxies. These results bear out the promise of Bayesian analysis for upcoming
photometric large scale structure surveys with tens of millions of galaxies.
Comment: 17 pages, 12 figure
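The multiple-block Metropolis-Hastings idea mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the paper's sampler: it assumes a two-parameter correlated Gaussian target in place of the joint posterior over density field and galaxy positions, and the names `log_target` and `block_mh` are hypothetical. It only shows the mechanism of updating one block of parameters at a time while holding the others fixed.

```python
import math
import random


def log_target(x, y, rho=0.8):
    """Log-density (up to a constant) of a zero-mean bivariate Gaussian
    with unit variances and correlation rho. Toy stand-in for a posterior."""
    return -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho * rho))


def block_mh(n_steps, step=1.0, seed=0):
    """Two-block random-walk Metropolis-Hastings: each sweep proposes a new
    value for x with y held fixed, then a new value for y with x held fixed."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Block 1: symmetric Gaussian proposal for x only.
        x_new = x + rng.gauss(0.0, step)
        log_u = math.log(1.0 - rng.random())  # argument lies in (0, 1]
        if log_u < log_target(x_new, y) - log_target(x, y):
            x = x_new
            accepted += 1
        # Block 2: symmetric Gaussian proposal for y only.
        y_new = y + rng.gauss(0.0, step)
        log_u = math.log(1.0 - rng.random())
        if log_u < log_target(x, y_new) - log_target(x, y):
            y = y_new
            accepted += 1
        samples.append((x, y))
    # Acceptance rate is averaged over both block updates.
    return samples, accepted / (2.0 * n_steps)


if __name__ == "__main__":
    draws, rate = block_mh(20000)
    mean_x = sum(s[0] for s in draws) / len(draws)
    print(f"acceptance rate: {rate:.2f}, mean of x: {mean_x:.3f}")
```

Because each proposal is symmetric, the acceptance test reduces to comparing log-density differences. In a realistic setting the blocks would be chosen so that each conditional update is cheap to evaluate.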
Deep Partial Least Squares for Instrumental Variable Regression
In this paper, we propose deep partial least squares for the estimation of
high-dimensional nonlinear instrumental variable regression. As a precursor to
a flexible deep neural network architecture, our methodology uses partial least
squares for dimension reduction and feature selection from the set of
instruments and covariates. A central theoretical result, due to Brillinger
(2012), shows that the feature selection provided by partial least squares is
consistent and the weights are estimated up to a proportionality constant. We
illustrate our methodology on synthetic datasets with a sparse and correlated
network structure and apply it to the effect of childbearing on the
mother's labor supply, using the classic data of Angrist and Evans (1996). The
results on synthetic data as well as applications show that the deep partial
least squares method significantly outperforms other related methods. Finally,
we conclude with directions for future research.