4,866 research outputs found

    DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

    Causal mediation analysis can unpack the black box of causality and is therefore a powerful tool for disentangling causal pathways in biomedical and social sciences, and also for evaluating machine learning fairness. To reduce bias in estimating Natural Direct and Indirect Effects in mediation analysis, we propose a new method called DeepMed that uses deep neural networks (DNNs) to cross-fit the infinite-dimensional nuisance functions in the efficient influence functions. We obtain novel theoretical results that our DeepMed method (1) can achieve the semiparametric efficiency bound without imposing sparsity constraints on the DNN architecture and (2) can adapt to certain low-dimensional structures of the nuisance functions, significantly advancing the existing literature on DNN-based semiparametric causal inference. Extensive synthetic experiments are conducted to support our findings and also expose the gap between theory and practice. As a proof of concept, we apply DeepMed to analyze two real datasets on machine learning fairness and reach conclusions consistent with previous findings. Comment: Accepted by NeurIPS 202
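The cross-fitting idea mentioned in the abstract above can be illustrated with a much simpler estimand. The following is a minimal sketch, not the DeepMed method: it assumes an average treatment effect (rather than natural direct and indirect effects), linear and logistic nuisance models (rather than DNNs), and synthetic data. All function names (`simulate`, `fit_ols`, `fit_logistic`, `cross_fit_aipw`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def simulate(n=4000, seed=0):
    # Hypothetical synthetic data; the true average treatment effect is 2.0.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, 2))
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * X[:, 0])))
    y = 2.0 * t + X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)
    return X, t, y

def fit_ols(X, y):
    # Outcome-regression nuisance: ordinary least squares with an intercept.
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ beta

def fit_logistic(X, t, steps=25):
    # Propensity nuisance: logistic regression fitted by Newton iterations.
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        grad = Xb.T @ (t - p)
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + 1e-6 * np.eye(Xb.shape[1])
        w += np.linalg.solve(hess, grad)
    return lambda Z: 1.0 / (1.0 + np.exp(-np.column_stack([np.ones(len(Z)), Z]) @ w))

def cross_fit_aipw(X, t, y, n_folds=5, seed=0):
    # Cross-fitting: nuisances are fitted on the other folds, then the
    # influence-function score is evaluated on the held-out fold.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    psi = np.empty(len(y))
    for k, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != k])
        Xtr, ttr, ytr = X[train], t[train], y[train]
        mu1 = fit_ols(Xtr[ttr == 1], ytr[ttr == 1])
        mu0 = fit_ols(Xtr[ttr == 0], ytr[ttr == 0])
        e = fit_logistic(Xtr, ttr)
        Xte, tte, yte = X[test], t[test], y[test]
        p = np.clip(e(Xte), 0.01, 0.99)  # guard against extreme propensities
        m1, m0 = mu1(Xte), mu0(Xte)
        psi[test] = m1 - m0 + tte * (yte - m1) / p - (1 - tte) * (yte - m0) / (1 - p)
    # Point estimate and standard error from the averaged scores.
    return psi.mean(), psi.std(ddof=1) / np.sqrt(len(y))

if __name__ == "__main__":
    X, t, y = simulate()
    estimate, std_err = cross_fit_aipw(X, t, y)
    print(f"ATE estimate: {estimate:.3f} (SE {std_err:.3f})")
```

Evaluating each score only on data not used to fit its nuisance functions is what allows flexible learners to be plugged in without overfitting bias dominating the estimate; the mediation setting in the abstract uses the same pattern with more nuisance functions.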

    DESCN: Deep Entire Space Cross Networks for Individual Treatment Effect Estimation

    Causal inference has wide applications in various areas such as E-commerce and precision medicine, and its performance heavily relies on the accurate estimation of the Individual Treatment Effect (ITE). Conventionally, ITE is predicted by modeling the treated and control response functions separately in their individual sample spaces. However, such an approach usually encounters two issues in practice, i.e., divergent distributions between treated and control groups due to treatment bias, and significant sample imbalance of their population sizes. This paper proposes Deep Entire Space Cross Networks (DESCN) to model treatment effects from an end-to-end perspective. DESCN captures the integrated information of the treatment propensity, the response, and the hidden treatment effect through a cross network in a multi-task learning manner. Our method jointly learns the treatment and response functions in the entire sample space to avoid treatment bias and employs an intermediate pseudo treatment effect prediction network to relieve sample imbalance. Extensive experiments are conducted on a synthetic dataset and a large-scale production dataset from the E-commerce voucher distribution business. The results indicate that DESCN can successfully enhance the accuracy of ITE estimation and improve the uplift ranking performance. A sample of the production dataset and the source code are released to facilitate future research in the community; to the best of our knowledge, this is the first large-scale public biased treatment dataset for causal inference. Comment: Accepted by SIGKDD 2022 Applied Data Science Track

    Bayesian inference from photometric redshift surveys

    We show how to enhance the redshift accuracy of surveys consisting of tracers with highly uncertain positions along the line of sight. Photometric surveys with redshift uncertainty delta_z ~ 0.03 can yield final redshift uncertainties of delta_z_f ~ 0.003 in high density regions. This increased redshift precision is achieved by imposing an isotropy and 2-point correlation prior in a Bayesian analysis and is completely independent of the process that estimates the photometric redshift. As a byproduct, the method also infers the three dimensional density field, essentially super-resolving high density regions in redshift space. Our method fully takes into account the survey mask and selection function. It uses a simplified Poissonian picture of galaxy formation, relating preferred locations of galaxies to regions of higher density in the matter field. The method quantifies the remaining uncertainties in the three dimensional density field and the true radial locations of galaxies by generating samples that are constrained by the survey data. The exploration of this high dimensional, non-Gaussian joint posterior is made feasible using multiple-block Metropolis-Hastings sampling. We demonstrate the performance of our implementation on a simulation containing 2.0 x 10^7 galaxies. These results bear out the promise of Bayesian analysis for upcoming photometric large scale structure surveys with tens of millions of galaxies. Comment: 17 pages, 12 figures
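The block-wise Metropolis-Hastings sampling mentioned in the abstract above can be sketched on a toy problem. This is only an illustration of the general technique, not the paper's sampler: it assumes a two-dimensional correlated Gaussian target and symmetric random-walk proposals, and the names `log_target` and `block_mh` are hypothetical.

```python
import numpy as np

# Toy target (an assumption for illustration): bivariate Gaussian with
# mean (1, -1), unit variances, and correlation 0.8.
MEAN = np.array([1.0, -1.0])
PRECISION = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))

def log_target(theta):
    # Unnormalised log density of the toy target.
    d = theta - MEAN
    return -0.5 * d @ PRECISION @ d

def block_mh(log_p, theta0, blocks, step, n_iter, seed=0):
    # Each sweep updates one block of coordinates at a time with a
    # random-walk proposal, keeping the other blocks fixed.
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    lp = log_p(theta)
    samples = np.empty((n_iter, theta.size))
    for i in range(n_iter):
        for block in blocks:
            proposal = theta.copy()
            proposal[block] += step * rng.standard_normal(len(block))
            lp_prop = log_p(proposal)
            # Symmetric proposal, so the acceptance ratio is the density ratio.
            if np.log(rng.uniform()) < lp_prop - lp:
                theta, lp = proposal, lp_prop
        samples[i] = theta
    return samples

if __name__ == "__main__":
    draws = block_mh(log_target, [0.0, 0.0], blocks=[[0], [1]], step=1.0, n_iter=20000)
    print("posterior mean estimate:", draws[2000:].mean(axis=0))
```

Splitting the parameters into blocks lets each block use a proposal suited to its own scale, which is one common reason to prefer block updates over a single joint proposal in high-dimensional posteriors.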

    Deep Partial Least Squares for Instrumental Variable Regression

    In this paper, we propose deep partial least squares for the estimation of high-dimensional nonlinear instrumental variable regression. As a precursor to a flexible deep neural network architecture, our methodology uses partial least squares for dimension reduction and feature selection from the set of instruments and covariates. A central theoretical result, due to Brillinger (2012), shows that the feature selection provided by partial least squares is consistent and the weights are estimated up to a proportionality constant. We illustrate our methodology with synthetic datasets with a sparse and correlated network structure and draw applications to the effect of childbearing on the mother's labor supply based on classic data of Angrist and Evans (1996). The results on synthetic data as well as applications show that the deep partial least squares method significantly outperforms other related methods. Finally, we conclude with directions for future research.