Stochastic Doubly Robust Gradient
When training a machine learning model with observational data, it is common to find that some values are systematically missing. Learning from incomplete data in which the missingness depends on covariates may lead to biased parameter estimates and may even harm the fairness of decision outcomes. This paper proposes a way to adjust for the causal effect of covariates on the missingness when training models with stochastic gradient descent (SGD).
Inspired by the design of the doubly robust estimator and its theoretical property of double robustness, we introduce the stochastic doubly robust gradient (SDRG), which consists of two models: weight-corrected gradients for inverse propensity score weighting and per-covariate control variates for regression adjustment.
We also identify the connection between double robustness and variance reduction in SGD by placing the SDRG algorithm within a unifying framework for variance-reduced SGD. We empirically evaluate our approach by demonstrating its convergence when training image classifiers on several examples of missing data.

Comment: 9 pages, 2 figures
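To make the construction concrete, the following is a minimal NumPy sketch of one doubly robust SGD step under the standard doubly robust form: an inverse-propensity-weighted correction added to a regression-based (control variate) gradient prediction. It is an illustration of the general technique, not the paper's actual algorithm or interface; all names here (`sdrg_step`, `control_variate`, `propensity`, `grad_fn`) are hypothetical.

```python
import numpy as np

def sdrg_step(theta, x, y, observed, propensity, control_variate, grad_fn, lr=0.1):
    """One doubly robust SGD step on a minibatch with partially missing labels.

    theta           : current parameter vector, shape (d,)
    x               : covariates, shape (n, p)
    y               : labels, shape (n,); entries where observed is False are unused
    observed        : boolean mask, shape (n,); True where y is actually observed
    propensity      : estimated P(observed | x) per sample, shape (n,)
    control_variate : callable (x, theta) -> predicted per-sample gradients, shape (n, d)
    grad_fn         : callable (theta, x, y) -> per-sample loss gradients, shape (m, d)
    """
    # Regression-adjustment term: a gradient prediction defined for every sample.
    g_hat = control_variate(x, theta)

    # True per-sample gradients are only computable where the label is observed.
    g = np.zeros_like(g_hat)
    g[observed] = grad_fn(theta, x[observed], y[observed])

    # Inverse propensity weights; zero where the label is missing, so unobserved
    # samples contribute only through the control variate g_hat.
    w = observed.astype(float) / propensity

    # Doubly robust combination: unbiased if either the propensity model or
    # the regression (control variate) model is correctly specified.
    dr_grad = (w[:, None] * (g - g_hat) + g_hat).mean(axis=0)

    return theta - lr * dr_grad


# Toy usage: linear model with squared loss, missingness depending on a covariate.
rng = np.random.default_rng(0)
n, p = 256, 5
x = rng.normal(size=(n, p))
theta = np.zeros(p)
y = x @ rng.normal(size=p) + 0.1 * rng.normal(size=n)
prop = 1.0 / (1.0 + np.exp(-x[:, 0]))       # P(observed | x)
obs = rng.uniform(size=n) < prop

grad_fn = lambda th, xb, yb: 2 * (xb @ th - yb)[:, None] * xb
cv = lambda xb, th: np.zeros((xb.shape[0], th.size))  # trivial control variate: reduces to pure IPW
theta = sdrg_step(theta, x, y, obs, prop, cv, grad_fn)
```

With the trivial zero control variate the step reduces to pure inverse propensity weighting; a better-fitted `control_variate` would both supply the second leg of the double robustness guarantee and act as a variance reducer, which is the connection the abstract draws to variance-reduced SGD.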