Rates of missing data often depend on record-keeping policies and thus may
change across times and locations, even when the underlying features are
comparatively stable. In this paper, we introduce the problem of Domain
Adaptation under Missingness Shift (DAMS). Here, (labeled) source data and
(unlabeled) target data would be exchangeable but for different missing data
mechanisms. We show that when missing data indicators are available, DAMS can
reduce to covariate shift. Focusing on the setting where missing data
indicators are absent, we establish the following theoretical results for
underreporting completely at random: (i) covariate shift is violated
(adaptation is required); (ii) the optimal source predictor can perform worse
on the target domain than a constant one; (iii) the optimal target predictor
can be identified, even when the missingness rates themselves are not; and (iv)
for linear models, a simple analytic adjustment yields consistent estimates of
the optimal target parameters. In experiments on synthetic and semi-synthetic
data, we demonstrate the promise of our methods when assumptions hold. Finally,
we discuss a rich family of future extensions.
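
To make result (i) concrete, the following is a minimal sketch under assumptions that are ours for illustration and not taken from the paper: a single binary feature, a binary label, and underreporting that records a true 1 as 0 with a fixed probability, independently of everything else. The function name and the numeric values are hypothetical. The sketch applies Bayes' rule to show that the label distribution given an observed 0 depends on the missingness rate, so the conditional distribution of the label given the observed feature differs between source and target.

```python
def p_y_given_observed_zero(p_x1, p_y_x0, p_y_x1, miss_rate):
    """P(y=1 | observed x = 0) when a true x = 1 is recorded as 0
    with probability miss_rate (illustrative binary setting)."""
    # Observed zeros are a mixture of true zeros and underreported ones.
    mass_true_zero = 1.0 - p_x1
    mass_hidden_one = miss_rate * p_x1
    # Bayes' rule: weight each label probability by its share of observed zeros.
    numerator = p_y_x0 * mass_true_zero + p_y_x1 * mass_hidden_one
    return numerator / (mass_true_zero + mass_hidden_one)


# Hypothetical values: P(x=1) = 0.5, P(y=1 | x=0) = 0.1, P(y=1 | x=1) = 0.9.
source = p_y_given_observed_zero(0.5, 0.1, 0.9, miss_rate=0.0)
target = p_y_given_observed_zero(0.5, 0.1, 0.9, miss_rate=0.5)
print(source, target)
```

With these values the conditional probability is 0.1 when nothing is underreported and roughly 0.37 when half of the true ones are underreported, even though the underlying joint distribution of feature and label is unchanged.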