Noisy training labels can hurt model performance. Most approaches that address
label noise assume it is independent of the input features.
In practice, however, label noise is often feature- or
\textit{instance-dependent}, and therefore biased (i.e., some instances are
more likely to be mislabeled than others). For example, in clinical care,
female patients are more likely than male patients to be under-diagnosed for
cardiovascular disease. Approaches that ignore this dependence can produce
models with poor discriminative performance and, in many healthcare settings,
can exacerbate health disparities. In light of these limitations,
we propose a two-stage approach to learning in the presence of
instance-dependent label noise. Our approach utilizes \textit{anchor points}, a
small subset of data for which we know both the observed and ground-truth
labels. On several tasks,
our approach leads to consistent improvements over the state-of-the-art in
discriminative performance (AUROC) while mitigating bias (area under the
equalized odds curve, AUEOC). For example, when predicting acute respiratory
failure onset on the MIMIC-III dataset, our approach achieves a harmonic mean
(AUROC and AUEOC) of 0.84 (SD [standard deviation] 0.01) while that of the next
best baseline is 0.81 (SD 0.01). Overall, in the presence of
instance-dependent label noise, our approach improves accuracy while mitigating
potential bias compared to existing approaches.
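(For reference, the reported summary metric is assumed here to be the standard
two-term harmonic mean of the two scores:
\[
\mathrm{HM}(\mathrm{AUROC}, \mathrm{AUEOC}) =
\frac{2 \cdot \mathrm{AUROC} \cdot \mathrm{AUEOC}}{\mathrm{AUROC} + \mathrm{AUEOC}}.
\]
This penalizes models that trade off one of the two criteria too heavily against
the other.)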