A ubiquitous challenge in machine learning is domain generalisation: models
often fail on data distributions unseen during training, which can exacerbate
bias against groups or labels that are underrepresented in the datasets used
for model development. Model bias can
lead to unintended harms, especially in safety-critical applications like
healthcare. Furthermore, the challenge is compounded by the difficulty of
obtaining labelled data due to high cost or lack of readily available domain
expertise. In our work, we show that learning realistic augmentations
automatically from data is possible in a label-efficient manner using
generative models. In particular, we leverage the higher abundance of
unlabelled data to capture the underlying data distribution of different
conditions and subgroups for an imaging modality. By conditioning generative
models on appropriate labels, we can steer the distribution of synthetic
examples according to specific requirements. We demonstrate that these learned
augmentations can surpass heuristic ones by making models more robust and
statistically fair in- and out-of-distribution. To evaluate the generality of
our approach, we study three distinct medical imaging contexts of varying
difficulty: (i) histopathology images from a publicly available generalisation
benchmark, (ii) chest X-rays from publicly available clinical datasets, and
(iii) dermatology images characterised by complex shifts and imaging
conditions. Complementing real training samples with synthetic ones improves
the robustness of models in all three medical tasks and increases fairness by
improving the accuracy of diagnosis within underrepresented groups. This
approach yields marked improvements out of distribution across modalities: a
7.7% gain in prediction accuracy in histopathology; a 5.2% gain in chest
radiology, with a 44.6% reduction in the fairness gap; and a striking 63.5%
improvement in high-risk sensitivity for dermatology, with a 7.5× reduction in
the fairness gap.
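At a high level, the recipe described above amounts to sampling synthetic examples for the (label, subgroup) cells that real data underrepresents and mixing them into training. The sketch below illustrates this balancing step only; `sample_synthetic` is a hypothetical stand-in for the learned label- and group-conditioned generative model, and all names and numbers are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_synthetic(label, group, n, dim=8):
    # Hypothetical stand-in for a conditional generative model:
    # returns n synthetic feature vectors for the given (label, group).
    return rng.normal(loc=label + 0.1 * group, scale=1.0, size=(n, dim))

def augment_to_balance(X, y, g, target_per_cell):
    # Top up each (label, group) cell with synthetic samples so every
    # cell contains at least target_per_cell training examples.
    X_parts, y_parts, g_parts = [X], [y], [g]
    for label in np.unique(y):
        for group in np.unique(g):
            n_real = int(np.sum((y == label) & (g == group)))
            n_needed = max(0, target_per_cell - n_real)
            if n_needed:
                X_parts.append(sample_synthetic(label, group, n_needed, X.shape[1]))
                y_parts.append(np.full(n_needed, label))
                g_parts.append(np.full(n_needed, group))
    return np.vstack(X_parts), np.concatenate(y_parts), np.concatenate(g_parts)

# Toy dataset in which group 1 is heavily underrepresented.
X = rng.normal(size=(100, 8))
y = rng.integers(0, 2, size=100)
g = np.where(rng.random(100) < 0.9, 0, 1)

X2, y2, g2 = augment_to_balance(X, y, g, target_per_cell=40)
for label in (0, 1):
    for group in (0, 1):
        assert np.sum((y2 == label) & (g2 == group)) >= 40
```

The augmented arrays would then be fed to an ordinary classifier; the key design choice is that synthesis is steered per cell, rather than applying one heuristic augmentation uniformly.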