Robustness of predictive deep models is a challenging problem with many implications.
It is of particular importance when models are used in safety-critical applications,
such as healthcare. However, there is as yet no agreement on a comprehensive definition
of what it means for a model to be robust, nor a theory of why these issues arise.
Given the general nature of the problem, existing work related to robustness is spread
across different areas of research. This work has considered a range of robustness
aspects: robustness to small input perturbations, arising from the study of adversarial
examples; robustness across different domains for the same task; and robustness issues
stemming from, for example, object placement, transplanting, lighting, weather
conditions, or object style.
This thesis explores a formulation of robustness in terms of the assumed structural causal
model (SCM) which generates the observed data. The SCM allows these different types of
robustness issues to be viewed in a unifying way. Using this view, this work furthers
the connection between prediction robustness and the assumed structural causal model by
suggesting that optimising for prediction performance across a diverse set of distributions
from the same SCM will move the model closer to the causal predictor of the target variable,
providing a theoretical foundation to optimise purely for prediction in the setting where
training and testing data are not independently and identically distributed.
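The intuition can be illustrated with a toy simulation. In the hypothetical SCM below (constructed for illustration only, not taken from the thesis), a causal parent X_c generates Y, which in turn generates a spurious descendant X_s whose mechanism varies across environments; regressing Y on the causal parent yields a stable coefficient across all distributions from the SCM, whereas the coefficient on the spurious feature shifts with the environment.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_environment(n, spurious_scale):
    """Sample one distribution from a shared, hypothetical SCM.

    Assumed causal structure (for illustration only):
      X_c -> Y -> X_s
    Only the mechanism generating X_s varies across environments.
    """
    x_c = rng.normal(size=n)                             # causal parent of Y
    y = 2.0 * x_c + 0.1 * rng.normal(size=n)             # invariant mechanism
    x_s = spurious_scale * y + 0.1 * rng.normal(size=n)  # environment-dependent effect
    return x_c, x_s, y

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on a single feature x."""
    return float(np.cov(x, y, bias=True)[0, 1] / np.var(x))

for scale in (0.5, 1.0, 2.0):                            # three environments
    x_c, x_s, y = sample_environment(10_000, scale)
    print(f"scale={scale}: slope on X_c = {ols_slope(x_c, y):.2f}, "
          f"slope on X_s = {ols_slope(x_s, y):.2f}")
```

The slope on X_c stays near 2.0 in every environment, while the slope on X_s drifts with the environment; a predictor optimised for performance across all three distributions is therefore pushed towards the invariant, causal coefficient.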
Formulating robustness in this way suggests that large deep models should, in general,
be more susceptible to robustness issues; while some of these issues have been observed
in applications such as computer vision, they have been less discussed in others. We
investigate the robustness of state-of-the-art (SotA) deep classifiers in human activity
recognition using a newly proposed benchmark informed by the causal formulation, and show
that a simpler model is at least as robust as SotA deep models whilst being at least ten
times faster to train. The causal view of robustness additionally hints at the idea that
less data can be beneficial for robustness, contrary to popular belief that more data
is always better. To test this idea, a data selection algorithm is proposed based on
inverting the idea of a popular causal inference procedure for tabular data. The robustness
of a model trained on the selected subset of data is evaluated through synthetic and
semi-synthetic data experiments. Under certain conditions the data subset improves
robustness and, consequently, data efficiency.