8 research outputs found
Shift happens: how can machine learning systems be best prepared?
Machine learning systems have made headlines in recent years, defeating world champions in Go, enhancing medical diagnoses, and redefining how we work with tools like ChatGPT. However, despite these impressive feats, machine learning systems remain fragile when faced with test data that differs from their training data. This fragility stems from a fundamental mismatch between textbook machine-learning methods and their real-world application. While textbook methods assume that the conditions under which a system is developed are similar to those in which it is deployed, in reality, systems tend to be developed under one set of conditions (e.g., in a lab) and deployed to another (e.g., a clinic). As a result, many machine learning systems are not prepared for the condition differences or distribution shifts they face upon deployment, leading to some high-profile and costly failures. For safety-critical settings like healthcare and autonomous driving, such failures represent a major barrier to real-world deployment.
In this thesis, I argue that we must first accept that shift happens, and subsequently focus on how we can best prepare. To do so, I present four of my works that illustrate how machine learning systems can be prepared for (and adapted to) real-world distribution shifts. Together, these contributions take us closer to reliable machine learning systems that can be deployed in safety-critical settings.
In the first work, the setting is source-free domain adaptation, i.e., adapting a model to unlabelled test data without the original training data. Here, we prepare for a change in measurement device (e.g., X-rays from a different scanner) by storing lightweight statistics of the training data. By restoring these statistics on the test data, we see improved accuracy, calibration and data efficiency over prior methods.
In the second work, the setting is domain generalisation, i.e., performing well on test data from new environments or domains by leveraging data from multiple related domains at training time. Here, we prepare for more flexible and unknown changes by exploiting invariances across the training domains that hold with high probability in unseen test domains. In particular, by minimising a particular quantile of a model's performance distribution over domains, we learn models that perform well with the corresponding probability.
In the third work, the setting is again domain generalisation, but this time we focus on ways to harness so-called "spurious" features without test-domain labels. In particular, we show that predictions based on invariant/stable features can be used to adapt our usage of spurious/unstable features to new test domains, so long as the stable and unstable features are complementary (i.e., conditionally independent given the label). By safely harnessing complementary spurious features, we boost performance without sacrificing robustness.
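The adaptation idea in this third work can be illustrated with a small toy example. This is my own hypothetical construction, not the paper's actual setup or algorithm: a stable feature pseudo-labels the unlabelled test domain, and those pseudo-labels are used to re-estimate how an unstable feature relates to the label there, even though that relationship has flipped relative to training.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
y = rng.integers(0, 2, n)
stable = y + rng.normal(0.0, 0.8, n)          # stable feature: same label relation in every domain
unstable = (1 - y) + rng.normal(0.0, 0.3, n)  # unstable feature: relation flipped in this test domain

# Step 1: pseudo-label the test domain using the stable feature alone.
pseudo = (stable > 0.5).astype(int)

# Step 2: re-estimate the unstable feature's class-conditional means
# under the pseudo-labels (possible because the features are complementary).
mu0 = unstable[pseudo == 0].mean()
mu1 = unstable[pseudo == 1].mean()

# Step 3: classify with the adapted unstable-feature model (nearest class mean).
adapted = (np.abs(unstable - mu1) < np.abs(unstable - mu0)).astype(int)

acc_stable = (pseudo == y).mean()
acc_adapted = (adapted == y).mean()
```

In this toy, the adapted unstable-feature classifier outperforms the stable-only one because the unstable feature is less noisy here; the point is only that noisy stable-feature predictions suffice to re-calibrate an unstable feature in a new domain.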
Finally, in the fourth work, the setting is disentangled representation learning which, in the context of this thesis, can be viewed as preparing for a change in the task itself by recovering and separating the underlying factors of variation. To this end, we extend an existing evaluation framework by first introducing a measure of representation explicitness or ease of use, and then connecting the framework to identifiability.
Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views
Learning object-centric representations of multi-object scenes is a promising
approach towards machine intelligence, facilitating high-level reasoning and
control from visual sensory data. However, current approaches for unsupervised
object-centric scene representation are incapable of aggregating information
from multiple observations of a scene. As a result, these "single-view" methods
form their representations of a 3D scene based only on a single 2D observation
(view). Naturally, this leads to several inaccuracies, with these methods
falling victim to single-view spatial ambiguities. To address this, we propose
the Multi-View and Multi-Object Network (MulMON) -- a method for learning
accurate, object-centric representations of multi-object scenes by leveraging
multiple views. In order to sidestep the main technical difficulty of the
multi-object-multi-view scenario -- maintaining object correspondences across
views -- MulMON iteratively updates the latent object representations for a
scene over multiple views. To ensure that these iterative updates do indeed
aggregate spatial information to form a complete 3D scene understanding, MulMON
is asked to predict the appearance of the scene from novel viewpoints during
training. Through experiments, we show that MulMON better resolves spatial
ambiguities than single-view methods -- learning more accurate and disentangled
object representations -- and also achieves new functionality in predicting
object segmentations for novel viewpoints.
Comment: Accepted at NeurIPS 2020 (Spotlight).
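MulMON's key mechanism, iteratively refining latent object representations one view at a time, can be sketched with a much simpler analogue. The following is my own toy illustration, not MulMON's actual update: a 1-D Gaussian belief over an object property is sharpened by each successive noisy view of the scene.

```python
import numpy as np

def update(mean, var, obs, obs_var):
    # Standard Gaussian posterior update: each new view sharpens the belief.
    k = var / (var + obs_var)
    return mean + k * (obs - mean), (1 - k) * var

true_pos, obs_var = 2.0, 0.5   # hypothetical object position and view noise
mean, var = 0.0, 10.0          # vague prior before any views
prior_var = var

rng = np.random.default_rng(0)
for _ in range(5):             # aggregate five views of the same scene
    obs = true_pos + rng.normal(0.0, obs_var ** 0.5)
    mean, var = update(mean, var, obs, obs_var)
```

After five views the posterior variance has collapsed well below both the prior and the single-view noise level, mirroring how multi-view aggregation resolves ambiguities that a single view cannot.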
Source-Free Adaptation to Measurement Shift via Bottom-Up Feature Restoration
Source-free domain adaptation (SFDA) aims to adapt a model trained on
labelled data in a source domain to unlabelled data in a target domain without
access to the source-domain data during adaptation. Existing methods for SFDA
leverage entropy-minimization techniques which: (i) apply only to
classification; (ii) destroy model calibration; and (iii) rely on the source
model achieving a good level of feature-space class-separation in the target
domain. We address these issues for a particularly pervasive type of domain
shift called measurement shift which can be resolved by restoring the source
features rather than extracting new ones. In particular, we propose Feature
Restoration (FR) wherein we: (i) store a lightweight and flexible approximation
of the feature distribution under the source data; and (ii) adapt the
feature-extractor such that the approximate feature distribution under the
target data realigns with that saved on the source. We additionally propose a
bottom-up training scheme, which we call Bottom-Up Feature Restoration (BUFR),
that boosts performance. On real and synthetic data, we demonstrate that
BUFR outperforms existing SFDA methods in terms of accuracy, calibration, and
data efficiency, while being less reliant on the performance of the source
model in the target domain.
Comment: ICLR 2022 (Spotlight).
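The core of the approach above, store lightweight source-feature statistics and realign target features to them, can be sketched in a few lines. This is a minimal sketch under my own simplifying assumptions (per-dimension means and variances as the "lightweight approximation", and a simulated measurement shift), not the paper's actual approximation or training loop.

```python
import numpy as np

def feature_stats(feats):
    # Lightweight approximation of the feature distribution:
    # per-dimension mean and variance (assumed stand-in for the paper's choice).
    return feats.mean(axis=0), feats.var(axis=0)

def restoration_loss(target_feats, src_mean, src_var):
    # Penalise divergence of target feature statistics from the stored source ones;
    # adapting the feature extractor would minimise this on target data.
    t_mean, t_var = feature_stats(target_feats)
    return float(np.sum((t_mean - src_mean) ** 2) + np.sum((t_var - src_var) ** 2))

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(1000, 8))      # source-domain features
src_mean, src_var = feature_stats(src)          # stored before source data is discarded

tgt = src * 1.5 + 0.3                           # a simulated measurement shift
shifted_loss = restoration_loss(tgt, src_mean, src_var)
restored_loss = restoration_loss((tgt - 0.3) / 1.5, src_mean, src_var)
```

Here the shift is undone in closed form purely for illustration; in practice the feature extractor is adapted by gradient descent to drive this kind of realignment loss down.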
Self-Supervised Disentanglement by Leveraging Structure in Data Augmentations
Self-supervised representation learning often uses data augmentations to
induce some invariance to "style" attributes of the data. However, with
downstream tasks generally unknown at training time, it is difficult to deduce
a priori which attributes of the data are indeed "style" and can be safely
discarded. To address this, we introduce a more principled approach that seeks
to disentangle style features rather than discard them. The key idea is to add
multiple style embedding spaces where: (i) each is invariant to all-but-one
augmentation; and (ii) joint entropy is maximized. We formalize our structured
data-augmentation procedure from a causal latent-variable-model perspective,
and prove identifiability of both content and (multiple blocks of) style
variables. We empirically demonstrate the benefits of our approach on synthetic
datasets and then present promising but limited results on ImageNet.
Probable Domain Generalization via Quantile Risk Minimization
Domain generalization (DG) seeks predictors which perform well on unseen test
distributions by leveraging data drawn from multiple related training
distributions or domains. To achieve this, DG is commonly formulated as an
average- or worst-case problem over the set of possible domains. However,
predictors that perform well on average lack robustness while predictors that
perform well in the worst case tend to be overly-conservative. To address this,
we propose a new probabilistic framework for DG where the goal is to learn
predictors that perform well with high probability. Our key idea is that
distribution shifts seen during training should inform us of probable shifts at
test time, which we realize by explicitly relating training and test domains as
draws from the same underlying meta-distribution. To achieve probable DG, we
propose a new optimization problem called Quantile Risk Minimization (QRM). By
minimizing the α-quantile of a predictor's risk distribution over domains,
QRM seeks predictors that perform well with probability α. To solve QRM
in practice, we propose the Empirical QRM (EQRM) algorithm and provide: (i) a
generalization bound for EQRM; and (ii) the conditions under which EQRM
recovers the causal predictor as α → 1. In our experiments, we
introduce a more holistic quantile-focused evaluation protocol for DG and
demonstrate that EQRM outperforms state-of-the-art baselines on datasets from
WILDS and DomainBed.
Comment: NeurIPS 2022 camera-ready (+ minor corrections).
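The quantity at the heart of the QRM objective is easy to sketch, assuming per-domain risks have already been computed. This is a minimal sketch of the quantile computation only, with hypothetical risk values; it is not the full EQRM algorithm, which also involves estimating the risk distribution and optimising the predictor.

```python
import numpy as np

def empirical_quantile_risk(per_domain_risks, alpha):
    # Empirical alpha-quantile of the per-domain risk distribution:
    # alpha = 0.5 targets the median domain; alpha -> 1 approaches the worst case.
    return float(np.quantile(per_domain_risks, alpha))

risks = np.array([0.10, 0.12, 0.15, 0.30, 0.90])  # hypothetical training-domain risks
median_risk = empirical_quantile_risk(risks, 0.5)
worst_risk = empirical_quantile_risk(risks, 1.0)
```

Minimising the median risk ignores the one badly-shifted domain (risk 0.90), while minimising the worst case is dominated by it; intermediate α values trade off between the two, which is exactly the knob QRM exposes.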
Dehydration Influences Mood and Cognition: A Plausible Hypothesis?
The hypothesis was considered that a low fluid intake disrupts cognition and mood. Most research has been carried out on young fit adults, who typically have exercised, often in heat. The results of these studies are inconsistent, preventing any conclusion. Even if the findings had been consistent, confounding variables such as fatigue and increased temperature make it unwise to extrapolate these findings. Thus in young adults there is little evidence that under normal living conditions dehydration disrupts cognition, although this may simply reflect a lack of relevant evidence. There remains the possibility that particular populations are at high risk of dehydration. It is known that renal function declines in many older individuals and thirst mechanisms become less effective. Although there are a few reports that more dehydrated older adults perform cognitive tasks less well, the body of information is limited and there has been little attempt to improve functioning by increasing hydration status. Although children are another potentially vulnerable group that has also been subject to little study, they are the group that has produced the only consistent findings in this area. Four intervention studies have found improved performance in children aged 7 to 9 years. In these studies, children eating and drinking as normal have been tested on occasions when they have and have not consumed a drink. After a drink, both memory and attention have been found to be improved.
Align-Deform-Subtract: An Interventional Framework for Explaining Object Differences
Given two object images, how can we explain their differences in terms of the
underlying object properties? To address this question, we propose
Align-Deform-Subtract (ADS) -- an interventional framework for explaining
object differences. By leveraging semantic alignments in image-space as
counterfactual interventions on the underlying object properties, ADS
iteratively quantifies and removes differences in object properties. The result
is a set of "disentangled" error measures which explain object differences in
terms of the underlying properties. Experiments on real and synthetic data
illustrate the efficacy of the framework.
Comment: ICLR 2022 Workshop on Objects, Structure and Causality.