Identifying Spurious Biases Early in Training through the Lens of Simplicity Bias

Dziugaite, Gintare Karolina; Gan, Eric; Mirzasoleiman, Baharan; Yang, Yu

Authors: Gintare Karolina Dziugaite, Eric Gan, Baharan Mirzasoleiman, Yu Yang
Publication date: 30 May 2023
Abstract

Neural networks trained with (stochastic) gradient descent have an inductive bias towards learning simpler solutions. This makes them prone to learning simple spurious features that are highly correlated with a label, instead of the predictive but more complex core features. In this work, we show that the simplicity bias of gradient descent can be leveraged to identify spurious correlations early in training. First, we prove for a two-layer neural network that groups of examples with high spurious correlation are separable based on the model's output in the initial training iterations. We further show that if spurious features have a small enough noise-to-signal ratio, the network's output on the majority of examples in a class will be almost exclusively determined by the spurious features and will be nearly invariant to the core feature. Finally, we propose SPARE, which separates large groups with spurious correlations early in training and uses importance sampling to alleviate the spurious correlation by balancing the group sizes. We show that SPARE achieves up to 5.6% higher worst-group accuracy than state-of-the-art methods, while being up to 12x faster. We also show the applicability of SPARE to discovering and mitigating spurious correlations in Restricted ImageNet.
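The two-step idea in the abstract (separate groups by the model's early-training output, then balance group sizes through importance sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how groups are separated or how the sampling weights are computed, so the use of k-means within each class, the inverse-group-size weights, and the names `kmeans` and `balanced_sampling_weights` are all assumptions made here for illustration.

```python
import numpy as np


def kmeans(x, k, n_iter=50):
    """Plain Lloyd's k-means on the rows of x (assumed clustering choice).

    Uses a deterministic farthest-first initialisation so results are
    reproducible. Returns one cluster index per row.
    """
    x = np.asarray(x, dtype=float)
    centers = [x[0]]
    for _ in range(k - 1):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers, dtype=float)

    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(axis=0)
    return assign


def balanced_sampling_weights(outputs, labels, k=2):
    """Cluster early-training model outputs within each class, then weight
    every example by the inverse of its cluster size (assumed weighting).

    outputs: array of shape (n_examples, output_dim), model outputs taken
             after a few training iterations.
    labels:  integer class label per example.
    Returns (group index per example, sampling probability per example).
    """
    outputs = np.asarray(outputs, dtype=float)
    labels = np.asarray(labels)
    groups = np.zeros(len(labels), dtype=int)
    weights = np.zeros(len(labels), dtype=float)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        assign = kmeans(outputs[idx], k)
        sizes = np.bincount(assign, minlength=k)
        groups[idx] = assign
        # Every non-empty cluster receives the same total weight.
        weights[idx] = 1.0 / sizes[assign]
    return groups, weights / weights.sum()
```

The returned probabilities could be passed to a weighted sampler (for example `numpy.random.Generator.choice` with its `p` argument) so that small groups are drawn as often as large ones during the rest of training.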

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2305.18761

Last time updated on 02/06/2023