When a machine learning model produces an unsatisfactory prediction, it is
crucial to investigate the underlying reasons and to explore whether the
outcome can be reversed. We ask: can we flip the prediction on a test point
$x_t$ by relabeling the smallest subset $S_t$ of the training data before the
model is trained? We propose an efficient procedure that identifies and
relabels such a subset via an extended influence function. We find
that relabeling fewer than 1% of the training points can often flip the model's
prediction. This mechanism can serve multiple purposes: (1) providing an
approach to challenge a model prediction by recovering influential training
subsets; (2) evaluating model robustness via the cardinality of the subset
(i.e., $|S_t|$); we show that $|S_t|$ is strongly related to the noise ratio
of the training set, and that $|S_t|$ is correlated with but complementary to
predicted probabilities; and (3) revealing training points that lead to group
attribution bias. To the best of our knowledge, we are the first
to investigate identifying and relabeling the minimal training subset required
to flip a given prediction.
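
To make the idea concrete, here is a minimal sketch of how such an
influence-based relabeling procedure could look for L2-regularized logistic
regression with labels in $\{0, 1\}$. The function names (`flip_influences`,
`smallest_flip_set`), the first-order Newton-step approximation, and the
greedy accumulation are illustrative assumptions, not the paper's exact
extended influence function.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def flip_influences(X, y, x_test, C=1.0):
    """Approximate how flipping each training label would move the test
    margin theta^T x_test, via a one-step (Newton) influence estimate:
        dtheta_i ~= -H^{-1} (grad with flipped label - grad with y_i),
    where H is the Hessian of sklearn's objective 0.5||w||^2 + C*sum(loss).
    """
    n, d = X.shape
    clf = LogisticRegression(C=C, fit_intercept=False).fit(X, y)
    theta = clf.coef_.ravel()
    p = 1.0 / (1.0 + np.exp(-(X @ theta)))          # P(y=1 | x_i)
    H = np.eye(d) + C * (X * (p * (1 - p))[:, None]).T @ X
    H_inv = np.linalg.inv(H)
    # Flipping y_i in {0,1} changes point i's loss gradient by C*(2*y_i - 1)*x_i.
    grad_delta = C * (2 * y - 1)[:, None] * X
    dtheta = -grad_delta @ H_inv                    # row i: approx. parameter shift
    return dtheta @ x_test, float(x_test @ theta)   # per-point margin change, margin

def smallest_flip_set(X, y, x_test, C=1.0):
    """Greedily relabel the points whose flips push the test margin hardest
    toward the opposite sign, until the approximated margin flips."""
    dmargin, margin = flip_influences(X, y, x_test, C)
    order = np.argsort(dmargin if margin > 0 else -dmargin)
    subset, total = [], margin
    for i in order:
        if (dmargin[i] >= 0) == (margin > 0):
            break                                   # remaining flips cannot help
        subset.append(int(i))
        total += dmargin[i]
        if (total > 0) != (margin > 0):
            return subset                           # predicted sign has flipped
    return None                                     # no flip under this approximation
```

Because the per-point effects are simply summed, this sketch ignores
interactions between flips; in practice one would retrain the model (or
re-estimate the influences) after relabeling the selected subset to verify
that the test prediction actually changes.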