Survey on Sociodemographic Bias in Natural Language Processing

Gupta, Vipul; Passonneau, Rebecca J.; Venkit, Pranav Narayanan; Wilson, Shomir

Publication date: 26 June 2023

Abstract

Deep neural networks often learn unintended biases during training, which might have harmful effects when deployed in real-world settings. This paper surveys 209 papers on bias in NLP models, most of which address sociodemographic bias. To better understand the distinction between bias and real-world harm, we turn to ideas from psychology and behavioral economics to propose a definition for sociodemographic bias. We identify three main categories of NLP bias research: types of bias, quantifying bias, and debiasing. We conclude that current approaches to quantifying bias face reliability issues, that many of the bias metrics do not relate to real-world biases, and that current debiasing techniques are superficial and hide bias rather than removing it. Finally, we provide recommendations for future work.

Comment: 23 pages, 1 figure

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2306.08158

Last time updated on 02/07/2023