Mitigating Gender Bias in Machine Learning Data Sets

Author: A Caliskan
CR Caldas-Coulthard
E Shor
GM Vefali
H Motschenbacher
K Frith
M Pearce
N Garg
P Ingham
R Sigley
S Leavy
S Mills
S Mollin
S Romaine
W Martyna
Publication date: 18/05/2020

Artificial Intelligence has the capacity to amplify and perpetuate societal biases and presents profound ethical implications for society. Gender bias has been identified in the context of employment advertising and recruitment tools, due to their reliance on underlying language processing and recommendation algorithms. Attempts to address such issues have involved testing learned associations, integrating concepts of fairness to machine learning and performing more rigorous analysis of training data. Mitigating bias when algorithms are trained on textual data is particularly challenging given the complex way gender ideology is embedded in language. This paper proposes a framework for the identification of gender bias in training data for machine learning. The work draws upon gender theory and sociolinguistics to systematically indicate levels of bias in textual training data and associated neural word embedding models, thus highlighting pathways for both removing bias from training data and critically assessing its impact.

Comment: 10 pages, 5 figures, 5 Tables, Presented as Bias2020 workshop (as part of the ECIR Conference) - http://bias.disim.univaq.i
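The abstract mentions "testing learned associations" in word embedding models. The following is a minimal sketch of one generic way such a test can be framed, not the paper's framework: it compares a word's mean cosine similarity to a set of male-anchored vectors against its mean similarity to a set of female-anchored vectors. The function names, the anchor words, and the two-dimensional toy vectors are all assumptions made for illustration; a real analysis would load vectors from a trained embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def gender_association(word_vec, male_vecs, female_vecs):
    """Mean similarity to male anchors minus mean similarity to female anchors.

    Positive values mean the word sits closer to the male anchors,
    negative values mean it sits closer to the female anchors.
    """
    male = sum(cosine(word_vec, m) for m in male_vecs) / len(male_vecs)
    female = sum(cosine(word_vec, f) for f in female_vecs) / len(female_vecs)
    return male - female


# Hypothetical toy embeddings, invented for this example only.
toy = {
    "he": [1.0, 0.0],
    "she": [0.0, 1.0],
    "engineer": [0.9, 0.1],
    "nurse": [0.1, 0.9],
}

engineer_score = gender_association(toy["engineer"], [toy["he"]], [toy["she"]])
nurse_score = gender_association(toy["nurse"], [toy["he"]], [toy["she"]])
print(f"engineer: {engineer_score:+.3f}, nurse: {nurse_score:+.3f}")
```

With these toy vectors, "engineer" scores positive and "nurse" scores negative by construction. The score only flags an association; interpreting it still requires the kind of linguistic analysis the abstract describes.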

arXiv.org e-Print Archive

Crossref

Research Repository UCD

Directional Pairwise Class Confusion Bias and Its Mitigation

Author: Aygun Ramazan, PhD
Boardman Jonathan
Don Duleep Prasanna Rathgamage
Franks Bill
Johnston Sereres, PhD
Lee George
Modgil Girish, PhD
Sayenju Sudhashree
Sullivan Dan
Zhang Yifan, PhD
Publication venue: DigitalCommons@Kennesaw State University
Publication date: 01/03/2022

Recent advances in Natural Language Processing have led to powerful and sophisticated models like BERT (Bidirectional Encoder Representations from Transformers) that exhibit bias. These models are mostly trained on text corpora that deviate in important ways from the text encountered by a chatbot in a problem-specific context. While much past research has focused on measuring and mitigating bias with respect to protected attributes (stereotyping by gender, race, ethnicity, etc.), there is a lack of research on model bias with respect to classification labels. We investigate whether a classification model strongly favors one class over another. We introduce a bias evaluation method called directional pairwise class confusion bias that highlights the chatbot intent classification model’s bias on pairs of classes. Finally, we also present two strategies to mitigate this bias using example biased pairs.
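The abstract does not define directional pairwise class confusion bias, so the following is only a sketch under one plausible reading: for a pair of classes, compare how often true examples of the first are predicted as the second against how often the reverse happens. The function name, the rate-difference formulation, and the intent labels are assumptions made for illustration, not the authors' metric.

```python
from collections import Counter


def directional_confusion(y_true, y_pred, a, b):
    """Return (rate of a predicted as b, rate of b predicted as a, difference).

    Each rate is normalised by the number of true examples of the source
    class. A large positive difference suggests the model favors b over a.
    """
    support = Counter(y_true)
    a_as_b = sum(1 for t, p in zip(y_true, y_pred) if t == a and p == b)
    b_as_a = sum(1 for t, p in zip(y_true, y_pred) if t == b and p == a)
    rate_ab = a_as_b / support[a] if support[a] else 0.0
    rate_ba = b_as_a / support[b] if support[b] else 0.0
    return rate_ab, rate_ba, rate_ab - rate_ba


# Hypothetical chatbot intents and predictions, invented for this example.
y_true = ["refund"] * 4 + ["cancel"] * 4
y_pred = ["cancel", "cancel", "refund", "refund"] + ["cancel"] * 4

rate_ab, rate_ba, diff = directional_confusion(y_true, y_pred, "refund", "cancel")
print(rate_ab, rate_ba, diff)
```

In this toy data, half of the true "refund" examples are predicted as "cancel" while no "cancel" example is predicted as "refund", so the confusion runs in one direction only. A symmetric confusion count would hide that asymmetry.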

DigitalCommons@Kennesaw State University

Mitigating Bias in Conversations: A Hate Speech Classifier and Debiaser with Prompts

Author: Ding Chen
Pandya Deval
Raza Shaina
Publication date: 14/07/2023

Discriminatory language and biases are often present in hate speech during conversations, and they usually have negative impacts on targeted groups, such as those defined by race, gender, and religion. To tackle this issue, we propose a two-step approach: first, detecting hate speech using a classifier, and then applying a debiasing component that generates less biased or unbiased alternatives through prompts. We evaluated our approach on a benchmark dataset and observed a reduction in negativity due to hate speech comments. The proposed method contributes to the ongoing efforts to reduce biases in online discourse and promote a more inclusive and fair environment for communication.

Comment: Accepted KDD - Data Science for Social Goo

arXiv.org e-Print Archive