5,411 research outputs found
Mitigating Gender Bias in Machine Learning Data Sets
Artificial Intelligence has the capacity to amplify and perpetuate societal
biases and presents profound ethical implications for society. Gender bias has
been identified in the context of employment advertising and recruitment tools,
due to their reliance on underlying language processing and recommendation
algorithms. Attempts to address such issues have involved testing learned
associations, integrating concepts of fairness into machine learning and
performing more rigorous analysis of training data. Mitigating bias when
algorithms are trained on textual data is particularly challenging given the
complex way gender ideology is embedded in language. This paper proposes a
framework for the identification of gender bias in training data for machine
learning. The work draws upon gender theory and sociolinguistics to
systematically indicate levels of bias in textual training data and associated
neural word embedding models, thus highlighting pathways for both removing bias
from training data and critically assessing its impact.
Comment: 10 pages, 5 figures, 5 tables. Presented at the Bias2020 workshop (as
part of the ECIR Conference) - http://bias.disim.univaq.i
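The abstract does not include code, but the kind of embedding-level bias check it describes is often implemented by projecting word vectors onto a gender direction. A minimal sketch under that assumption, using made-up toy vectors and a he/she direction (none of this is the paper's actual data or method):

```python
import numpy as np

# Toy 4-dimensional embeddings; a real check would load trained vectors
# (e.g. word2vec or GloVe). All values here are invented for illustration.
emb = {
    "he":       np.array([ 0.8,  0.1,  0.0,  0.2]),
    "she":      np.array([-0.7,  0.2,  0.1,  0.2]),
    "engineer": np.array([ 0.5,  0.3, -0.1,  0.4]),
    "nurse":    np.array([-0.4,  0.4,  0.2,  0.3]),
}

def gender_score(word: str) -> float:
    """Cosine similarity between a word vector and the she->he direction.

    Positive scores lean toward 'he', negative toward 'she'; scores near
    zero suggest the word is not strongly gendered in this embedding.
    """
    direction = emb["he"] - emb["she"]
    v = emb[word]
    return float(np.dot(v, direction) /
                 (np.linalg.norm(v) * np.linalg.norm(direction)))

for word in ("engineer", "nurse"):
    print(f"{word}: {gender_score(word):+.3f}")
```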
Directional Pairwise Class Confusion Bias and Its Mitigation
Recent advances in Natural Language Processing have led to powerful and sophisticated models like BERT (Bidirectional Encoder Representations from Transformers) that nonetheless exhibit bias. These models are mostly trained on text corpora that deviate in important ways from the text encountered by a chatbot in a problem-specific context. While much past research has focused on measuring and mitigating bias with respect to protected attributes (stereotyping on gender, race, ethnicity, etc.), there is a lack of research on model bias with respect to classification labels. We investigate whether a classification model strongly favors one class over another. We introduce a bias evaluation method called directional pairwise class confusion bias that highlights a chatbot intent classification model’s bias on pairs of classes. Finally, we also present two strategies to mitigate this bias using examples drawn from the biased pairs.
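The abstract does not give the measure's exact formula, but a directional pairwise measure of this kind can be read off a confusion matrix: for a class pair (i, j), compare how often i is misclassified as j with how often j is misclassified as i. A rough sketch under that assumption (the function name and normalization below are mine, not the paper's definition):

```python
import numpy as np

def directional_confusion_bias(cm: np.ndarray, i: int, j: int) -> float:
    """Asymmetry of confusion between classes i and j.

    cm[a, b] counts examples of true class a predicted as class b.
    Returns a value in [-1, 1]: positive means i is confused as j
    more often than the reverse; 0 means symmetric confusion.
    This normalization is an assumption, not the paper's definition.
    """
    i_as_j = cm[i, j] / max(cm[i].sum(), 1)   # P(pred = j | true = i)
    j_as_i = cm[j, i] / max(cm[j].sum(), 1)   # P(pred = i | true = j)
    denom = i_as_j + j_as_i
    return 0.0 if denom == 0 else (i_as_j - j_as_i) / denom

# Toy 3-class confusion matrix: class 0 is often predicted as class 1,
# but rarely the other way around.
cm = np.array([[80, 15,  5],
               [ 2, 90,  8],
               [ 4,  6, 90]])
print(directional_confusion_bias(cm, 0, 1))  # clearly positive
```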
Mitigating Bias in Conversations: A Hate Speech Classifier and Debiaser with Prompts
Discriminatory language and biases are often present in hate speech during
conversations, which usually lead to negative impacts on targeted groups such
as those based on race, gender, and religion. To tackle this issue, we propose
an approach that involves a two-step process: first, detecting hate speech
using a classifier, and then utilizing a debiasing component that generates
less biased or unbiased alternatives through prompts. We evaluated our approach
on a benchmark dataset and observed a reduction in the negativity of hate
speech comments. The proposed method contributes to the ongoing efforts to
reduce biases in online discourse and promote a more inclusive and fair
environment for communication.
Comment: Accepted at KDD - Data Science for Social Good
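Only the abstract is given, so the implementation details below are assumptions: a minimal sketch of the two-step detect-then-rewrite pipeline, where `is_hate_speech` and `rewrite_with_prompt` are hypothetical stand-ins for the paper's classifier and prompt-based debiaser (the lexicon and prompt template are also invented):

```python
# Hypothetical two-step pipeline: classify, then rewrite flagged messages.
# Both components are stubs; a real system would plug in a trained hate
# speech classifier and a prompted generative language model.

HATE_TERMS = {"hateword1", "hateword2"}  # placeholder lexicon, not real data

def is_hate_speech(text: str) -> bool:
    """Stub classifier: flags text containing placeholder hate terms.

    The paper uses a trained classifier; this keyword check is a stand-in.
    """
    return any(term in text.lower() for term in HATE_TERMS)

def rewrite_with_prompt(text: str) -> str:
    """Stub debiaser: builds the kind of prompt the paper describes.

    A real system would send this prompt to a language model and return
    its output; here we just return the prompt itself.
    """
    return ("Rewrite the following message so it expresses the same point "
            f"without biased or hateful language:\n{text}")

def moderate(messages):
    """Pass benign messages through; route flagged ones to the debiaser."""
    for msg in messages:
        yield rewrite_with_prompt(msg) if is_hate_speech(msg) else msg

for out in moderate(["hello there", "you hateword1!"]):
    print(out)
```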