Bipol: A Novel Multi-Axes Bias Evaluation Metric with Explainability for NLP

Adewumi, Tosin; Alkhaled, Lama; Sabry, Sana Sabah

Publication date: 8 April 2023

Abstract

We introduce bipol, a new metric with explainability, for estimating social bias in text data. Harmful bias is prevalent in many online sources of data that are used to train machine learning (ML) models. As a step towards addressing this challenge, we create a novel metric that involves a two-step process: corpus-level evaluation based on model classification and sentence-level evaluation based on (sensitive) term frequency (TF). After creating new models to detect bias along multiple axes using state-of-the-art (SotA) architectures, we evaluate two popular NLP datasets (COPA and SQuAD). As an additional contribution, we create a large dataset (with almost 2 million labelled samples) for training models in bias detection and make it publicly available. We also make our code public.

Comment: 12 pages, 4 images

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2304.04029

Last time updated on 14/04/2023