A general framework for implicit and explicit debiasing of distributional word vector spaces
Distributional word vectors have recently been shown to encode many human biases, most notably gender and racial biases, and models for attenuating such biases have consequently been proposed. However, existing models and studies (1) operate on under-specified and mutually differing bias definitions, (2) are tailored for a particular bias (e.g., gender bias), and (3) have been evaluated inconsistently and non-rigorously. In this work, we introduce a general framework for debiasing word embeddings. We operationalize the definition of a bias by discerning two types of bias specification: explicit and implicit. We then propose three debiasing models that operate on explicit or implicit bias specifications and that can be composed towards more robust debiasing. Finally, we devise a full-fledged evaluation framework in which we couple existing bias metrics with newly proposed ones. Experimental findings across three embedding methods suggest that the proposed debiasing models are robust and widely applicable: they often completely remove the bias both implicitly and explicitly without degradation of semantic information encoded in any of the input distributional spaces. Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.
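[Editor's note: the explicit/implicit distinction is easiest to see in code. Below is a minimal numpy sketch of one projection-based debiasing step operating on an explicit specification (two attribute word sets). It illustrates the general idea, not the paper's exact models; all function names and word sets are invented for illustration.]

```python
import numpy as np

def bias_direction(emb, set_a, set_b):
    """Bias direction: difference of the mean vectors of two attribute word sets."""
    mu_a = np.mean([emb[w] for w in set_a], axis=0)
    mu_b = np.mean([emb[w] for w in set_b], axis=0)
    d = mu_a - mu_b
    return d / np.linalg.norm(d)

def neutralize(emb, direction, words):
    """Remove each word vector's component along the bias direction."""
    for w in words:
        emb[w] = emb[w] - (emb[w] @ direction) * direction
    return emb

# emb: dict mapping word -> 1-D np.ndarray (e.g., loaded from GloVe).
# The explicit specification and target words below are illustrative:
# spec = (["he", "man", "his"], ["she", "woman", "her"])
# emb = neutralize(emb, bias_direction(emb, *spec), ["doctor", "nurse", "engineer"])
```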
On Measuring and Mitigating Biased Inferences of Word Embeddings
Word embeddings carry stereotypical connotations from the text they are trained on, which can lead to invalid inferences in downstream models that rely on them. We use this observation to design a mechanism for measuring stereotypes using the task of natural language inference. We demonstrate a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe). Further, we show that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings (ELMo, BERT).
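[Editor's note: a rough sketch of the measurement idea: pair a premise containing a neutral role noun with gendered hypotheses and check whether an NLI model stays neutral. The `transformers` pipeline and the `roberta-large-mnli` checkpoint are illustrative assumptions, not the authors' exact setup.]

```python
# Sketch only: probing an NLI model for gender-biased inferences.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli", top_k=None)

premise = "The doctor bought a bagel."
for hypothesis in ("The man bought a bagel.", "The woman bought a bagel."):
    scores = nli({"text": premise, "text_pair": hypothesis})
    # Neither hypothesis follows from the premise, so an unbiased model
    # should assign (equally) high probability to NEUTRAL for both;
    # asymmetric mass on ENTAILMENT signals a stereotypical inference.
    print(hypothesis, scores)
```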
DebIE: A Platform for Implicit and Explicit Debiasing of Word Embedding Spaces
Recent research efforts in NLP have demonstrated that distributional word vector spaces often encode stereotypical human biases, such as racism and sexism. With word representations ubiquitously used in NLP models and pipelines, this raises ethical issues and jeopardizes the fairness of language technologies. While there exists a large body of work on bias measures and debiasing methods, to date there is no platform that unifies these research efforts and makes bias measuring and debiasing of representation spaces widely accessible. In this work, we present DebIE, the first integrated platform for (1) measuring and (2) mitigating bias in word embeddings. Given (i) an embedding space (users can choose between the predefined spaces or upload their own) and (ii) a bias specification (users can choose between existing bias specifications or create their own), DebIE can (1) compute several measures of implicit and explicit bias and (2) modify the embedding space by executing two (mutually composable) debiasing models. DebIE's functionality can be accessed through four different interfaces: (a) a web application, (b) a desktop application, (c) a RESTful API, and (d) a command-line application. DebIE is available at: debie.informatik.uni-mannheim.de. (Accepted as an EACL 2021 demo paper.)
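[Editor's note: the abstract stresses that the two debiasing models are mutually composable. The sketch below illustrates what composability means in this setting; the models are generic stand-ins, not DebIE's actual implementations. Each model maps an embedding matrix to a debiased matrix of the same shape, so models chain freely.]

```python
import numpy as np
from functools import reduce

def compose(*models):
    """Chain debiasing models left to right: compose(f, g)(E) == g(f(E))."""
    return lambda E: reduce(lambda out, model: model(out), models, E)

def remove_direction(d):
    """Debiasing model that projects every row of E off one bias direction d."""
    d = d / np.linalg.norm(d)
    return lambda E: E - np.outer(E @ d, d)

# E: (vocab_size, dim) embedding matrix; the two directions would come from
# separate bias specifications (e.g., gender and race):
# debias = compose(remove_direction(gender_dir), remove_direction(race_dir))
# E_clean = debias(E)
```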
An Empirical Study on the Fairness of Pre-trained Word Embeddings
Pre-trained word embedding models are easily distributed and applied, as they spare users the effort of training models themselves. With widely distributed models, it is important to ensure that they do not exhibit undesired behaviour, such as biases against population groups. For this purpose, we carry out an empirical study evaluating the bias of 15 publicly available, pre-trained word embedding models based on three training algorithms (GloVe, word2vec, and fastText) with regard to four bias metrics (WEAT, SemBias, Direct Bias, and ECT). The choice of word embedding models and bias metrics is motivated by a literature survey of 37 publications that quantified bias in pre-trained word embeddings. Our results indicate that fastText is the least biased model (in 8 out of 12 cases) and that smaller vector lengths lead to higher bias …
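[Editor's note: of the four metrics, WEAT is the most widely reported. Below is a minimal numpy sketch of its effect size (Cohen's d over differential association scores), following the standard Caliskan et al. definition; all inputs are assumed to be lists of word vectors.]

```python
import numpy as np

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def assoc(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to set B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT effect size: Cohen's d between the association scores of the
    two target sets X and Y (e.g., career vs. family words)."""
    sx = [assoc(x, A, B) for x in X]
    sy = [assoc(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1)
```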