Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis
Sentiment analysis (SA) systems are widely deployed in many of the world's languages, and there is well-documented evidence of demographic bias in these systems. In languages beyond English, scarcer training data is often supplemented with transfer learning using pre-trained models, including multilingual models trained on other languages. In some cases, even supervision data comes from other languages. Does cross-lingual transfer also import new biases? To answer this question, we use counterfactual evaluation to test whether gender or racial biases are imported when using cross-lingual transfer, compared to a monolingual transfer setting. Across five languages, we find that systems using cross-lingual transfer usually become more biased than their monolingual counterparts. We also find racial biases to be much more prevalent than gender biases. To spur further research on this topic, we release the sentiment models we used for this study and the intermediate checkpoints throughout training, yielding 1,525 distinct models; we also release our evaluation code.
Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages
Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but important question that an engineer might need to answer when deploying a system: What biases do systems import from pre-trained models when compared to a baseline with no pre-training? Our evaluation corpus, by virtue of being counterfactual, not only reveals which models have less bias, but also pinpoints changes in model bias behaviour, which enables more targeted mitigation strategies. We release our code and evaluation corpora to facilitate future research.
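Counterfactual evaluation as described in these abstracts can be sketched as follows. This is a minimal illustration, not the released corpus or models: the template, the name lists, and the toy scoring function are all assumptions made for the example. The core idea is to build sentence pairs that differ only in a demographic term and measure how much the model's sentiment score changes.

```python
# Minimal sketch of counterfactual bias evaluation for sentiment analysis.
# Templates and name lists here are illustrative assumptions; the released
# corpora use curated, language-specific materials.

def counterfactual_pairs(templates, group_a, group_b):
    """Yield sentence pairs that differ only in the demographic term."""
    for t in templates:
        for a, b in zip(group_a, group_b):
            yield t.format(name=a), t.format(name=b)

def bias_gap(score, templates, group_a, group_b):
    """Mean absolute sentiment difference across counterfactual pairs.

    `score` is any callable mapping a sentence to a sentiment value;
    a gap near 0 means the model treats both groups alike."""
    pairs = list(counterfactual_pairs(templates, group_a, group_b))
    return sum(abs(score(x) - score(y)) for x, y in pairs) / len(pairs)

# Toy stand-in for a real sentiment model (assumption, for illustration only):
def toy_score(sentence):
    return 0.9 if "Anna" in sentence else 0.7

templates = ["{name} delivered the presentation today."]
print(round(bias_gap(toy_score, templates, ["Anna"], ["Omar"]), 3))
```

Because the same `bias_gap` can be computed before and after swapping in a different pre-trained model, this setup directly supports the comparison the abstracts describe: which biases a system imports from pre-training or from cross-lingual transfer.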
Fairness in transfer learning for natural language processing
Natural Language Processing (NLP) systems have come to permeate so many areas
of daily life that it is difficult to live a day without having one or many experiences
mediated by an NLP system. These systems bring with them many promises: more
accessible information in more languages, real-time content moderation, more data-driven
decision making, intuitive access to information via Question Answering and
chat interfaces. But there is a dark side to these promises, for the past decade of
research has shown that NLP systems can contain social biases and deploying them can
incur serious social costs. Each of these promises has been found to have unintended
consequences: racially charged errors and rampant gender stereotyping in language
translation, censorship of minority voices and dialects, Human Resource systems that
discriminate based on demographic data, a proliferation of toxic generated text and
misinformation, and many subtler issues.
Yet despite these consequences, and the proliferation of bias research attempting to
correct them, NLP systems have not improved very much. There are a few reasons
for this. First, measuring bias is difficult: there are no standardised methods of
measurement, and much research relies on one-off methods that are often insufficiently
careful and inadequately tested. Thus many works have contradictory results that cannot
be reconciled because of minor differences or assumptions in their metrics. Without
thorough testing, these metrics can even mislead and give the illusion of progress.
Second, much research adopts an overly simplistic view of the causes and mediators
of bias in a system. NLP systems have multiple components and stages of training,
and many works test fairness at only one stage. They do not study how different parts
of the system interact, and how fairness changes during this process. So it is unclear
whether these isolated results will hold in the full complex system. Here, we address
both of these shortcomings. We conduct a detailed analysis of fairness metrics applied
to upstream language models (models that will be used in a downstream task in transfer
learning). We find that a) the most commonly used upstream fairness metric is not
predictive of downstream fairness and so should not be used, but that b) information-theoretic
probing is a good alternative to existing fairness metrics, as it is both predictive
of downstream bias and robust to different modelling choices. We then
use our findings to track how unfairness, having entered a system, persists and travels
throughout it. We track how fairness issues travel between tasks (from language modelling
to classification) in monolingual transfer learning, and between languages, in
multilingual transfer learning. We find that multilingual transfer learning often exacerbates
fairness problems and should be used with care, whereas monolingual transfer
learning generally improves fairness. Finally, we track how fairness travels between
source documents and retrieved answers to questions, in fact-based generative systems.
Here we find that, though retrieval systems strongly represent demographic data such
as gender, bias in retrieval question answering benchmarks does not come from the
model representations, but from the queries or the corpora. We reach all of our findings
only by looking at the entire transfer learning system as a whole, and we hope
that this encourages other researchers to do the same. We hope that our results can
guide future fairness research to be more consistent between works, better predictive
of real world fairness outcomes, and better able to prevent unfairness from propagating
between different parts of a system.
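The probing idea underlying the thesis can be sketched in miniature. Real information-theoretic probes measure the description length of a trained classifier's predictions; the toy nearest-centroid probe below (an assumption made purely for illustration, not the thesis's method) shows the shared core: asking how well a demographic attribute can be decoded from a model's representations.

```python
# Toy sketch of probing representations for a demographic attribute.
# A high decoding accuracy means the attribute is strongly encoded in the
# representations; information-theoretic probes refine this by measuring
# description length rather than raw accuracy.
import random

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def probe_accuracy(reps, labels):
    """Fit one centroid per label, then decode each point by nearest centroid."""
    by_label = {}
    for r, y in zip(reps, labels):
        by_label.setdefault(y, []).append(r)
    cents = {y: centroid(vs) for y, vs in by_label.items()}

    def sq_dist(a, b):
        return sum((x - z) ** 2 for x, z in zip(a, b))

    correct = sum(
        min(cents, key=lambda y: sq_dist(r, cents[y])) == y_true
        for r, y_true in zip(reps, labels)
    )
    return correct / len(reps)

# Synthetic representations in which the attribute is clearly encoded:
random.seed(0)
reps = [[1.0 + random.gauss(0, 0.1)] for _ in range(20)] + \
       [[-1.0 + random.gauss(0, 0.1)] for _ in range(20)]
labels = ["f"] * 20 + ["m"] * 20
print(probe_accuracy(reps, labels))  # high accuracy: attribute is decodable
```

In the thesis's framing, a probe like this is run on upstream representations and its score is compared against measured downstream bias, which is how the predictiveness of probing can be assessed across modelling choices.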