7 research outputs found

    Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis

    Sentiment analysis (SA) systems are widely deployed in many of the world's languages, and there is well-documented evidence of demographic bias in these systems. In languages beyond English, scarcer training data is often supplemented with transfer learning using pre-trained models, including multilingual models trained on other languages. In some cases, even supervision data comes from other languages. Does cross-lingual transfer also import new biases? To answer this question, we use counterfactual evaluation to test whether gender or racial biases are imported when using cross-lingual transfer, compared to a monolingual transfer setting. Across five languages, we find that systems using cross-lingual transfer usually become more biased than their monolingual counterparts. We also find racial biases to be much more prevalent than gender biases. To spur further research on this topic, we release the sentiment models we used for this study and the intermediate checkpoints throughout training, yielding 1,525 distinct models; we also release our evaluation code.
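
    As a rough illustration of the counterfactual evaluation described above, the sketch below scores identical template sentences that differ only in a name associated with a demographic group and compares the average sentiment. The templates, name lists, and use of a generic Hugging Face sentiment-analysis pipeline are illustrative assumptions, not the paper's released models or evaluation corpus.

```python
# Minimal sketch of counterfactual bias evaluation for a sentiment model.
# Everything below (templates, name lists, the default pipeline model) is a
# stand-in for illustration, not the paper's actual setup.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # any sentiment model would do here

TEMPLATES = ["{name} is a doctor.", "I had lunch with {name} yesterday."]
GROUP_A = ["Anna", "Maria"]    # hypothetical names standing in for one group
GROUP_B = ["Ahmed", "Jamal"]   # hypothetical names standing in for another

def mean_positive_score(names):
    """Average probability of the POSITIVE label over all template/name pairs."""
    scores = []
    for template in TEMPLATES:
        for name in names:
            result = classifier(template.format(name=name))[0]
            prob_pos = result["score"] if result["label"] == "POSITIVE" else 1 - result["score"]
            scores.append(prob_pos)
    return sum(scores) / len(scores)

# Because the sentence pairs are identical except for the name, any difference
# in average sentiment can be attributed to the demographic signal in the name.
gap = mean_positive_score(GROUP_A) - mean_positive_score(GROUP_B)
print(f"Counterfactual sentiment gap: {gap:+.3f}")
```

    The same gap can be computed for a monolingual model and for a cross-lingually transferred model trained on the same task, which is the kind of comparison the abstract describes for deciding whether transfer has imported additional bias.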

    Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages

    Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but important question that an engineer might need to answer when deploying a system: What biases do systems import from pre-trained models when compared to a baseline with no pre-training? Our evaluation corpus, by virtue of being counterfactual, not only reveals which models have less bias, but also pinpoints changes in model bias behaviour, which enables more targeted mitigation strategies. We release our code and evaluation corpora to facilitate future research.

    Fairness in transfer learning for natural language processing

    Natural Language Processing (NLP) systems have come to permeate so many areas of daily life that it is difficult to live a day without having one or many experiences mediated by an NLP system. These systems bring with them many promises: more accessible information in more languages, real-time content moderation, more data-driven decision making, and intuitive access to information via Question Answering and chat interfaces. But there is a dark side to these promises, for the past decade of research has shown that NLP systems can contain social biases and deploying them can incur serious social costs. Each of these promises has been found to have unintended consequences: racially charged errors and rampant gender stereotyping in language translation, censorship of minority voices and dialects, Human Resource systems that discriminate based on demographic data, a proliferation of toxic generated text and misinformation, and many subtler issues. Yet despite these consequences, and the proliferation of bias research attempting to correct them, NLP systems have not improved very much. There are a few reasons for this. First, measuring bias is difficult; there are no standardised methods of measurement, and much research relies on one-off methods that are often insufficiently careful and not thoroughly tested. Thus many works have contradictory results that cannot be reconciled because of minor differences or assumptions in their metrics. Without thorough testing, these metrics can even mislead and give the illusion of progress. Second, much research adopts an overly simplistic view of the causes and mediators of bias in a system. NLP systems have multiple components and stages of training, and many works test fairness at only one stage. They do not study how different parts of the system interact, or how fairness changes during this process, so it is unclear whether these isolated results will hold in the full, complex system. Here, we address both of these shortcomings. We conduct a detailed analysis of fairness metrics applied to upstream language models (models that will be used in a downstream task in transfer learning). We find that a) the most commonly used upstream fairness metric is not predictive of downstream fairness and should not be used, but that b) information-theoretic probing is a good alternative to these existing fairness metrics, as it is both predictive of downstream bias and robust to different modelling choices. We then use our findings to track how unfairness, having entered a system, persists and travels throughout it. We track how fairness issues travel between tasks (from language modelling to classification) in monolingual transfer learning, and between languages in multilingual transfer learning. We find that multilingual transfer learning often exacerbates fairness problems and should be used with care, whereas monolingual transfer learning generally improves fairness. Finally, we track how fairness travels between source documents and retrieved answers to questions in fact-based generative systems. Here we find that, though retrieval systems strongly represent demographic data such as gender, bias in retrieval question answering benchmarks does not come from the model representations, but from the queries or the corpora. We reach all of our findings only by looking at the entire transfer learning system as a whole, and we hope that this encourages other researchers to do the same. We hope that our results can guide future fairness research to be more consistent between works, better predictive of real-world fairness outcomes, and better able to prevent unfairness from propagating between different parts of a system.
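
    As a rough sketch of the information-theoretic probing mentioned above, the snippet below trains a simple probe on frozen embeddings to predict a protected attribute and reports the extractable information as the label entropy minus the probe's cross-entropy. The embeddings, labels, and probe are hypothetical placeholders, and the thesis's actual probing methodology is more involved than this simplification.

```python
# Simplified sketch of information-theoretic probing: how many bits of a
# protected attribute can a simple probe recover from frozen representations?
# The data below is random and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def probe_information(embeddings, labels, seed=0):
    """Rough estimate in bits: label entropy minus the probe's cross-entropy."""
    X_train, X_test, y_train, y_test = train_test_split(
        embeddings, labels, test_size=0.3, random_state=seed)
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    cross_entropy = log_loss(y_test, probe.predict_proba(X_test)) / np.log(2)
    p = np.bincount(y_test) / len(y_test)
    label_entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    # Higher values mean the representation exposes more of the attribute.
    return max(label_entropy - cross_entropy, 0.0)

rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 64))        # stand-in for frozen model embeddings
attr = rng.integers(0, 2, size=500)     # stand-in protected-attribute labels
print(f"Estimated extractable information: {probe_information(emb, attr):.3f} bits")
```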