72 research outputs found
Sociolinguistically Driven Approaches for Just Natural Language Processing
Natural language processing (NLP) systems are now ubiquitous. Yet the benefits of these language technologies do not accrue evenly to all users, and indeed they can be harmful; NLP systems reproduce stereotypes, prevent speakers of non-standard language varieties from participating fully in public discourse, and re-inscribe historical patterns of linguistic stigmatization and discrimination. How harms arise in NLP systems, and who is harmed by them, can only be understood at the intersection of work on NLP, fairness and justice in machine learning, and the relationships between language and social justice. In this thesis, we propose to address two questions at this intersection: (i) How can we conceptualize harms arising from NLP systems? and (ii) How can we quantify such harms?
We propose the following contributions. First, we contribute a model for collecting the first large dataset of African American Language (AAL)-like social media text. We use the dataset to quantify the performance of two types of NLP systems, identifying disparities in model performance between Mainstream U.S. English (MUSE)- and AAL-like text. Turning to the landscape of bias in NLP more broadly, we then provide a critical survey of the emerging literature on bias in NLP and identify its limitations. Drawing on work across sociology, sociolinguistics, linguistic anthropology, social psychology, and education, we provide an account of the relationships between language and injustice, propose a taxonomy of harms arising from NLP systems grounded in those relationships, and propose a set of guiding research questions for work on bias in NLP. Finally, we adapt the measurement modeling framework from the quantitative social sciences to effectively evaluate approaches for quantifying bias in NLP systems. We conclude with a discussion of recent work on bias through the lens of style in NLP, raising a set of normative questions for future work.
“Organically German”?: Changing ideologies of national belonging
This chapter examines variation in the situated meanings of the term Biodeutsche(r), which has emerged relatively recently as a way to refer to people who are German by descent (i.e., not of migration background). The analysis shows that use of this term reflects competing discourses about the role of ethnicity in national belonging in Germany. While the origin and many uses of the term challenge the validity of ethnicity as a basis for legitimacy in German society, some of the data suggest that it has also been adopted as a supposedly neutral term to describe a segment of the German population, which supports an ethnonational ideology.