Identifying Gendered Language

Abstract

Gendered language refers to the use of words that indicate the gender of an individual. It can be explicit, where the gender is directly indicated by the specific words used (e.g., mother, she, man), or it can be implicit, where societal roles and behaviors convey a person's gender, for example, the expectation that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The presence of gendered language in natural language processing (NLP) systems can reinforce gender stereotypes and bias. Our work introduces an approach to creating gendered language datasets using ChatGPT. These datasets are designed to support data-driven methods for identifying gender stereotypes and mitigating gender bias. The approach focuses on generating implicit gendered language that captures and reflects stereotypical characteristics or traits associated with a specific gender. This is achieved by constructing prompts for ChatGPT that incorporate gender-coded words sourced from gender-coded lexicons. Evaluation of the generated datasets shows that they contain good examples of English-language gendered sentences that can be categorized as either consistent with or contradictory to gender stereotypes. Additionally, the generated data exhibits a strong gender bias.
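
The following is a minimal sketch of the prompt-construction idea described above, assuming a small illustrative lexicon and prompt wording of our own; it is not the authors' released code, and the word lists and function names are hypothetical.

```python
# Sketch: building a ChatGPT prompt that embeds gender-coded words
# drawn from a gender-coded lexicon, so the model produces implicit
# gendered sentences (no explicit gender markers required).
# Lexicon samples and prompt wording below are illustrative assumptions.

import random

# Small illustrative samples of communal (feminine-coded) and agentic
# (masculine-coded) trait words; a real run would load a full lexicon.
FEMININE_CODED = ["affectionate", "caring", "gentle", "supportive"]
MASCULINE_CODED = ["assertive", "competitive", "decisive", "ambitious"]

def build_prompt(gender: str, n_sentences: int = 5, seed: int = 0) -> str:
    """Compose a generation prompt that pairs a target gender with
    stereotypically coded trait words sampled from the lexicon."""
    rng = random.Random(seed)
    lexicon = FEMININE_CODED if gender == "female" else MASCULINE_CODED
    traits = ", ".join(rng.sample(lexicon, k=3))
    return (
        f"Write {n_sentences} short English sentences describing a {gender} "
        f"person who is {traits}. Do not use names or pronouns."
    )

if __name__ == "__main__":
    # The resulting prompt would then be sent to ChatGPT and the returned
    # sentences collected into the gendered-language dataset.
    print(build_prompt("female"))
```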
