Identifying Gendered Language

Abstract

Gendered language refers to the use of words that indicate the gender of an individual. It can be explicit, where the gender is directly indicated by the specific words used (e.g., mother, she, man), or it can be implicit, where societal roles and behaviors convey a person's gender, for example, the expectation that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The presence of gendered language in natural language processing (NLP) systems can reinforce gender stereotypes and bias. Our work introduces an approach to creating gendered language datasets using ChatGPT. These datasets are designed to support data-driven methods for identifying gender stereotypes and mitigating gender bias. The approach focuses on generating implicit gendered language that captures and reflects stereotypical characteristics or traits associated with a specific gender. This is achieved by constructing prompts for ChatGPT that incorporate gender-coded words sourced from gender-coded lexicons. Evaluation of the generated datasets shows that they contain good examples of English-language gendered sentences that can be categorized as either consistent with or contradictory to gender stereotypes. Additionally, the generated data exhibits a strong gender bias.
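
The following is a minimal sketch of the prompt-construction idea described above, assuming a small illustrative lexicon and prompt wording of our own; it is not the authors' released code, and the word lists and function names are hypothetical.

```python
# Sketch: building a ChatGPT prompt that embeds gender-coded words
# drawn from a gender-coded lexicon, so the model produces implicit
# gendered sentences (no explicit gender markers required).
# Lexicon samples and prompt wording below are illustrative assumptions.

import random

# Small illustrative samples of communal (feminine-coded) and agentic
# (masculine-coded) trait words; a real run would load a full lexicon.
FEMININE_CODED = ["affectionate", "caring", "gentle", "supportive"]
MASCULINE_CODED = ["assertive", "competitive", "decisive", "ambitious"]

def build_prompt(gender: str, n_sentences: int = 5, seed: int = 0) -> str:
    """Compose a generation prompt that pairs a target gender with
    stereotypically coded trait words sampled from the lexicon."""
    rng = random.Random(seed)
    lexicon = FEMININE_CODED if gender == "female" else MASCULINE_CODED
    traits = ", ".join(rng.sample(lexicon, k=3))
    return (
        f"Write {n_sentences} short English sentences describing a {gender} "
        f"person who is {traits}. Do not use names or pronouns."
    )

if __name__ == "__main__":
    # The resulting prompt would then be sent to ChatGPT and the returned
    # sentences collected into the gendered-language dataset.
    print(build_prompt("female"))
```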
