
    Do Neural Ranking Models Intensify Gender Bias?

    Concerns regarding the footprint of societal biases in information retrieval (IR) systems have been raised in several previous studies. In this work, we examine various recent IR models from the perspective of the degree of gender bias in their retrieval results. To this end, we first provide a bias measurement framework which includes two metrics to quantify the degree of the unbalanced presence of gender-related concepts in a given IR model's ranking list. To examine IR models by means of the framework, we create a dataset of non-gendered queries, selected by human annotators. Applying these queries to the MS MARCO Passage retrieval collection, we then measure the gender bias of a BM25 model and several recent neural ranking models. The results show that while all models are strongly biased towards males, the neural models, and in particular the ones based on contextualized embedding models, significantly intensify gender bias. Our experiments also show an overall increase in the gender bias of neural models when they exploit transfer learning, namely when they use (already biased) pre-trained embeddings. (Comment: In Proceedings of ACM SIGIR 2020)
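    The abstract's notion of an "unbalanced presence of gender-related concepts" in a ranking can be made concrete with a small sketch. The term lists, cutoff, and rank discounting below are illustrative assumptions, not the metrics defined in the paper.

        # Minimal sketch, assuming simple gendered term lists and log-rank
        # discounting; NOT the paper's actual bias metrics.
        import math

        MALE_TERMS = {"he", "him", "his", "man", "men", "male", "father", "son"}
        FEMALE_TERMS = {"she", "her", "hers", "woman", "women", "female", "mother", "daughter"}

        def gender_imbalance(ranked_docs, cutoff=10):
            """Rank-discounted (male - female) term-count difference over the top results.
            Positive values suggest a male-leaning ranking, negative a female-leaning one."""
            score = 0.0
            for rank, doc in enumerate(ranked_docs[:cutoff], start=1):
                tokens = doc.lower().split()
                male = sum(t in MALE_TERMS for t in tokens)
                female = sum(t in FEMALE_TERMS for t in tokens)
                score += (male - female) / math.log2(rank + 1)  # top ranks weigh more
            return score

        if __name__ == "__main__":
            docs = ["he is a renowned engineer and his team built the system",
                    "she leads the research group and her work is widely cited"]
            print(gender_imbalance(docs))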

    Evaluation metrics for measuring bias in search engine results

    Search engines decide what we see for a given search query. Since many people are exposed to information through search engines, it is fair to expect that search engines are neutral. However, search engine results do not necessarily cover all the viewpoints of a search query topic, and they can be biased towards a specific view since search engine results are returned based on relevance, which is calculated using many features and sophisticated algorithms where search neutrality is not necessarily the focal point. Therefore, it is important to evaluate search engine results with respect to bias. In this work we propose novel web search bias evaluation measures which take into account rank and relevance. We also propose a framework to evaluate web search bias using the proposed measures and test our framework on two popular search engines based on 57 controversial query topics such as abortion, medical marijuana, and gay marriage. We measure the stance bias (in support or against), as well as the ideological bias (conservative or liberal). We observe that the stance does not necessarily correlate with the ideological leaning, e.g. a positive stance on abortion indicates a liberal leaning, but a positive stance on the Cuba embargo indicates a conservative leaning. Our experiments show that neither of the search engines suffers from stance bias. However, both search engines suffer from ideological bias, each favouring one ideological leaning over the other, which is more significant from the perspective of polarisation in our society.
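    A bias measure that "takes into account rank and relevance" can be sketched as a rank-discounted, relevance-weighted average of stance labels. The label scheme, weighting, and normalisation below are assumptions for illustration, not the measures proposed in this work.

        # Minimal sketch, assuming stance labels in {-1, 0, +1} and graded relevance;
        # NOT the evaluation measures proposed in the work above.
        import math

        def stance_bias(stances, relevances):
            """stances: +1 (in support), -1 (against), 0 (neutral) per result, by rank.
            relevances: graded relevance per result (0 = not relevant).
            Returns a value in [-1, 1]; 0 means a balanced result list."""
            num, den = 0.0, 0.0
            for rank, (s, rel) in enumerate(zip(stances, relevances), start=1):
                w = rel / math.log2(rank + 1)  # relevant, highly ranked results count more
                num += w * s
                den += w
            return num / den if den else 0.0

        if __name__ == "__main__":
            # Top-5 results for a controversial query: stance and graded relevance
            print(stance_bias([+1, +1, 0, -1, +1], [3, 2, 1, 2, 3]))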

    "Foreign beauties want to meet you": The sexualization of women in Google's organic and sponsored text search results

    Search engines serve as information gatekeepers on a multitude of topics dealing with different aspects of society. However, the ways search engines filter and rank information are prone to biases related to gender, ethnicity, and race. In this article, we conduct a systematic algorithm audit to examine how one specific form of bias, namely sexualization, is manifested in Google’s text search results about different national and gender groups. We find evidence of the sexualization of women, particularly those from the Global South and East, in both organic and sponsored search results. Our findings contribute to research on the sexualization of people in different forms of media, bias in web search, and algorithm auditing, and have important implications for the ongoing debates about the responsibility of transnational tech companies for preventing the systems they design from amplifying discrimination.

    Gender Stereotype Reinforcement: Measuring the Gender Bias Conveyed by Ranking Algorithms

    Search Engines (SE) have been shown to perpetuate well-known gender stereotypes identified in psychology literature and to influence users accordingly. Similar biases were found encoded in Word Embeddings (WEs) learned from large online corpora. In this context, we propose the Gender Stereotype Reinforcement (GSR) measure, which quantifies the tendency of a SE to support gender stereotypes, leveraging gender-related information encoded in WEs. Through the critical lens of construct validity, we validate the proposed measure on synthetic and real collections. Subsequently, we use GSR to compare widely-used Information Retrieval ranking algorithms, including lexical, semantic, and neural models. We check if and how ranking algorithms based on WEs inherit the biases of the underlying embeddings. We also consider the most common debiasing approaches for WEs proposed in the literature and test their impact in terms of GSR and common performance measures. To the best of our knowledge, GSR is the first measure specifically tailored for IR that is capable of quantifying representational harms. (Comment: To appear in Information Processing & Management)
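    The core idea of scoring a ranking with "gender-related information encoded in WEs" can be sketched by projecting document terms onto a he-she direction in embedding space. The toy vectors, discounting, and aggregation below are assumptions in the spirit of, but not identical to, the GSR measure.

        # Minimal sketch, assuming toy embeddings and a simple he-she direction;
        # NOT the GSR measure itself.
        import numpy as np

        EMB = {  # toy 3-d embeddings for demonstration only
            "he": np.array([0.9, 0.1, 0.0]), "she": np.array([-0.9, 0.1, 0.0]),
            "nurse": np.array([-0.4, 0.5, 0.2]), "engineer": np.array([0.5, 0.4, 0.2]),
        }

        def gender_direction():
            return EMB["he"] - EMB["she"]

        def doc_gender_score(tokens, direction):
            """Mean cosine projection of a document's in-vocabulary tokens onto the direction."""
            vecs = [EMB[t] for t in tokens if t in EMB]
            if not vecs:
                return 0.0
            return float(np.mean([v @ direction / (np.linalg.norm(v) * np.linalg.norm(direction))
                                  for v in vecs]))

        def ranking_gender_score(ranked_docs, cutoff=10):
            """Rank-discounted sum of document gender scores for the top results."""
            d = gender_direction()
            return float(sum(doc_gender_score(doc.lower().split(), d) / np.log2(r + 1)
                             for r, doc in enumerate(ranked_docs[:cutoff], start=1)))

        if __name__ == "__main__":
            print(ranking_gender_score(["the engineer explained", "the nurse explained"]))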

    Negligent Algorithmic Discrimination


    Degendering Resumes for Fair Algorithmic Resume Screening

    We investigate whether it is feasible to remove gendered information from resumes to mitigate potential bias in algorithmic resume screening. Using a corpus of 709k resumes from IT firms, we first train a series of models to classify the self-reported gender of the applicant, thereby measuring the extent and nature of gendered information encoded in resumes. We then conduct a series of gender obfuscation experiments, where we iteratively remove gendered information from resumes. Finally, we train a resume screening algorithm and investigate the trade-off between gender obfuscation and screening algorithm performance. Results show: (1) There is a significant amount of gendered information in resumes. (2) A lexicon-based gender obfuscation method (i.e., removing tokens that are predictive of gender) can reduce the amount of gendered information to a large extent; however, beyond a certain point, the performance of the resume screening algorithm starts to suffer. (3) General-purpose gender debiasing methods for NLP models, such as removing the gender subspace from embeddings, are not effective in obfuscating gender.
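    The lexicon-based obfuscation idea, i.e. dropping the tokens most predictive of gender according to a classifier, can be sketched as follows. The classifier choice, toy data, and cutoff are assumptions for illustration, not the authors' pipeline.

        # Minimal sketch, assuming a bag-of-words logistic regression as the gender
        # classifier and a tiny toy corpus; NOT the authors' method or data.
        import numpy as np
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression

        resumes = ["softball team captain women in tech mentor python developer",
                   "fraternity president football coach java developer"]
        gender = [1, 0]  # toy self-reported labels

        vec = CountVectorizer()
        X = vec.fit_transform(resumes)
        clf = LogisticRegression().fit(X, gender)

        # Build an obfuscation lexicon from the most gender-predictive tokens
        coefs = clf.coef_[0]
        vocab = np.array(vec.get_feature_names_out())
        top_k = 4
        lexicon = set(vocab[np.argsort(np.abs(coefs))[::-1][:top_k]])

        def obfuscate(text, lexicon):
            """Remove tokens that appear in the gender-predictive lexicon."""
            return " ".join(t for t in text.split() if t not in lexicon)

        print(obfuscate(resumes[0], lexicon))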

    AI and Inequality in Hiring and Recruiting: A Field Scan

    This paper provides a field scan of scholarly work on AI and hiring. It addresses the issue that there is still no comprehensive understanding of how technical, social science, and managerial scholarship around AI bias, recruiting, and inequality in the labor market intersects, particularly vis-à-vis the STEM field. It reports on a semi-systematic literature review and identifies three overlapping meta-themes: productivity, gender, and AI bias. It critically discusses these themes and makes recommendations for future work.