Towards Debiasing Fact Verification Models
Fact verification requires validating a claim in the context of evidence. We
show, however, that in the popular FEVER dataset this might not necessarily be
the case. Claim-only classifiers perform competitively with top evidence-aware
models. In this paper, we investigate the cause of this phenomenon, identifying
strong cues for predicting labels solely based on the claim, without
considering any evidence. We create an evaluation set that avoids those
idiosyncrasies. The performance of FEVER-trained models significantly drops
when evaluated on this test set. Therefore, we introduce a regularization
method which alleviates the effect of bias in the training data, obtaining
improvements on the newly created test set. This work is a step towards a more
sound evaluation of reasoning capabilities in fact verification models.
Comment: EMNLP-IJCNLP 2019
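The claim-only cues the paper describes can be surfaced with simple statistics: a token whose conditional label distribution diverges sharply from the overall label distribution is a giveaway the model can exploit without reading evidence. A minimal sketch (the function name, toy data, and scoring heuristic are illustrative, not the paper's method):

```python
from collections import Counter, defaultdict

def claim_only_cues(claims, labels, min_count=2):
    """Rank tokens by how strongly they predict a label from the claim alone.

    Score = P(label | token appears in claim) - P(label), so a high score
    marks a 'giveaway' cue of the kind the paper warns about (e.g. negation
    words correlating with REFUTES in FEVER).
    """
    token_label = defaultdict(Counter)
    label_totals = Counter(labels)
    for claim, label in zip(claims, labels):
        for tok in set(claim.lower().split()):
            token_label[tok][label] += 1
    n = len(claims)
    cues = []
    for tok, counts in token_label.items():
        total = sum(counts.values())
        if total < min_count:
            continue  # ignore rare tokens with unreliable statistics
        label, c = counts.most_common(1)[0]
        lift = c / total - label_totals[label] / n
        cues.append((tok, label, lift))
    return sorted(cues, key=lambda t: -t[2])

# Toy FEVER-style data in which "never" leaks the REFUTES label.
claims = ["X never won an award", "Y never acted",
          "X won an award", "Y acted in a film"]
labels = ["REFUTES", "REFUTES", "SUPPORTS", "SUPPORTS"]
print(claim_only_cues(claims, labels)[0][:2])  # → ('never', 'REFUTES')
```

On real FEVER data one would run this over the full training set; the paper's regularization method then down-weights examples that such cues make trivially predictable.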
Proof-checking Bias in Labeling Methods
We introduce a typed natural deduction system designed to formally verify the presence of bias in automatic labeling methods. The system relies on a "data-as-terms" and "labels-as-types" interpretation of formulae, with derivability contexts encoding probability distributions on training data. Bias is understood as the divergence that expected probabilistic labeling by a classifier trained on opaque data displays from the fairness constraints set by a transparent dataset.
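The paper's system is proof-theoretic, but the underlying notion of bias-as-divergence can be sketched numerically: compare the classifier's expected label distribution against the distribution fixed by the transparent dataset. This is only an illustrative sketch; the divergence measure (total variation), the function names, and the tolerance threshold are assumptions, not the paper's formalism:

```python
def tv_divergence(p, q):
    """Total variation distance between two label distributions
    given as dicts mapping label -> probability."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)

def bias_check(classifier_dist, fairness_dist, tolerance=0.05):
    """Flag bias when the classifier's expected labeling diverges from
    the fairness constraint set by the transparent dataset."""
    return tv_divergence(classifier_dist, fairness_dist) > tolerance

# A classifier trained on opaque data over-assigns the "toxic" label
# relative to the transparent reference distribution.
opaque = {"toxic": 0.30, "ok": 0.70}
fair = {"toxic": 0.10, "ok": 0.90}
print(bias_check(opaque, fair))  # → True
```

In the paper itself this check is carried out as a derivation in the typed system rather than as a numeric comparison.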
Machines do not decide hate speech: Machine learning, power, and the intersectional approach
The advent of social media has increased digital content - and, with it, hate speech. Advancements in machine learning help detect online hate speech at scale, but scale is only one part of the problem related to moderating it. Machines do not decide what comprises hate speech, which is part of a societal norm. Power relations establish such norms and, thus, determine who can say what comprises hate speech. Without considering this data-generation process, a fair automated hate speech detection system cannot be built. This chapter first examines the relationship between power, hate speech, and machine learning. Then, it examines how the intersectional lens - focusing on power dynamics between and within social groups - helps identify bias in the data sets used to build automated hate speech detection systems.
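One concrete way to apply the intersectional lens the chapter describes is to audit label rates per intersectional subgroup rather than per single attribute, since bias visible at the intersection (e.g. gender x race) can vanish in marginal statistics. A minimal sketch with hypothetical field names and toy records:

```python
from collections import defaultdict

def subgroup_rates(records, label_key="hate", group_keys=("gender", "race")):
    """Positive-label rate per intersectional subgroup.

    Returns a dict mapping each (gender, race) tuple to the fraction of
    records in that subgroup annotated as hate speech; large gaps between
    subgroups point at annotation bias in the dataset.
    """
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
    for r in records:
        key = tuple(r[k] for k in group_keys)
        counts[key][1] += 1
        counts[key][0] += int(r[label_key])
    return {k: pos / tot for k, (pos, tot) in counts.items()}

# Toy annotated dataset: content about one subgroup is labeled hate
# speech far more often than comparable content about another.
records = [
    {"gender": "f", "race": "b", "hate": 1},
    {"gender": "f", "race": "b", "hate": 1},
    {"gender": "m", "race": "w", "hate": 0},
    {"gender": "m", "race": "w", "hate": 1},
]
print(subgroup_rates(records))  # → {('f', 'b'): 1.0, ('m', 'w'): 0.5}
```

The chapter's point is precisely that such numbers reflect the power relations behind the annotation process, not an objective definition of hate speech.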