Towards Debiasing Fact Verification Models
Fact verification requires validating a claim in the context of evidence. We
show, however, that in the popular FEVER dataset this might not necessarily be
the case. Claim-only classifiers perform competitively with top evidence-aware
models. In this paper, we investigate the cause of this phenomenon, identifying
strong cues for predicting labels solely based on the claim, without
considering any evidence. We create an evaluation set that avoids those
idiosyncrasies. The performance of FEVER-trained models significantly drops
when evaluated on this test set. Therefore, we introduce a regularization
method which alleviates the effect of bias in the training data, obtaining
improvements on the newly created test set. This work is a step towards a more
sound evaluation of reasoning capabilities in fact verification models.
Comment: EMNLP-IJCNLP 2019
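The claim-only cues the paper describes can be surfaced with simple statistics: a token whose conditional label distribution diverges sharply from the overall label distribution is a giveaway the model can exploit without reading evidence. A minimal sketch (the function name, toy data, and scoring heuristic are illustrative, not the paper's method):

```python
from collections import Counter, defaultdict

def claim_only_cues(claims, labels, min_count=2):
    """Rank tokens by how strongly they predict a label from the claim alone.

    Score = P(label | token appears in claim) - P(label), so a high score
    marks a 'giveaway' cue of the kind the paper warns about (e.g. negation
    words correlating with REFUTES in FEVER).
    """
    token_label = defaultdict(Counter)
    label_totals = Counter(labels)
    for claim, label in zip(claims, labels):
        for tok in set(claim.lower().split()):
            token_label[tok][label] += 1
    n = len(claims)
    cues = []
    for tok, counts in token_label.items():
        total = sum(counts.values())
        if total < min_count:
            continue  # ignore rare tokens with unreliable statistics
        label, c = counts.most_common(1)[0]
        lift = c / total - label_totals[label] / n
        cues.append((tok, label, lift))
    return sorted(cues, key=lambda t: -t[2])

# Toy FEVER-style data in which "never" leaks the REFUTES label.
claims = ["X never won an award", "Y never acted",
          "X won an award", "Y acted in a film"]
labels = ["REFUTES", "REFUTES", "SUPPORTS", "SUPPORTS"]
print(claim_only_cues(claims, labels)[0][:2])  # → ('never', 'REFUTES')
```

On real FEVER data one would run this over the full training set; the paper's regularization method then down-weights examples that such cues make trivially predictable.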
Proof-checking Bias in Labeling Methods
We introduce a typed natural deduction system designed to formally verify the presence of bias in automatic labeling methods. The system relies on a "data-as-terms" and "labels-as-types" interpretation of formulae, with derivability contexts encoding probability distributions on training data. Bias is understood as the divergence that expected probabilistic labeling by a classifier trained on opaque data displays from the fairness constraints set by a transparent dataset.
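The paper's system is proof-theoretic, but the underlying notion of bias-as-divergence can be sketched numerically: compare the classifier's expected label distribution against the distribution fixed by the transparent dataset. This is only an illustrative sketch; the divergence measure (total variation), the function names, and the tolerance threshold are assumptions, not the paper's formalism:

```python
def tv_divergence(p, q):
    """Total variation distance between two label distributions
    given as dicts mapping label -> probability."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(l, 0.0) - q.get(l, 0.0)) for l in labels)

def bias_check(classifier_dist, fairness_dist, tolerance=0.05):
    """Flag bias when the classifier's expected labeling diverges from
    the fairness constraint set by the transparent dataset."""
    return tv_divergence(classifier_dist, fairness_dist) > tolerance

# A classifier trained on opaque data over-assigns the "toxic" label
# relative to the transparent reference distribution.
opaque = {"toxic": 0.30, "ok": 0.70}
fair = {"toxic": 0.10, "ok": 0.90}
print(bias_check(opaque, fair))  # → True
```

In the paper itself this check is carried out as a derivation in the typed system rather than as a numeric comparison.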
Machines do not decide hate speech: Machine learning, power, and the intersectional approach
The advent of social media has increased digital content - and, with it, hate speech. Advancements in machine learning help detect online hate speech at scale, but scale is only one part of the problem related to moderating it. Machines do not decide what comprises hate speech, which is part of a societal norm. Power relations establish such norms and, thus, determine who can say what comprises hate speech. Without considering this data-generation process, a fair automated hate speech detection system cannot be built. This chapter first examines the relationship between power, hate speech, and machine learning. Then, it examines how the intersectional lens - focusing on power dynamics between and within social groups - helps identify bias in the data sets used to build automated hate speech detection systems.
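One concrete way to apply the intersectional lens the chapter describes is to audit label rates per intersectional subgroup rather than per single attribute, since bias visible at the intersection (e.g. gender x race) can vanish in marginal statistics. A minimal sketch with hypothetical field names and toy records:

```python
from collections import defaultdict

def subgroup_rates(records, label_key="hate", group_keys=("gender", "race")):
    """Positive-label rate per intersectional subgroup.

    Returns a dict mapping each (gender, race) tuple to the fraction of
    records in that subgroup annotated as hate speech; large gaps between
    subgroups point at annotation bias in the dataset.
    """
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [positives, total]
    for r in records:
        key = tuple(r[k] for k in group_keys)
        counts[key][1] += 1
        counts[key][0] += int(r[label_key])
    return {k: pos / tot for k, (pos, tot) in counts.items()}

# Toy annotated dataset: content about one subgroup is labeled hate
# speech far more often than comparable content about another.
records = [
    {"gender": "f", "race": "b", "hate": 1},
    {"gender": "f", "race": "b", "hate": 1},
    {"gender": "m", "race": "w", "hate": 0},
    {"gender": "m", "race": "w", "hate": 1},
]
print(subgroup_rates(records))  # → {('f', 'b'): 1.0, ('m', 'w'): 0.5}
```

The chapter's point is precisely that such numbers reflect the power relations behind the annotation process, not an objective definition of hate speech.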