
    Towards Debiasing Fact Verification Models

    Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this is not necessarily the case: claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely from the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models drops significantly when evaluated on this test set. We therefore introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models. Comment: EMNLP-IJCNLP 2019
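
    The abstract mentions claim-only classifiers and a regularization method that reduces the effect of training-data bias. The sketch below illustrates one common way such debiasing is done, via a claim-only probe and inverse-confidence example re-weighting; it is not necessarily the paper's exact method, and the toy data, TF-IDF features, and weighting scheme are all assumptions.

    ```python
    # Minimal sketch (assumed, not the paper's exact method): train a
    # claim-only probe, then down-weight training examples that the probe
    # already classifies correctly with high confidence, so the
    # evidence-aware model is pushed to rely on the evidence.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy FEVER-style examples: (claim, evidence, label) -- illustrative only.
    data = [
        ("Paris is the capital of France.",
         "Paris is the capital and largest city of France.", "SUPPORTS"),
        ("Paris is not a city.",
         "Paris is the capital and largest city of France.", "REFUTES"),
        ("The Moon is made of cheese.",
         "The Moon is an astronomical body orbiting Earth.", "REFUTES"),
        ("The Moon orbits Earth.",
         "The Moon is an astronomical body orbiting Earth.", "SUPPORTS"),
    ]
    claims = [c for c, _, _ in data]
    labels = [y for _, _, y in data]

    # 1) Claim-only probe: how much can be predicted without evidence?
    vec_claim = TfidfVectorizer()
    X_claim = vec_claim.fit_transform(claims)
    claim_only = LogisticRegression(max_iter=1000).fit(X_claim, labels)

    # 2) Weight each example by how *un*confident the claim-only probe is
    #    about its gold label (confident examples carry claim-only cues).
    probs = claim_only.predict_proba(X_claim)
    gold_idx = [list(claim_only.classes_).index(y) for y in labels]
    p_correct = probs[np.arange(len(labels)), gold_idx]
    sample_weights = 1.0 - p_correct

    # 3) Evidence-aware model trained with those weights
    #    (here simply claim + evidence text, for brevity).
    vec_full = TfidfVectorizer()
    X_full = vec_full.fit_transform([c + " " + e for c, e, _ in data])
    full_model = LogisticRegression(max_iter=1000).fit(
        X_full, labels, sample_weight=sample_weights)
    print(sample_weights)
    ```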

    Proof-checking Bias in Labeling Methods

    We introduce a typed natural deduction system designed to formally verify the presence of bias in automatic labeling methods. The system relies on a "data-as-terms" and "labels-as-types" interpretation of formulae, with derivability contexts encoding probability distributions on training data. Bias is understood as the divergence that expected probabilistic labeling by a classifier trained on opaque data displays from the fairness constraints set by a transparent dataset.
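
    The system itself is a formal proof calculus, but the quantity it reasons about can be illustrated numerically. The sketch below, as a rough analogue only, computes the divergence between a classifier's expected label distribution on opaque data and the distribution fixed by a transparent reference dataset; the use of KL divergence, the example distributions, and the tolerance are all assumptions, not part of the paper.

    ```python
    # Rough numerical analogue (assumed) of the paper's notion of bias:
    # divergence between the label distribution an opaque-data classifier
    # is expected to produce and the distribution required by a
    # transparent reference dataset.
    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        """KL(p || q) over discrete label distributions."""
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    # Expected labeling by the classifier trained on opaque data.
    opaque_labeling = [0.80, 0.20]          # P(label=0), P(label=1)
    # Labeling fixed by the transparent (fairness-constraint) dataset.
    transparent_reference = [0.55, 0.45]

    divergence = kl_divergence(opaque_labeling, transparent_reference)
    TOLERANCE = 0.05  # illustrative fairness tolerance
    print(f"KL = {divergence:.3f}; biased = {divergence > TOLERANCE}")
    ```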

    Machines do not decide hate speech: Machine learning, power, and the intersectional approach

    The advent of social media has increased digital content and, with it, hate speech. Advances in machine learning help detect online hate speech at scale, but scale is only one part of the moderation problem. Machines do not decide what constitutes hate speech; that is a matter of societal norms. Power relations establish such norms and thus determine who can say what counts as hate speech. Without considering this data-generation process, a fair automated hate speech detection system cannot be built. This chapter first examines the relationship between power, hate speech, and machine learning. It then examines how the intersectional lens, which focuses on power dynamics between and within social groups, helps identify bias in the datasets used to build automated hate speech detection systems.
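
    As a toy illustration of the kind of intersectional data audit the chapter motivates, the sketch below compares single-axis flag rates with flag rates broken down by intersections of group attributes; the column names and values are made up for illustration and are not taken from the chapter.

    ```python
    # Toy sketch (assumed data): disparities visible only at intersections
    # of attributes can be hidden in single-axis breakdowns.
    import pandas as pd

    posts = pd.DataFrame({
        "gender":          ["woman", "woman", "man", "man", "woman", "man"],
        "ethnicity":       ["black", "white", "black", "white", "black", "black"],
        "flagged_as_hate": [1, 0, 1, 0, 1, 0],
    })

    # Single-axis view of flag rates.
    print(posts.groupby("gender")["flagged_as_hate"].mean())
    print(posts.groupby("ethnicity")["flagged_as_hate"].mean())

    # Intersectional view: flag rate per (gender, ethnicity) subgroup.
    print(posts.groupby(["gender", "ethnicity"])["flagged_as_hate"].mean())
    ```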