646 research outputs found

    Methodological issues in international segmentation

    Get PDF

    Automatic Fact-guided Sentence Modification

    Full text link
    Online encyclopediae like Wikipedia contain large amounts of text that need frequent corrections and updates. The new information may contradict existing content in encyclopediae. In this paper, we focus on rewriting such dynamically changing articles. This is a challenging constrained generation task, as the output must be consistent with the new information and fit into the rest of the existing document. To this end, we propose a two-step solution: (1) We identify and remove the contradicting components in a target text for a given claim, using a neutralizing stance model; (2) We expand the remaining text to be consistent with the given claim, using a novel two-encoder sequence-to-sequence model with copy attention. Applied to a Wikipedia fact update dataset, our method successfully generates updated sentences for new claims, achieving the highest SARI score. Furthermore, we demonstrate that generating synthetic data through such rewritten sentences can successfully augment the FEVER fact-checking training dataset, leading to a relative error reduction of 13%.Comment: AAAI 202

    Outreach: Change the World, Change Yourself

    Get PDF

    The Limitations of Stylometry for Detecting Machine-Generated Fake News

    Full text link
    Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake news by capturing their stylistic differences from human-written text. These approaches, broadly termed stylometry, have found success in source attribution and misinformation detection in human-written texts. However, in this work, we show that stylometry is limited against machine-generated misinformation. While humans speak differently when trying to deceive, LMs generate stylistically consistent text, regardless of underlying motive. Thus, though stylometry can successfully prevent impersonation by identifying text provenance, it fails to distinguish legitimate LM applications from those that introduce false information. We create two benchmarks demonstrating the stylistic similarity between malicious and legitimate uses of LMs, employed in auto-completion and editing-assistance settings. Our findings highlight the need for non-stylometry approaches in detecting machine-generated misinformation, and open up the discussion on the desired evaluation benchmarks.Comment: Accepted for Computational Linguistics journal (squib). Previously posted with title "Are We Safe Yet? The Limitations of Distributional Features for Fake News Detection

    The Use of Software Design Patterns to Teach Secure Software Design: An Integrated Approach

    Get PDF
    Part 2: Software Security EducationInternational audienceDuring software development, security is often dealt with as an add-on. This means that security considerations are not necessarily seen as an integral part of the overall solution and might even be left out of a design. For many security problems, the approach towards secure development has recurring elements. Software design patterns are often used to address a commonly occurring problem through a “generic” approach towards this problem. The design pattern provides a conceptual model of a best-practices solution, which in turn is used by developers to create a concrete implementation for their specific problem. Most software design patterns do not include security best-practices as part of the generic solution towards the commonly occurring problem. This paper proposes an extension to the widely used MVC pattern that includes current security principles in order to teach secure software design in an integrated fashion

    Towards Debiasing Fact Verification Models

    Full text link
    Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case. Claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models significantly drops when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models.Comment: EMNLP IJCNLP 201
    • …
    corecore