The Limitations of Stylometry for Detecting Machine-Generated Fake News
Recent developments in neural language models (LMs) have raised concerns
about their potential misuse for automatically spreading misinformation. In
light of these concerns, several studies have proposed to detect
machine-generated fake news by capturing its stylistic differences from
human-written text. These approaches, broadly termed stylometry, have found
success in source attribution and misinformation detection in human-written
texts. However, in this work, we show that stylometry is limited against
machine-generated misinformation. While humans speak differently when trying to
deceive, LMs generate stylistically consistent text, regardless of underlying
motive. Thus, though stylometry can successfully prevent impersonation by
identifying text provenance, it fails to distinguish legitimate LM applications
from those that introduce false information. We create two benchmarks
demonstrating the stylistic similarity between malicious and legitimate uses of
LMs, employed in auto-completion and editing-assistance settings. Our findings
highlight the need for non-stylometric approaches to detecting
machine-generated misinformation, and open the discussion on desired
evaluation benchmarks.
Comment: Accepted for Computational Linguistics journal (squib). Previously
posted with the title "Are We Safe Yet? The Limitations of Distributional
Features for Fake News Detection".
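
To make the stylometric approach concrete, here is a minimal sketch of the kind of provenance classifier the paper has in mind: character n-gram features feeding a linear model. The toy texts, labels, and feature settings below are illustrative assumptions, not the paper's benchmark setup.

# Minimal stylometry sketch (illustrative assumptions, not the paper's setup):
# character n-gram TF-IDF features plus logistic regression, a common baseline
# for attributing text provenance by surface style rather than content.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: 1 = machine-generated, 0 = human-written.
texts = [
    "The committee announced the findings at a press briefing on Tuesday.",
    "Officials said the report would be released later this week.",
    "In a statement, the agency confirmed the results of the review.",
    "honestly i think the whole thing is kinda overblown tbh",
    "saw it myself, no way that's what actually happened",
    "my cousin works there and says it's nothing like the news claims",
]
labels = [1, 1, 1, 0, 0, 0]

# char_wb n-grams capture punctuation, casing, and word endings: style
# signals that identify the author (or model) independently of the content.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["the spokesperson declined to comment on the matter."]))

The paper's point is that exactly this kind of style signal, while effective for attribution, cannot separate a truthful LM completion from a deceptive one produced by the same model, since both share the model's style.
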
Identifying and Mitigating the Security Risks of Generative AI
Every major technical invention resurfaces the dual-use dilemma -- the new
technology has the potential to be used for good as well as for harm.
Generative AI (GenAI) techniques, such as large language models (LLMs) and
diffusion models, have shown remarkable capabilities (e.g., in-context
learning, code-completion, and text-to-image generation and editing). However,
GenAI can be used just as well by attackers to generate new attacks and
increase the velocity and efficacy of existing attacks.
This paper reports the findings of a workshop held at Google (co-organized by
Stanford University and the University of Wisconsin-Madison) on the dual-use
dilemma posed by GenAI. This paper is not meant to be comprehensive, but is
rather an attempt to synthesize some of the interesting findings from the
workshop. We discuss short-term and long-term goals for the community on this
topic. We hope this paper provides both a launching point for discussion on
this important topic and interesting problems that the research community can
work to address.
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
Machine-generated text is increasingly difficult to distinguish from
human-authored text. Powerful open-source models are freely available, and
user-friendly tools that democratize access to generative models are
proliferating. ChatGPT, which was released shortly after the first preprint of
this survey, epitomizes these trends. The great potential of state-of-the-art
natural language generation (NLG) systems is tempered by the multitude of
avenues for abuse. Detection of machine-generated text is a key countermeasure
for reducing abuse of NLG models, with significant technical challenges and
numerous open problems. We provide a survey that includes both 1) an extensive
analysis of threat models posed by contemporary NLG systems, and 2) the most
complete review of machine-generated text detection methods to date. This
survey places machine-generated text within its cybersecurity and social
context, and provides strong guidance for future work addressing the most
critical threat models, and ensuring detection systems themselves demonstrate
trustworthiness through fairness, robustness, and accountability.
Comment: Manuscript submitted to ACM Special Session on Trustworthy AI.
2022/11/19 - Updated reference
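
Among the detection methods such surveys cover, one simple zero-shot family scores text by how predictable it is under a reference language model, on the premise that model-generated text tends to have lower perplexity. Below is a hedged sketch using GPT-2 via Hugging Face transformers; the model choice and threshold are assumptions for illustration, not a method the survey prescribes.

# Illustrative zero-shot detection sketch: perplexity under a reference LM.
# GPT-2 and the cutoff value are assumptions, not the survey's recommendation.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the reference model (lower = more predictable)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels = input_ids makes the model return the mean
        # next-token cross-entropy loss over the sequence.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return torch.exp(loss).item()

THRESHOLD = 30.0  # hypothetical cutoff; real systems calibrate on held-out data

sample = "The results of the study were published in a peer-reviewed journal."
ppl = perplexity(sample)
print(f"perplexity={ppl:.1f} -> {'machine-like' if ppl < THRESHOLD else 'human-like'}")

A single-model perplexity score is a weak signal on its own; it mainly illustrates the statistical footing on which many of the surveyed detectors build, and why detection hardens as generators improve.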