The advent of Large Language Models (LLMs) has had a transformative impact.
However, the possibility that LLMs such as ChatGPT can be exploited to generate
misinformation poses a serious concern for online safety and public trust. A
fundamental research question is: will LLM-generated misinformation cause more
harm than human-written misinformation? We propose to tackle this question from
the perspective of detection difficulty. We first build a taxonomy of
LLM-generated misinformation. We then categorize and validate the potential
real-world methods for generating misinformation with LLMs. Through extensive
empirical investigation, we discover that LLM-generated misinformation can be
harder for both humans and automated detectors to detect than human-written
misinformation with the same semantics, which suggests that it can have more
deceptive styles and potentially cause more harm. We also discuss the
implications of our findings for combating misinformation in the age of LLMs
and the corresponding countermeasures.

The code, dataset, and more resources on LLMs and misinformation will be
released on the project website: https://llm-misinformation.github.io