With the rise of social media, users are exposed to many misleading claims.
However, the pervasive noise inherent in these posts presents a challenge in
identifying precise and prominent claims that require verification. Extracting
the important claims from such posts is arduous and time-consuming, yet it is
an underexplored problem. Here, we aim to bridge this gap. We introduce a novel
task, Claim Normalization (aka ClaimNorm), which aims to decompose complex and
noisy social media posts into more straightforward and understandable forms,
termed normalized claims. We propose CACN, a pioneering approach that leverages
chain-of-thought and claim check-worthiness estimation, mimicking human
reasoning processes, to comprehend intricate claims. Moreover, we capitalize on
the in-context learning capabilities of large language models to provide
guidance and to improve claim normalization. To evaluate the effectiveness of
our proposed model, we meticulously compile a comprehensive real-world dataset,
CLAN, comprising more than 6k instances of social media posts alongside their
respective normalized claims. Our experiments demonstrate that CACN outperforms
several baselines across various evaluation measures. Finally, our rigorous
error analysis validates CACN's capabilities and pitfalls.Comment: Accepted at Findings EMNLP202