PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training
Fact verification has attracted a lot of research attention recently, e.g.,
in journalism, marketing, and policymaking, as misinformation and
disinformation online can sway one's opinion and affect one's actions. While
fact-checking is a hard task in general, in many cases, false statements can be
easily debunked based on analytics over tables with reliable information.
Hence, table-based fact verification has recently emerged as an important and
growing research area. Yet, progress has been limited due to the lack of
datasets that can be used to pre-train language models (LMs) to be aware of
common table operations, such as aggregating a column or comparing tuples. To
bridge this gap, in this paper we introduce PASTA, a novel state-of-the-art
framework for table-based fact verification via pre-training with synthesized
sentence-table cloze questions. In particular, we design six types of common
sentence-table cloze tasks, including Filter, Aggregation, Superlative,
Comparative, Ordinal, and Unique, based on which we synthesize a large corpus
consisting of 1.2 million sentence-table pairs from WikiTables. PASTA uses a
recent pre-trained LM, DeBERTaV3, and further pre-trains it on our corpus. Our
experimental results show that PASTA achieves new state-of-the-art performance
on two table-based fact verification benchmarks: TabFact and SEM-TAB-FACTS. In
particular, on the complex set of TabFact, which contains multiple operations,
PASTA outperforms the previous state of the art by 4.7 points (85.6%
vs. 80.9%), and the gap between PASTA and human performance on the small
TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).
Comment: EMNLP 202
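
To make the idea of a sentence-table cloze question concrete, the following is a minimal sketch of how two of the six task types named above (Aggregation and Superlative) might be synthesized from a small table. The sentence templates, the function names, the `[MASK]` placeholder string, and the sample table are all assumptions made for illustration; they are not taken from the PASTA corpus or code.

```python
from typing import Any, Dict, List, Tuple

# A table is modeled as a list of rows, each row mapping column name -> value.
Table = List[Dict[str, Any]]

MASK = "[MASK]"  # hypothetical placeholder for the span the model must fill in


def aggregation_cloze(table: Table, column: str) -> Tuple[str, str]:
    """Aggregation task: mask the sum of a numeric column."""
    total = sum(row[column] for row in table)
    sentence = f"The total {column} across all rows is {MASK}."
    return sentence, str(total)


def superlative_cloze(table: Table, key: str, column: str) -> Tuple[str, str]:
    """Superlative task: mask the key of the row with the largest value."""
    best = max(table, key=lambda row: row[column])
    sentence = f"The {key} with the highest {column} is {MASK}."
    return sentence, str(best[key])


# Illustrative table (invented values, not from WikiTables).
sample_table: Table = [
    {"team": "Lions", "wins": 12},
    {"team": "Tigers", "wins": 9},
    {"team": "Bears", "wins": 15},
]

agg_sentence, agg_answer = aggregation_cloze(sample_table, "wins")
sup_sentence, sup_answer = superlative_cloze(sample_table, "team", "wins")

print(agg_sentence, "->", agg_answer)
print(sup_sentence, "->", sup_answer)
```

Each function returns a masked sentence together with the answer that fills the mask, so the pair (sentence, table) can serve as a pre-training input and the answer as the target. Filling the mask correctly requires performing the table operation, which is the property the pre-training objective relies on.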