Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
Text generation from a knowledge base aims to translate knowledge triples to
natural language descriptions. Most existing methods ignore the faithfulness
of a generated text description to the original table, leading to generated
information that goes beyond the content of the table. In this paper, for the
first time, we propose a novel Transformer-based generation framework to
achieve faithful generation. The core techniques in our method to enforce faithfulness
include a new table-text optimal-transport matching loss and a table-text
embedding similarity loss based on the Transformer model. Furthermore, to
evaluate faithfulness, we propose a new automatic metric specialized to the
table-to-text generation problem. We also provide a detailed analysis of each
component of our model in our experiments. Automatic and human evaluations show
that our framework outperforms the state of the art by a large margin.
Comment: Accepted at ACL 2020
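
As a rough, hypothetical illustration of what an optimal-transport table-text
matching loss can look like, the sketch below computes an entropy-regularised
transport cost between table-cell and text-token embeddings, using a cosine
cost, uniform marginals, and Sinkhorn iterations. All of these choices, and the
function name sinkhorn_ot_loss, are assumptions made for illustration, not the
paper's exact formulation.

    # Hypothetical sketch of an optimal-transport style table-text matching loss.
    # The cosine cost, uniform marginals, and Sinkhorn settings are illustrative
    # assumptions, not the authors' exact loss.
    import numpy as np

    def sinkhorn_ot_loss(table_emb, text_emb, eps=0.1, n_iters=50):
        """Approximate OT cost between table-cell and text-token embeddings.

        table_emb: (m, d) array of table-cell embeddings
        text_emb:  (n, d) array of text-token embeddings
        """
        # Cosine cost: 0 when embeddings align, up to 2 when opposed.
        t = table_emb / np.linalg.norm(table_emb, axis=1, keepdims=True)
        x = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
        cost = 1.0 - t @ x.T                      # (m, n)

        # Uniform marginals over table cells and text tokens.
        a = np.full(cost.shape[0], 1.0 / cost.shape[0])
        b = np.full(cost.shape[1], 1.0 / cost.shape[1])

        # Sinkhorn iterations for the entropy-regularised transport plan.
        K = np.exp(-cost / eps)
        u = np.ones_like(a)
        for _ in range(n_iters):
            v = b / (K.T @ u)
            u = a / (K @ v)
        plan = np.diag(u) @ K @ np.diag(v)        # (m, n) transport plan

        # The matching loss is the expected cost under the transport plan.
        return float(np.sum(plan * cost))

    # Example: random embeddings for 5 table cells and 8 text tokens.
    rng = np.random.default_rng(0)
    loss = sinkhorn_ot_loss(rng.normal(size=(5, 16)), rng.normal(size=(8, 16)))
    print(f"OT matching loss: {loss:.4f}")

A low transport cost indicates that every table cell can be "explained" by some
nearby text token and vice versa, which is the intuition behind penalising
descriptions that drift away from the table content.
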
Revisiting Challenges in Data-to-Text Generation with Fact Grounding
Data-to-text generation models face challenges in ensuring data fidelity by
referring to the correct input source. To inspire studies in this area, Wiseman
et al. (2017) introduced the RotoWire corpus on generating NBA game summaries
from the box- and line-score tables. However, few attempts have been made in
this direction, and the challenges remain. We observe a prominent bottleneck
in the corpus: only about 60% of the summary content can be grounded in the
box-score records. Such an information deficiency tends to misguide a
conditioned language model into producing unconditioned random facts and thus leads
to factual hallucinations. In this work, we restore the information balance and
revamp this task to focus on fact-grounded data-to-text generation. We
introduce a purified and larger-scale dataset, RotoWire-FG (Fact-Grounding),
with 50% more data from the years 2017-19 and enriched input tables, hoping to
attract more research attention to this direction. Moreover, we achieve improved
data fidelity over the state-of-the-art models by integrating a new form of
table reconstruction as an auxiliary task to boost the generation quality.
Comment: Best Paper Runner-up at INLG 2019 (12th International Conference on
Natural Language Generation)
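
As a rough, hypothetical illustration of table reconstruction used as an
auxiliary task, the sketch below adds a classification head that predicts, for
each decoder step, which input record the generated token should be grounded
in, and mixes that loss with the usual language-modelling loss. The class name
ReconstructionHead, the labelling scheme, and the weighting are assumptions,
not the authors' actual design.

    # Hypothetical sketch of table reconstruction as an auxiliary training task.
    # Names and details are illustrative; this is not the authors' code.
    import torch
    import torch.nn as nn

    class ReconstructionHead(nn.Module):
        """Predicts table record indices back from decoder hidden states,
        encouraging the summary to stay grounded in the input table."""
        def __init__(self, hidden_dim, n_records):
            super().__init__()
            self.proj = nn.Linear(hidden_dim, n_records)

        def forward(self, decoder_states, record_labels):
            # decoder_states: (batch, seq_len, hidden_dim)
            # record_labels:  (batch, seq_len) index of the grounding record
            #                 for each token, or -100 for ungrounded tokens
            logits = self.proj(decoder_states)        # (batch, seq_len, n_records)
            return nn.functional.cross_entropy(
                logits.reshape(-1, logits.size(-1)),
                record_labels.reshape(-1),
                ignore_index=-100,
            )

    def joint_loss(lm_loss, recon_loss, recon_weight=0.5):
        # Joint objective: language-model loss plus the auxiliary term.
        return lm_loss + recon_weight * recon_loss

Training against both terms pushes the decoder states to retain enough
information about the source records to reconstruct them, which is one way an
auxiliary reconstruction objective can improve data fidelity.
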