Talking about other people: An endless range of possibilities
Image description datasets, such as Flickr30K and MS COCO, show a high degree of variation in the ways that crowd-workers talk about the world. Although this gives us a rich and diverse collection of data to work with, it also introduces uncertainty about how the world should be described. This paper shows the extent of this uncertainty in the PEOPLE-domain. We present a taxonomy of different ways to talk about other people. This taxonomy serves as a reference point to think about how other people should be described, and can be used to classify and compute statistics about labels applied to people.
Revisiting Challenges in Data-to-Text Generation with Fact Grounding
Data-to-text generation models face challenges in ensuring data fidelity by
referring to the correct input source. To inspire studies in this area, Wiseman
et al. (2017) introduced the RotoWire corpus on generating NBA game summaries
from the box- and line-score tables. However, limited attempts have been made
in this direction and the challenges remain. We observe a prominent bottleneck
in the corpus where only about 60% of the summary contents can be grounded to
the boxscore records. Such information deficiency tends to misguide a
conditioned language model to produce unconditioned random facts and thus leads
to factual hallucinations. In this work, we restore the information balance and
revamp this task to focus on fact-grounded data-to-text generation. We
introduce a purified and larger-scale dataset, RotoWire-FG (Fact-Grounding),
with 50% more data from the years 2017-19 and enriched input tables, hoping to
attract more research focus in this direction. Moreover, we achieve improved
data fidelity over the state-of-the-art models by integrating a new form of
table reconstruction as an auxiliary task to boost the generation quality.
Comment: Best Paper Runner-up at INLG 2019 (12th International Conference on
Natural Language Generation)
Meteorologists and Students: A resource for language grounding of geographical descriptors
Deep Graph Convolutional Encoders for Structured Data to Text Generation
Most previous work on neural text generation from graph-structured data
relies on standard sequence-to-sequence methods. These approaches linearise the
input graph to be fed to a recurrent neural network. In this paper, we propose
an alternative encoder based on graph convolutional networks that directly
exploits the input structure. We report results on two graph-to-sequence
datasets that empirically show the benefits of explicitly encoding the input
graph structure.
Comment: INLG 201
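To make concrete what it means to encode the graph structure directly rather than linearise it, the following is a minimal sketch of one generic graph-convolution step in plain Python. It is not the encoder from the paper: the function name `gcn_layer`, the mean aggregation, the undirected edges with self-loops, and the single shared weight matrix are all assumptions chosen for illustration.

```python
def gcn_layer(features, edges, weight):
    """One generic graph-convolution step (illustrative, not the paper's model).

    features: list of per-node feature vectors (lists of floats)
    edges:    list of (u, v) node-index pairs, treated as undirected (assumption)
    weight:   in_dim x out_dim matrix as a list of rows
    """
    n = len(features)
    # Neighbourhood of each node, including the node itself (self-loop).
    neighbours = [{i} for i in range(n)]
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    in_dim = len(weight)
    out_dim = len(weight[0])
    output = []
    for i in range(n):
        # Average the feature vectors over the node's neighbourhood.
        mean = [
            sum(features[j][k] for j in neighbours[i]) / len(neighbours[i])
            for k in range(in_dim)
        ]
        # Linear projection followed by ReLU.
        projected = [
            max(0.0, sum(mean[k] * weight[k][d] for k in range(in_dim)))
            for d in range(out_dim)
        ]
        output.append(projected)
    return output


# Hypothetical three-node chain 0 - 1 - 2 with an identity weight matrix.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
identity = [[1.0, 0.0], [0.0, 1.0]]
hidden = gcn_layer(features, edges, identity)
```

Each node's new representation depends only on its graph neighbours, so no ordering of the nodes has to be chosen, which is the contrast with linearising the graph for a recurrent encoder. Stacking several such layers lets information travel across longer paths in the graph.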