Search CORE

30,679 research outputs found

Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

Author: Gatt Albert
Krahmer Emiel
Publication venue
Publication date: 01/01/2017
Field of study

This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 tabl

arXiv.org e-Print Archive

OAR@UM

Tilburg University Repository

Unified Pragmatic Models for Generating and Following Instructions

Author: Andreas Jacob
Fried Daniel
Klein Dan
Publication venue
Publication date: 01/01/2018
Field of study

We show that explicit pragmatic inference aids in correctly generating and following natural language instructions for complex, sequential tasks. Our pragmatics-enabled models reason about why speakers produce certain instructions, and about how listeners will react upon hearing them. Like previous pragmatic models, we use learned base listener and speaker models to build a pragmatic speaker that uses the base listener to simulate the interpretation of candidate descriptions, and a pragmatic listener that reasons counterfactually about alternative descriptions. We extend these models to tasks with sequential structure. Evaluation of language generation and interpretation shows that pragmatic inference improves state-of-the-art listener models (at correctly interpreting human instructions) and speaker models (at producing instructions correctly interpreted by humans) in diverse settings.Comment: NAACL 2018, camera-ready versio

arXiv.org e-Print Archive

Crossref

Recommended from our members

Hyper-Document structure: maintaining discourse coherence in non-linear documents

Author: Mancini Clara
Scott Donia
Publication venue
Publication date: 01/01/2006
Field of study

The passage from linear text to hypertext poses the challenge of expressing discourse coherence in non-linear text, where linguistic discourse markers no longer work. While hypertext introduces new possibilities for discourse organisation, it also requires the use of new devices which can support the expression of coherence by exploiting the technical characteristics and expressive capabilities of the medium. In this paper we show how in hypertext the notion of abstract document structure encompasses animated graphics as a form of meta-language for discourse construction

Open Research Online (The Open University)

Learning a Recurrent Visual Representation for Image Caption Generation

Author: Chen Xinlei
Zitnick C. Lawrence
Publication venue
Publication date: 20/11/2014
Field of study

In this paper we explore the bi-directional mapping between images and their sentence-based descriptions. We propose learning this mapping using a recurrent neural network. Unlike previous approaches that map both sentences and images to a common embedding, we enable the generation of novel sentences given an image. Using the same model, we can also reconstruct the visual features associated with an image given its visual description. We use a novel recurrent visual memory that automatically learns to remember long-term visual concepts to aid in both sentence generation and visual feature reconstruction. We evaluate our approach on several tasks. These include sentence generation, sentence retrieval and image retrieval. State-of-the-art results are shown for the task of generating novel image descriptions. When compared to human generated captions, our automatically generated captions are preferred by humans over

19.8\%

of the time. Results are better than or comparable to state-of-the-art results on the image and sentence retrieval tasks for methods using similar visual features

arXiv.org e-Print Archive

Crossref

Doubly-Attentive Decoder for Multi-modal Neural Machine Translation

Author: Calixto Iacer
Campbell Nick
Liu Qun
Publication venue
Publication date: 01/01/2017
Field of study

We introduce a Multi-modal Neural Machine Translation model in which a doubly-attentive decoder naturally incorporates spatial visual features obtained using pre-trained convolutional neural networks, bridging the gap between image description and translation. Our decoder learns to attend to source-language words and parts of an image independently by means of two separate attention mechanisms as it generates words in the target language. We find that our model can efficiently exploit not just back-translated in-domain multi-modal data but also large general-domain text-only MT corpora. We also report state-of-the-art results on the Multi30k data set.Comment: 8 pages (11 including references), 2 figure

arXiv.org e-Print Archive

Crossref