Data-to-Text Generation with Content Selection and Planning

Author: Dong Li
Lapata Mirella
Puduppully Ratish
Publication date: 12/04/2019

Recent advances in data-to-text generation have led to the use of large-scale datasets and neural network models which are trained end-to-end, without explicitly modeling what to say and in what order. In this work, we present a neural network architecture which incorporates content selection and planning without sacrificing end-to-end training. We decompose the generation task into two stages. Given a corpus of data records (paired with descriptive documents), we first generate a content plan highlighting which information should be mentioned and in which order and then generate the document while taking the content plan into account. Automatic and human-based evaluation experiments show that our model outperforms strong baselines, improving the state-of-the-art on the recently released RotoWire dataset. Comment: Added link to cod

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Learning to Select, Track, and Generate for Data-to-Text

Author: Aramaki Eiji
Ishigaki Tatsuya
Iso Hayate
Kobayashi Ichiro
Miyao Yusuke
Noji Hiroshi
Okazaki Naoaki
Takamura Hiroya
Uehara Yui
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019

We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. Our tracking module selects and keeps track of salient information and memorizes which record has been mentioned. Our generation module generates a summary conditioned on the state of the tracking module. Our model is considered to simulate the human-like writing process that gradually selects the information by determining the intermediate variables while writing the summary. In addition, we also explore the effectiveness of the writer information for generation. Experimental results show that our model outperforms existing models in all evaluation metrics even without writer information. Incorporating writer information further improves the performance, contributing to content planning and surface realization. Comment: ACL 201

arXiv.org e-Print Archive

Crossref

Sentence-Level Content Planning and Style Specification for Neural Text Generation

Author: Hua Xinyu
Wang Lu
Publication date: 01/01/2019

Building effective text generation systems requires three critical components: content selection, text planning, and surface realization, and traditionally they are tackled as separate problems. Recent all-in-one style neural generation models have made impressive progress, yet they often produce outputs that are incoherent and unfaithful to the input. To address these issues, we present an end-to-end trained two-step generation model, where a sentence-level content planner first decides on the keyphrases to cover as well as a desired language style, followed by a surface realization decoder that generates relevant and coherent text. For experiments, we consider three tasks from domains with diverse topics and varying language styles: persuasive argument construction from Reddit, paragraph generation for normal and simple versions of Wikipedia, and abstract generation for scientific articles. Automatic evaluation shows that our system can significantly outperform competitive comparisons. Human judges further rate our system-generated text as more fluent and correct, compared to the generations by its variants that do not consider language style. Comment: Accepted as a long paper to EMNLP 201

arXiv.org e-Print Archive

Crossref

From Data to Speech: A Generic Approach

Author: de Pijper J.R.
Klabbers E.
Krahmer E.J.
Odijk J.
Theune M.
Publication date: 01/01/2001

Tilburg University Repository

Automatic Generation of Sports News

Author: João Pinto Barbosa Machado Aires
Publication date: 08/07/2016

In this dissertation, a natural language generation system was developed that, given the data from a particular football match, can automatically create a news article recapping that match.

Repositório Aberto da Universidade do Porto

Reactive Content Selection in the Generation of Real-time Soccer Commentary

Author: Itsuki Noda
Kumiko Tanaka-ishii
Kôiti HASIDA
Publication date: 01/01/1998

MIKE is an automatic commentary system that generates a commentary of a simulated soccer game in English, French, or Japanese. One of the major technical challenges..

CiteSeerX

Crossref

The role of terminology and local grammar in video annotation

Author: Al-Athel Mohammed S
Publication date: 01/01/2008

The linguistic annotation of video sequences is an intellectually challenging task involving the investigation of how images and words are linked together, a task that is ultimately financially rewarding in that the eventual automatic retrieval of video (sequences) can be much less time consuming, subjective and expensive than when retrieved manually. Much effort has been focused on automatic or semi-automatic annotation. Computational linguistic methods of video annotation rely on collections of collateral text in the form of keywords and proper nouns. Keywords are often used in a particular order indicating an identifiable pattern which is often limited and can subsequently be used to annotate the portion of a video where such a pattern occurred. Once the relevant keywords and patterns have been stored, they can then be used to annotate the remainder of the video, excluding all collateral text which does not match the keywords or patterns. A new method of video annotation is presented in this thesis. The method facilitates a) annotation extraction of specialist terms within a corpus of collateral text; b) annotation identification of frequently used linguistic patterns to use in repeating key events within the data-set. The use of the method has led to the development of a system that can automatically assign key words and key patterns to a number of frames that are found in the commentary text approximately contemporaneous to the selected number of frames. The system does not perform video analysis; it only analyses the collateral text. The method is based on corpus linguistics and is mainly frequency based: frequency of occurrence of a key word or key pattern is taken as the basis of its representation. No assumptions are made about the grammatical structure of the language used in the collateral text, neither is a lexicon of key words refined.
Our system has been designed to annotate videos of football matches in English and Arabic, and also cricket videos in English. The system has also been designed to retrieve annotated clips. The system not only provides a simple search method for retrieving annotated clips; it also provides more complex, advanced search methods.

EThOS - Electronic Theses Online Service

Surrey Research Insight

OpenGrey Repository