20 research outputs found
Order-Planning Neural Text Generation From Structured Data
Generating texts from structured data (e.g., a table) is important for
various natural language processing tasks such as question answering and dialog
systems. In recent studies, researchers use neural language models and
encoder-decoder frameworks for table-to-text generation. However, these neural
network-based approaches do not model the order of contents during text
generation. When a human writes a summary based on a given table, he or
she would probably consider the content order before wording. For
example, a person's nationality is typically mentioned before their
occupation in a biography. In this paper, we propose an order-planning
text generation model that captures the relationships between different
fields and uses those relationships to make the generated text more
fluent. We conduct experiments on the WikiBio dataset and achieve
significantly higher performance than previous methods in terms of BLEU,
ROUGE, and NIST scores.
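The order-planning idea — scoring which field should be verbalized next — can be sketched with a toy transition ("link") matrix over infobox fields. Everything below (the field set, the scores, and the greedy unrolling) is invented for illustration; the paper's actual model learns such transitions jointly with a neural decoder rather than using a fixed matrix.

```python
# Hypothetical field vocabulary for a biography infobox.
FIELDS = ["name", "nationality", "occupation", "birth_date"]

# A learned "link matrix" LINK[i][j] would score how likely field j is
# mentioned right after field i; the values below are hard-coded purely
# for illustration (higher = more likely to follow).
LINK = [
    [0.0, 0.8, 0.1, 0.1],  # after "name": nationality tends to come next
    [0.1, 0.0, 0.7, 0.2],  # after "nationality": occupation follows
    [0.2, 0.1, 0.0, 0.7],
    [0.5, 0.2, 0.3, 0.0],
]

def plan_order(start_field, n_steps):
    """Greedily unroll the field-transition scores into a content plan."""
    order = [start_field]
    for _ in range(n_steps - 1):
        scores = list(LINK[FIELDS.index(order[-1])])
        for f in order:                      # do not repeat a field
            scores[FIELDS.index(f)] = float("-inf")
        order.append(FIELDS[scores.index(max(scores))])
    return order

print(plan_order("name", 3))  # ['name', 'nationality', 'occupation']
```

With these made-up scores the plan reproduces the nationality-before-occupation ordering the abstract uses as its motivating example.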
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
Multimodal search-based dialogue is a challenging new task: It extends
visually grounded question answering systems into multi-turn conversations with
access to an external database. We address this new challenge by learning a
neural response generation system from the recently released Multimodal
Dialogue (MMD) dataset (Saha et al., 2017). We introduce a knowledge-grounded
multimodal conversational model where an encoded knowledge base (KB)
representation is appended to the decoder input. Our model substantially
outperforms strong baselines in terms of text-based similarity measures
(over 9 BLEU points, 3 of which are solely due to the use of additional
information from the KB).
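One plausible reading of "appending an encoded KB representation to the decoder input" is that the same KB vector is concatenated to the token embedding at every decoding step. The sketch below shows only that input-construction step, with plain lists standing in for tensors; the function name and shapes are assumptions, not the paper's implementation.

```python
def decoder_inputs(token_embeddings, kb_vector):
    """Append the same encoded-KB vector to every decoder-step input.

    token_embeddings: list of per-step embedding vectors (lists of floats)
    kb_vector: the encoded knowledge-base representation
    Returns the concatenated per-step inputs fed to the decoder RNN.
    """
    return [emb + kb_vector for emb in token_embeddings]

steps = decoder_inputs([[0.1, 0.2], [0.3, 0.4]], [9.0])
print(steps)  # [[0.1, 0.2, 9.0], [0.3, 0.4, 9.0]]
```

The decoder RNN then consumes these widened inputs, so KB evidence is available at every generation step rather than only at initialization.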
Table-to-text Generation by Structure-aware Seq2seq Learning
Table-to-text generation aims to generate a description for a factual table
which can be viewed as a set of field-value records. To encode both the content
and the structure of a table, we propose a novel structure-aware seq2seq
architecture which consists of field-gating encoder and description generator
with dual attention. In the encoding phase, we update the cell memory of the
LSTM unit by a field gate and its corresponding field value in order to
incorporate field information into table representation. In the decoding phase,
a dual attention mechanism, which contains word-level and field-level
attention, is proposed to model the semantic relevance between the
generated description and the table. We conduct experiments on the
WIKIBIO dataset, which contains over 700k biographies and corresponding
infoboxes from Wikipedia. Attention visualizations and case studies show
that our model is capable of generating coherent and informative
descriptions based on a comprehensive understanding of both the content
and the structure of a table. Automatic evaluations also show that our
model outperforms the baselines by a large margin. Code for this work is
available at https://github.com/tyliupku/wiki2bio.
Comment: Accepted by AAAI201
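The field-gating idea — letting field information write directly into the LSTM cell memory — can be illustrated with a scalar toy cell. The extra field gate l and the added term l * tanh(z) in the cell update are the point of the sketch; the weights are made-up scalars, and the real model of course operates on vectors, not single numbers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def field_gated_lstm_step(x, z, h_prev, c_prev, w):
    """One step of a field-gating LSTM cell (scalar toy version).

    On top of the standard input/forget/output gates, a field gate l
    lets the field embedding z write directly into the cell memory:
    c = f*c_prev + i*c_hat + l*tanh(z).  `w` is a dict of scalar
    parameters invented here for illustration.
    """
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev)    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev)    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev)    # output gate
    l = sigmoid(w["wl"] * x + w["ul"] * z)         # field gate
    c_hat = math.tanh(w["wc"] * x + w["uc"] * h_prev)
    c = f * c_prev + i * c_hat + l * math.tanh(z)  # field info enters the cell
    h = o * math.tanh(c)
    return h, c

w = {k: 0.5 for k in ["wi", "ui", "wf", "uf", "wo", "uo", "wl", "ul", "wc", "uc"]}
h, c = field_gated_lstm_step(x=1.0, z=0.3, h_prev=0.0, c_prev=0.0, w=w)
print(round(h, 4), round(c, 4))
```

Compared with a vanilla LSTM, a nonzero field embedding z raises the cell state through the gated tanh(z) term, which is how field information reaches the table representation.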
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation
A type description is a succinct noun compound that helps humans and
machines quickly grasp the distinctive information about an entity.
Entities in most knowledge graphs (KGs) still lack such descriptions, thus
calling for automatic methods to supplement such information. However, existing
generative methods either overlook the grammatical structure or make factual
mistakes in generated texts. To solve these problems, we propose a
head-modifier template-based method to ensure the readability and data fidelity
of generated type descriptions. We also propose a new dataset and two automatic
metrics for this task. Experiments show that our method improves
substantially over baselines and achieves state-of-the-art performance
on both datasets.
Comment: ACL 201
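At its simplest, a head-modifier template renders a type description as a sequence of modifiers followed by a head noun. The one-liner below shows only that surface form; the paper's actual contribution is a neural model that predicts the head and modifier roles, which is not reproduced here, and the example entity is hypothetical.

```python
def render_type_description(head, modifiers):
    """Render a head-modifier template as '<modifiers...> <head>'.

    e.g. head='pianist', modifiers=['American', 'jazz']
    -> 'American jazz pianist'
    (hypothetical example, not from the paper's dataset).
    """
    return " ".join(modifiers + [head])

print(render_type_description("pianist", ["American", "jazz"]))  # American jazz pianist
```

Constraining generation to this template is what lets the method guarantee a grammatical noun compound while the model focuses on getting the facts right.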
PaperRobot: Incremental Draft Generation of Scientific Ideas
We present PaperRobot, which acts as an automatic research assistant by
(1) conducting deep understanding of a large collection of human-written papers
in a target domain and constructing comprehensive background knowledge graphs
(KGs); (2) creating new ideas by predicting links from the background
KGs, combining graph attention and contextual text attention; (3) incrementally
writing some key elements of a new paper based on memory-attention networks:
from the input title along with predicted related entities to generate a paper
abstract, from the abstract to generate conclusion and future work, and finally
from future work to generate a title for a follow-on paper. Turing
Tests, in which a biomedical domain expert is asked to compare a system
output and a human-authored string, show that PaperRobot-generated
abstracts, conclusion and future work sections, and new titles are
chosen over human-written ones up to 30%, 24%, and 12% of the time,
respectively.
Comment: 12 pages. Accepted by ACL 2019. Code and resources are
available at https://github.com/EagleW/PaperRobo
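To make "creating new ideas by predicting links from the background KGs" concrete, here is a classical common-neighbour link-prediction heuristic over a toy graph. This deliberately stands in for the paper's graph-attention predictor, which is far more involved; the graph and the scoring rule are illustrative only.

```python
def common_neighbor_scores(graph, node):
    """Score candidate new links for `node` by counting shared
    neighbours, a classical link-prediction heuristic standing in for
    PaperRobot's graph-attention link predictor.

    graph: dict mapping each node to its set of neighbours.
    Returns {candidate: score} for nodes not already linked to `node`.
    """
    scores = {}
    for other in graph:
        if other == node or other in graph[node]:
            continue  # skip the node itself and existing links
        scores[other] = len(graph[node] & graph[other])
    return scores

kg = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(common_neighbor_scores(kg, "a"))  # {'d': 1}
```

High-scoring absent edges play the role of "new ideas": entity pairs the background KG suggests should be related even though no paper has connected them yet.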
Text Assisted Insight Ranking Using Context-Aware Memory Network
Extracting valuable facts or informative summaries from multi-dimensional
tables, i.e. insight mining, is an important task in data analysis and business
intelligence. However, ranking the importance of insights remains a challenging
and unexplored task. The main challenge is that explicitly scoring or
ranking an insight requires a thorough understanding of the tables and
costs substantial manual effort, which leads to a lack of available
training data for the insight ranking problem. In this paper, we propose
an insight ranking model that consists of two parts: a neural ranking
model explores the data
characteristics, such as the header semantics and the data statistical
features, and a memory network model introduces table structure and context
information into the ranking process. We also build a dataset with text
assistance. Experimental results show that our approach substantially
improves ranking precision across multiple evaluation metrics.
Comment: Accepted to AAAI 201
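A minimal, purely illustrative version of context-aware scoring: attend over memory slots (standing in for the table-structure and context vectors) with dot-product attention, pool them into a context vector, and add its alignment with the insight to a base score from the neural ranking model. The vector sizes, the attention form, and the additive combination are all assumptions, not the paper's architecture.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def context_aware_score(insight_vec, memory, base_score):
    """Attend over memory slots, pool them into a context vector, and
    add its alignment with the insight to the base ranking score."""
    attn = softmax([dot(insight_vec, m) for m in memory])
    context = [sum(w * m[i] for w, m in zip(attn, memory))
               for i in range(len(insight_vec))]
    return base_score + dot(insight_vec, context)

score = context_aware_score([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.0)
print(round(score, 3))
```

The memory term rewards insights that cohere with the table's structure and surrounding context, which is the intuition the abstract describes.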
Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation
In this paper, we focus on a new practical task, document-scale text content
manipulation, which is the opposite of text style transfer and aims to preserve
text styles while altering the content. In detail, the input is a set of
structured records and a reference text for describing another recordset. The
output is a summary that accurately describes the partial content in the
source recordset in the same writing style as the reference. The task is
unsupervised due to the lack of parallel data, and it is challenging to
select suitable records and style words from the two aspects of the
input and to generate a high-fidelity long document. To tackle these
problems, we first
build a dataset based on a basketball game report corpus as our testbed, and
present an unsupervised neural model with interactive attention mechanism,
which is used for learning the semantic relationship between records and
reference texts to achieve better content transfer and better style
preservation. In addition, we also explore the effectiveness of
back-translation for constructing pseudo-training pairs in our task.
Empirical results show the superiority of our approach over competitive
methods, and our models also yield a new state-of-the-art result on a
sentence-level dataset.
Comment: accepted by AAAI202
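The record-selection side of interactive attention can be caricatured as: keep a record when it aligns strongly with some token of the reference text. The dot-product matching and the fixed threshold below are toy choices; the paper learns this record-to-reference alignment with an attention mechanism rather than a hand-set rule.

```python
def select_records(record_vecs, reference_vecs, threshold=0.5):
    """Keep indices of records whose best dot-product match against
    any reference-token vector clears the threshold (toy stand-in for
    learned interactive attention between records and reference)."""
    keep = []
    for i, rec in enumerate(record_vecs):
        best = max(sum(a * b for a, b in zip(rec, tok))
                   for tok in reference_vecs)
        if best >= threshold:
            keep.append(i)
    return keep

print(select_records([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1]]))  # [0]
```

In the full task the selected records feed content into the generator while the reference contributes the wording, giving content transfer with style preservation.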