20 research outputs found
Order-Planning Neural Text Generation From Structured Data
Generating texts from structured data (e.g., a table) is important for
various natural language processing tasks such as question answering and dialog
systems. In recent studies, researchers use neural language models and
encoder-decoder frameworks for table-to-text generation. However, these neural
network-based approaches do not model the order of contents during text
generation. When a human writes a summary based on a given table, he or
she would probably consider the content order before wording. For
example, a person's nationality is typically mentioned before their
occupation in a biography. In this paper, we propose an order-planning
text generation model that captures the relationships between different
fields and uses those relationships to make the generated text more
fluent. We conduct experiments on the WikiBio dataset and achieve
significantly higher performance than previous methods in terms of BLEU,
ROUGE, and NIST scores.
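The order-planning idea — scoring which field should be verbalized next — can be sketched with a toy transition ("link") matrix over infobox fields. Everything below (the field set, the scores, and the greedy unrolling) is invented for illustration; the paper's actual model learns such transitions jointly with a neural decoder rather than using a fixed matrix.

```python
# Hypothetical field vocabulary for a biography infobox.
FIELDS = ["name", "nationality", "occupation", "birth_date"]

# A learned "link matrix" LINK[i][j] would score how likely field j is
# mentioned right after field i; the values below are hard-coded purely
# for illustration (higher = more likely to follow).
LINK = [
    [0.0, 0.8, 0.1, 0.1],  # after "name": nationality tends to come next
    [0.1, 0.0, 0.7, 0.2],  # after "nationality": occupation follows
    [0.2, 0.1, 0.0, 0.7],
    [0.5, 0.2, 0.3, 0.0],
]

def plan_order(start_field, n_steps):
    """Greedily unroll the field-transition scores into a content plan."""
    order = [start_field]
    for _ in range(n_steps - 1):
        scores = list(LINK[FIELDS.index(order[-1])])
        for f in order:                      # do not repeat a field
            scores[FIELDS.index(f)] = float("-inf")
        order.append(FIELDS[scores.index(max(scores))])
    return order

print(plan_order("name", 3))  # ['name', 'nationality', 'occupation']
```

With these made-up scores the plan reproduces the nationality-before-occupation ordering the abstract uses as its motivating example.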
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
Multimodal search-based dialogue is a challenging new task: It extends
visually grounded question answering systems into multi-turn conversations with
access to an external database. We address this new challenge by learning a
neural response generation system from the recently released Multimodal
Dialogue (MMD) dataset (Saha et al., 2017). We introduce a knowledge-grounded
multimodal conversational model where an encoded knowledge base (KB)
representation is appended to the decoder input. Our model substantially
outperforms strong baselines in terms of text-based similarity measures
(over 9 BLEU points, 3 of which are solely due to the use of additional
information from the KB).
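One plausible reading of "appending an encoded KB representation to the decoder input" is that the same KB vector is concatenated to the token embedding at every decoding step. The sketch below shows only that input-construction step, with plain lists standing in for tensors; the function name and shapes are assumptions, not the paper's implementation.

```python
def decoder_inputs(token_embeddings, kb_vector):
    """Append the same encoded-KB vector to every decoder-step input.

    token_embeddings: list of per-step embedding vectors (lists of floats)
    kb_vector: the encoded knowledge-base representation
    Returns the concatenated per-step inputs fed to the decoder RNN.
    """
    return [emb + kb_vector for emb in token_embeddings]

steps = decoder_inputs([[0.1, 0.2], [0.3, 0.4]], [9.0])
print(steps)  # [[0.1, 0.2, 9.0], [0.3, 0.4, 9.0]]
```

The decoder RNN then consumes these widened inputs, so KB evidence is available at every generation step rather than only at initialization.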
Table-to-text Generation by Structure-aware Seq2seq Learning
Table-to-text generation aims to generate a description for a factual table
which can be viewed as a set of field-value records. To encode both the content
and the structure of a table, we propose a novel structure-aware seq2seq
architecture which consists of field-gating encoder and description generator
with dual attention. In the encoding phase, we update the cell memory of the
LSTM unit by a field gate and its corresponding field value in order to
incorporate field information into table representation. In the decoding phase,
a dual attention mechanism, which contains word-level and field-level
attention, is proposed to model the semantic relevance between the
generated description and the table. We conduct experiments on the
WIKIBIO dataset, which contains over 700k biographies and corresponding
infoboxes from Wikipedia. Attention visualizations and case studies show
that our model is capable of generating coherent and informative
descriptions based on a comprehensive understanding of both the content
and the structure of a table. Automatic evaluations also show that our
model outperforms the baselines by a large margin. Code for this work is
available at https://github.com/tyliupku/wiki2bio.
Comment: Accepted by AAAI201
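The field-gating idea — letting field information write directly into the LSTM cell memory — can be illustrated with a scalar toy cell. The extra field gate l and the added term l * tanh(z) in the cell update are the point of the sketch; the weights are made-up scalars, and the real model of course operates on vectors, not single numbers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def field_gated_lstm_step(x, z, h_prev, c_prev, w):
    """One step of a field-gating LSTM cell (scalar toy version).

    On top of the standard input/forget/output gates, a field gate l
    lets the field embedding z write directly into the cell memory:
    c = f*c_prev + i*c_hat + l*tanh(z).  `w` is a dict of scalar
    parameters invented here for illustration.
    """
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev)    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev)    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev)    # output gate
    l = sigmoid(w["wl"] * x + w["ul"] * z)         # field gate
    c_hat = math.tanh(w["wc"] * x + w["uc"] * h_prev)
    c = f * c_prev + i * c_hat + l * math.tanh(z)  # field info enters the cell
    h = o * math.tanh(c)
    return h, c

w = {k: 0.5 for k in ["wi", "ui", "wf", "uf", "wo", "uo", "wl", "ul", "wc", "uc"]}
h, c = field_gated_lstm_step(x=1.0, z=0.3, h_prev=0.0, c_prev=0.0, w=w)
print(round(h, 4), round(c, 4))
```

Compared with a vanilla LSTM, a nonzero field embedding z raises the cell state through the gated tanh(z) term, which is how field information reaches the table representation.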
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation
A type description is a succinct noun compound that helps humans and
machines quickly grasp the distinctive information about an entity.
Entities in most knowledge graphs (KGs) still lack such descriptions, thus
calling for automatic methods to supplement such information. However, existing
generative methods either overlook the grammatical structure or make factual
mistakes in generated texts. To solve these problems, we propose a
head-modifier template-based method to ensure the readability and data fidelity
of generated type descriptions. We also propose a new dataset and two automatic
metrics for this task. Experiments show that our method improves
substantially over baselines and achieves state-of-the-art performance
on both datasets.
Comment: ACL 201
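At its simplest, a head-modifier template renders a type description as a sequence of modifiers followed by a head noun. The one-liner below shows only that surface form; the paper's actual contribution is a neural model that predicts the head and modifier roles, which is not reproduced here, and the example entity is hypothetical.

```python
def render_type_description(head, modifiers):
    """Render a head-modifier template as '<modifiers...> <head>'.

    e.g. head='pianist', modifiers=['American', 'jazz']
    -> 'American jazz pianist'
    (hypothetical example, not from the paper's dataset).
    """
    return " ".join(modifiers + [head])

print(render_type_description("pianist", ["American", "jazz"]))  # American jazz pianist
```

Constraining generation to this template is what lets the method guarantee a grammatical noun compound while the model focuses on getting the facts right.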
PaperRobot: Incremental Draft Generation of Scientific Ideas
We present PaperRobot, which acts as an automatic research assistant by
(1) conducting deep understanding of a large collection of human-written papers
in a target domain and constructing comprehensive background knowledge graphs
(KGs); (2) creating new ideas by predicting links from the background
KGs, combining graph attention and contextual text attention; (3) incrementally
writing some key elements of a new paper based on memory-attention networks:
from the input title along with predicted related entities to generate a paper
abstract, from the abstract to generate conclusion and future work, and finally
from future work to generate a title for a follow-on paper. Turing
Tests, in which a biomedical domain expert is asked to compare a system
output and a human-authored string, show that PaperRobot-generated
abstracts, conclusion and future work sections, and new titles are
chosen over human-written ones up to 30%, 24%, and 12% of the time,
respectively.
Comment: 12 pages. Accepted by ACL 2019. Code and resources are
available at https://github.com/EagleW/PaperRobo
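To make "creating new ideas by predicting links from the background KGs" concrete, here is a classical common-neighbour link-prediction heuristic over a toy graph. This deliberately stands in for the paper's graph-attention predictor, which is far more involved; the graph and the scoring rule are illustrative only.

```python
def common_neighbor_scores(graph, node):
    """Score candidate new links for `node` by counting shared
    neighbours, a classical link-prediction heuristic standing in for
    PaperRobot's graph-attention link predictor.

    graph: dict mapping each node to its set of neighbours.
    Returns {candidate: score} for nodes not already linked to `node`.
    """
    scores = {}
    for other in graph:
        if other == node or other in graph[node]:
            continue  # skip the node itself and existing links
        scores[other] = len(graph[node] & graph[other])
    return scores

kg = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(common_neighbor_scores(kg, "a"))  # {'d': 1}
```

High-scoring absent edges play the role of "new ideas": entity pairs the background KG suggests should be related even though no paper has connected them yet.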
Text Assisted Insight Ranking Using Context-Aware Memory Network
Extracting valuable facts or informative summaries from multi-dimensional
tables, i.e. insight mining, is an important task in data analysis and business
intelligence. However, ranking the importance of insights remains a challenging
and unexplored task. The main challenge is that explicitly scoring or
ranking an insight requires a thorough understanding of the tables and
costs substantial manual effort, which leads to a lack of available
training data for the insight ranking problem. In this paper, we propose
an insight ranking model that consists of two parts: a neural ranking
model explores the data
characteristics, such as the header semantics and the data statistical
features, and a memory network model introduces table structure and context
information into the ranking process. We also build a dataset with text
assistance. Experimental results show that our approach substantially
improves ranking precision across multiple evaluation metrics.
Comment: Accepted to AAAI 201
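A minimal, purely illustrative version of context-aware scoring: attend over memory slots (standing in for the table-structure and context vectors) with dot-product attention, pool them into a context vector, and add its alignment with the insight to a base score from the neural ranking model. The vector sizes, the attention form, and the additive combination are all assumptions, not the paper's architecture.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def context_aware_score(insight_vec, memory, base_score):
    """Attend over memory slots, pool them into a context vector, and
    add its alignment with the insight to the base ranking score."""
    attn = softmax([dot(insight_vec, m) for m in memory])
    context = [sum(w * m[i] for w, m in zip(attn, memory))
               for i in range(len(insight_vec))]
    return base_score + dot(insight_vec, context)

score = context_aware_score([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 0.0)
print(round(score, 3))
```

The memory term rewards insights that cohere with the table's structure and surrounding context, which is the intuition the abstract describes.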
Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation
In this paper, we focus on a new practical task, document-scale text content
manipulation, which is the opposite of text style transfer and aims to preserve
text styles while altering the content. In detail, the input is a set of
structured records and a reference text for describing another recordset. The
output is a summary that accurately describes the partial content in the
source recordset in the same writing style as the reference. The task is
unsupervised due to the lack of parallel data, and it is challenging to
select suitable records and style words from the two aspects of the
input and to generate a high-fidelity long document. To tackle these
problems, we first
build a dataset based on a basketball game report corpus as our testbed, and
present an unsupervised neural model with interactive attention mechanism,
which is used for learning the semantic relationship between records and
reference texts to achieve better content transfer and better style
preservation. In addition, we also explore the effectiveness of
back-translation for constructing pseudo-training pairs in our task.
Empirical results show the superiority of our approach over competitive
methods, and our models also yield a new state-of-the-art result on a
sentence-level dataset.
Comment: accepted by AAAI202
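The record-selection side of interactive attention can be caricatured as: keep a record when it aligns strongly with some token of the reference text. The dot-product matching and the fixed threshold below are toy choices; the paper learns this record-to-reference alignment with an attention mechanism rather than a hand-set rule.

```python
def select_records(record_vecs, reference_vecs, threshold=0.5):
    """Keep indices of records whose best dot-product match against
    any reference-token vector clears the threshold (toy stand-in for
    learned interactive attention between records and reference)."""
    keep = []
    for i, rec in enumerate(record_vecs):
        best = max(sum(a * b for a, b in zip(rec, tok))
                   for tok in reference_vecs)
        if best >= threshold:
            keep.append(i)
    return keep

print(select_records([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1]]))  # [0]
```

In the full task the selected records feed content into the generator while the reference contributes the wording, giving content transfer with style preservation.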