Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation
Unlike other sequential data, sentences in natural language are
structured by linguistic grammars. Previous generative conversational models
with a chain-structured decoder ignore this structure in human language and may
generate plausible responses with less satisfactory relevance and fluency. In
this study, we aim to incorporate the results from linguistic analysis into the
process of sentence generation for high-quality conversation generation.
Specifically, we use a dependency parser to transform each response sentence
into a dependency tree and construct a training corpus of sentence-tree pairs.
A tree-structured decoder is developed to learn the mapping from a sentence to
its tree, where different types of hidden states are used to depict the local
dependencies from an internal tree node to its children. For training
acceleration, we propose a tree canonicalization method, which transforms trees
into equivalent ternary trees. Then, with a proposed tree-structured search
method, the model is able to generate the most probable responses in the form
of dependency trees, which are finally flattened into sequences as the system
output. Experimental results demonstrate that the proposed X2Tree framework
outperforms baseline methods, with an increase of over 11.15% in acceptance ratio.
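
The final flattening step described above can be illustrated with a minimal sketch. The abstract does not specify the tree data structure, so the `Node` class, its `left`/`right` child lists, and the `flatten` function below are hypothetical names chosen for illustration; the sketch simply assumes each head stores its dependents in surface order on either side and reads the sentence off with an in-order traversal. It is not the X2Tree implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical dependency-tree node: a head word plus ordered dependents."""
    word: str
    left: List["Node"] = field(default_factory=list)   # dependents before the head
    right: List["Node"] = field(default_factory=list)  # dependents after the head


def flatten(node: Node) -> List[str]:
    """Read the words off in surface order: left dependents, head, right dependents."""
    tokens: List[str] = []
    for child in node.left:
        tokens.extend(flatten(child))
    tokens.append(node.word)
    for child in node.right:
        tokens.extend(flatten(child))
    return tokens


# Toy tree: "like" is the root, "I" its left dependent, "tea" its right
# dependent, and "green" a left dependent of "tea".
tree = Node("like", left=[Node("I")], right=[Node("tea", left=[Node("green")])])
sentence = " ".join(flatten(tree))
print(sentence)  # I like green tea
```

This traversal recovers the original word order only when the tree is projective and dependents are kept in order, which is an assumption of the sketch rather than a claim from the abstract.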
Table-to-text Generation by Structure-aware Seq2seq Learning
Table-to-text generation aims to generate a description for a factual table
which can be viewed as a set of field-value records. To encode both the content
and the structure of a table, we propose a novel structure-aware seq2seq
architecture which consists of a field-gating encoder and a description generator
with dual attention. In the encoding phase, we update the cell memory of the
LSTM unit by a field gate and its corresponding field value in order to
incorporate field information into the table representation. In the decoding phase,
a dual attention mechanism, which contains word-level attention and field-level
attention, is proposed to model the semantic relevance between the generated
description and the table. We conduct experiments on the WIKIBIO
dataset which contains over 700k biographies and corresponding infoboxes from
Wikipedia. The attention visualizations and case studies show that our model is
capable of generating coherent and informative descriptions based on the
comprehensive understanding of both the content and the structure of a table.
Automatic evaluations also show that our model outperforms the baselines by a large
margin. Code for this work is available at
https://github.com/tyliupku/wiki2bio
Comment: Accepted by AAAI201
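
One way to picture the dual attention idea is the sketch below. The abstract only says that word-level and field-level attention are combined; the specific choice here (softmax each set of scores, multiply them elementwise, renormalize) is an assumption made for illustration, and the names `softmax` and `dual_attention` are hypothetical, not taken from the wiki2bio code.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dual_attention(word_scores: List[float], field_scores: List[float]) -> List[float]:
    """Combine word-level and field-level attention over the same table positions.

    Assumed combination rule: elementwise product of the two distributions,
    renormalized so the result sums to 1.
    """
    alpha = softmax(word_scores)   # word-level attention
    beta = softmax(field_scores)   # field-level attention
    joint = [a * b for a, b in zip(alpha, beta)]
    total = sum(joint)
    return [j / total for j in joint]


# Three table positions: the word scores favor the first position,
# the field scores favor the last one.
weights = dual_attention([2.0, 0.5, 0.1], [1.0, 1.0, 3.0])
print(weights)
```

Under this rule a position receives a high weight only when both its word content and its field are relevant; when the field scores are uniform, the result reduces to plain word-level attention.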
Abstract Meaning Representation for Multi-Document Summarization
Generating an abstract from a collection of documents is a desirable
capability for many real-world applications. However, abstractive approaches to
multi-document summarization have not been thoroughly investigated. This paper
studies the feasibility of using Abstract Meaning Representation (AMR), a
semantic representation of natural language grounded in linguistic theory, as a
form of content representation. Our approach condenses source documents to a
set of summary graphs following the AMR formalism. The summary graphs are then
transformed to a set of summary sentences in a surface realization step. The
framework is fully data-driven and flexible. Each component can be optimized
independently using small-scale, in-domain training data. We perform
experiments on benchmark summarization datasets and report promising results.
We also describe opportunities and challenges for advancing this line of
research.
Comment: 13 page