134 research outputs found
Abstract Meaning Representation for Multi-Document Summarization
Generating an abstract from a collection of documents is a desirable
capability for many real-world applications. However, abstractive approaches to
multi-document summarization have not been thoroughly investigated. This paper
studies the feasibility of using Abstract Meaning Representation (AMR), a
semantic representation of natural language grounded in linguistic theory, as a
form of content representation. Our approach condenses source documents to a
set of summary graphs following the AMR formalism. The summary graphs are then
transformed to a set of summary sentences in a surface realization step. The
framework is fully data-driven and flexible. Each component can be optimized
independently using small-scale, in-domain training data. We perform
experiments on benchmark summarization datasets and report promising results.
We also describe opportunities and challenges for advancing this line of
research.
Comment: 13 pages
Controlling the Amount of Verbatim Copying in Abstractive Summarization
An abstract must not change the meaning of the original text. The single most
effective way to achieve that is to increase the amount of copying while still
allowing for text abstraction. Human editors can usually exercise control over
copying, resulting in summaries that are more extractive than abstractive, or
vice versa. However, it remains poorly understood whether modern neural
abstractive summarizers can provide the same flexibility, i.e., learning from
single reference summaries to generate multiple summary hypotheses with varying
degrees of copying. In this paper, we present a neural summarization model
that, by learning from single human abstracts, can produce a broad spectrum of
summaries ranging from purely extractive to highly generative ones. We frame
the task of summarization as language modeling and exploit alternative
mechanisms to generate summary hypotheses. Our method allows for control over
copying during both training and decoding stages of a neural summarization
model. Through extensive experiments we illustrate the significance of our
proposed method on controlling the amount of verbatim copying and achieve
competitive results over strong baselines. Our analysis further reveals
interesting and non-obvious facts.
Comment: AAAI 2020 (Main Technical Track)
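As a rough illustration of what "amount of verbatim copying" can mean, the sketch below scores a summary by the fraction of its n-grams that also appear in the source. This is not the model or the evaluation from the paper; the function names `ngrams` and `copy_rate`, whitespace tokenization, and the default of bigrams are all assumptions made for the example.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def copy_rate(source, summary, n=2):
    """Fraction of summary n-grams that also occur verbatim in the source.

    Assumption: lowercased whitespace tokenization stands in for a real tokenizer.
    """
    source_ngrams = set(ngrams(source.lower().split(), n))
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens: nothing to compare
    copied = sum(gram in source_ngrams for gram in summary_ngrams)
    return copied / len(summary_ngrams)


source = "the cat sat on the mat"
extractive = copy_rate(source, "the cat sat")
abstractive = copy_rate(source, "a feline rested")
mixed = copy_rate(source, "the cat rested on the mat")
```

Under this measure a purely extractive summary scores 1.0 and a fully rephrased one scores 0.0, with mixed summaries falling in between.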
Self-Supervised and Controlled Multi-Document Opinion Summarization
We address the problem of unsupervised abstractive summarization of
collections of user generated reviews with self-supervision and control. We
propose a self-supervised setup that considers an individual document as a
target summary for a set of similar documents. This setting makes training
simpler than previous approaches by relying only on standard log-likelihood
loss. We address the problem of hallucinations through the use of control
codes, to steer the generation towards more coherent and relevant
summaries. Finally, we extend the Transformer architecture to allow for multiple
reviews as input. Our benchmarks on two datasets against graph-based and recent
neural abstractive unsupervised models show that our proposed method generates
summaries of superior quality and relevance. This is confirmed in our human
evaluation, which focuses explicitly on the faithfulness of generated summaries.
We also provide an ablation study, which shows the importance of the control
setup in controlling hallucinations and achieving high sentiment and topic
alignment of the summaries with the input reviews.
Comment: 18 pages including 5 pages appendix
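The self-supervised setup described above, in which one review serves as the target summary for a set of similar reviews, might be sketched as follows. The abstract does not say how similarity is computed, so the Jaccard token overlap used here, the helper names `jaccard` and `build_pairs`, and the parameter `k` are all hypothetical choices for illustration only.

```python
def jaccard(a, b):
    """Jaccard overlap between the token sets of two texts (assumed similarity measure)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def build_pairs(reviews, k=2):
    """Build (input_reviews, target) training pairs.

    Each review in turn is treated as the pseudo-summary (target), and its
    k most similar other reviews become the multi-document input.
    """
    pairs = []
    for i, target in enumerate(reviews):
        others = [r for j, r in enumerate(reviews) if j != i]
        others.sort(key=lambda r: jaccard(target, r), reverse=True)
        pairs.append((others[:k], target))
    return pairs


reviews = [
    "great food and service",
    "great food",
    "terrible parking lot",
    "food and service were great",
]
pairs = build_pairs(reviews, k=2)
```

Each resulting pair could then be fed to a sequence-to-sequence model trained with a standard log-likelihood loss, as the abstract describes; the control codes and the multi-input Transformer extension are not covered by this sketch.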