4,659 research outputs found
Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization
Generating a text abstract from a set of documents remains a challenging
task. The neural encoder-decoder framework has recently been exploited to
summarize single documents, but its success can in part be attributed to the
availability of large parallel data automatically acquired from the Web. In
contrast, parallel data for multi-document summarization are scarce and costly
to obtain. There is a pressing need to adapt an encoder-decoder model trained
on single-document summarization data to work with multiple-document input. In
this paper, we present an initial investigation into a novel adaptation method.
It exploits the maximal marginal relevance method to select representative
sentences from multi-document input, and leverages an abstractive
encoder-decoder model to fuse disparate sentences to an abstractive summary.
The adaptation method is robust and itself requires no training data. Our
system compares favorably to state-of-the-art extractive and abstractive
approaches judged by automatic metrics and human assessors.Comment: 11 page
Self-Supervised and Controlled Multi-Document Opinion Summarization
We address the problem of unsupervised abstractive summarization of
collections of user generated reviews with self-supervision and control. We
propose a self-supervised setup that considers an individual document as a
target summary for a set of similar documents. This setting makes training
simpler than previous approaches by relying only on standard log-likelihood
loss. We address the problem of hallucinations through the use of control
codes, to steer the generation towards more coherent and relevant
summaries.Finally, we extend the Transformer architecture to allow for multiple
reviews as input. Our benchmarks on two datasets against graph-based and recent
neural abstractive unsupervised models show that our proposed method generates
summaries with a superior quality and relevance.This is confirmed in our human
evaluation which focuses explicitly on the faithfulness of generated summaries
We also provide an ablation study, which shows the importance of the control
setup in controlling hallucinations and achieve high sentiment and topic
alignment of the summaries with the input reviews.Comment: 18 pages including 5 pages appendi
- …