Inferring Strategies for Sentence Ordering in Multidocument News Summarization
The problem of organizing information for multidocument summarization so that
the generated summary is coherent has received relatively little attention.
While sentence ordering for single document summarization can be determined
from the ordering of sentences in the input article, this is not the case for
multidocument summarization where summary sentences may be drawn from different
input articles. In this paper, we propose a methodology for studying the
properties of ordering information in the news genre and describe experiments
done on a corpus of multiple acceptable orderings we developed for the task.
Based on these experiments, we implemented a strategy for ordering information
that combines constraints from chronological order of events and topical
relatedness. Evaluation of our augmented algorithm shows a significant
improvement of the ordering over two baseline strategies.
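One way to picture a strategy that combines chronological order with topical relatedness is sketched below. This is an illustrative simplification, not the algorithm from the paper: the `Sentence` fields `timestamp` and `topic` are hypothetical inputs, and the rule (keep topically related sentences together, then order the groups and their members by time) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    timestamp: int  # hypothetical: time of the reported event or publication
    topic: str      # hypothetical: label produced by some topic clustering step


def order_sentences(sentences):
    """Order summary sentences by topic block, then chronologically."""
    # Group sentences that share a topic label so they stay adjacent.
    blocks = {}
    for sentence in sentences:
        blocks.setdefault(sentence.topic, []).append(sentence)

    # Inside each block, fall back to chronological order.
    for block in blocks.values():
        block.sort(key=lambda s: s.timestamp)

    # Order the blocks themselves by their earliest sentence.
    ordered_blocks = sorted(blocks.values(), key=lambda b: b[0].timestamp)
    return [s for block in ordered_blocks for s in block]


items = [
    Sentence("A", 3, "quake"),
    Sentence("B", 1, "rescue"),
    Sentence("C", 2, "quake"),
    Sentence("D", 4, "rescue"),
]
print([s.text for s in order_sentences(items)])
```

A purely chronological baseline would interleave the two topics (B, C, A, D); the grouped version keeps each topic contiguous.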
Evaluating Centering for Information Ordering Using Corpora
In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate empirically which is the most promising metric and how useful this metric is using a general methodology applied on several corpora. Our main result is that the simplest metric (which relies exclusively on NOCB transitions) sets a robust baseline that cannot be outperformed by other metrics which make use of additional centering-based features. This baseline can be used for the development of both text-to-text and concept-to-text generation systems.
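A metric that relies only on NOCB transitions can be sketched as follows. The sketch assumes each utterance has already been reduced to the set of entities it mentions (a hypothetical preprocessing step), and treats a transition as NOCB when two adjacent utterances share no entity. The function name `count_nocb` is introduced here for illustration; orderings with fewer NOCB transitions are preferred.

```python
def count_nocb(ordering):
    """Count adjacent utterance pairs that share no entity (NOCB transitions).

    `ordering` is a list of sets; each set holds the entities mentioned
    in one utterance (an assumed, simplified representation).
    """
    return sum(
        1
        for previous, current in zip(ordering, ordering[1:])
        if not (previous & current)
    )


u1 = {"museum", "vase"}
u2 = {"vase", "painter"}
u3 = {"painter"}

# Each utterance shares an entity with its predecessor: no NOCB transitions.
print(count_nocb([u1, u2, u3]))
# u1 and u3 share nothing, so this ordering has one NOCB transition.
print(count_nocb([u1, u3, u2]))
```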
Graph-based Neural Multi-Document Summarization
We propose a neural multi-document summarization (MDS) system that
incorporates sentence relation graphs. We employ a Graph Convolutional Network
(GCN) on the relation graphs, with sentence embeddings obtained from Recurrent
Neural Networks as input node features. Through multiple layer-wise
propagation, the GCN generates high-level hidden sentence features for salience
estimation. We then use a greedy heuristic to extract salient sentences while
avoiding redundancy. In our experiments on DUC 2004, we consider three types of
sentence relation graphs and demonstrate the advantage of combining sentence
relations in graphs with the representation power of deep neural networks. Our
model improves upon traditional graph-based extractive approaches and the
vanilla GRU sequence model with no graph, and it achieves competitive results
against other state-of-the-art multi-document summarization systems.
Comment: In CoNLL 201
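The greedy extraction step mentioned in the abstract above could look roughly like the following. The abstract does not state the redundancy criterion, so everything specific here is an assumption: the word-overlap (Jaccard) similarity, the `max_similarity` threshold, and the `max_sentences` budget are hypothetical, and the salience scores are taken as given rather than produced by a GCN.

```python
def jaccard(a, b):
    """Word-overlap similarity between two sentences (assumed measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def greedy_select(sentences, salience, max_sentences=2, max_similarity=0.5):
    """Pick sentences in salience order, skipping near-duplicates."""
    # Visit sentence indices from most to least salient.
    ranked = sorted(range(len(sentences)), key=lambda i: salience[i], reverse=True)
    chosen = []
    for i in ranked:
        if len(chosen) >= max_sentences:
            break
        # Keep a sentence only if it is dissimilar to everything chosen so far.
        if all(jaccard(sentences[i], sentences[j]) < max_similarity for j in chosen):
            chosen.append(i)
    return [sentences[i] for i in chosen]


candidates = [
    "the storm hit the coast",
    "the storm hit the coast hard",
    "rescue teams arrived monday",
]
scores = [0.9, 0.8, 0.5]  # hypothetical salience estimates
print(greedy_select(candidates, scores))
```

The second candidate scores highly but overlaps heavily with the first, so the sketch skips it in favour of the less salient, non-redundant third sentence.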