57,894 research outputs found
Abstractive Multi-Document Summarization via Phrase Selection and Merging
We propose an abstraction-based multi-document summarization framework that
can construct new sentences by exploring more fine-grained syntactic units than
sentences, namely, noun/verb phrases. Different from existing abstraction-based
approaches, our method first constructs a pool of concepts and facts
represented by phrases from the input documents. Then new sentences are
generated by selecting and merging informative phrases to maximize the salience
of phrases and meanwhile satisfy the sentence construction constraints. We
employ integer linear optimization for conducting phrase selection and merging
simultaneously in order to achieve the global optimal solution for a summary.
Experimental results on the benchmark data set TAC 2011 show that our framework
outperforms the state-of-the-art models under automated pyramid evaluation
metric, and achieves reasonably well results on manual linguistic quality
evaluation.Comment: 11 pages, 1 figure, accepted as a full paper at ACL 201
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
We study unsupervised multi-document summarization evaluation metrics, which
require neither human-written reference summaries nor human annotations (e.g.
preferences, ratings, etc.). We propose SUPERT, which rates the quality of a
summary by measuring its semantic similarity with a pseudo reference summary,
i.e. selected salient sentences from the source documents, using contextualized
embeddings and soft token alignment techniques. Compared to the
state-of-the-art unsupervised evaluation metrics, SUPERT correlates better with
human ratings by 18-39%. Furthermore, we use SUPERT as rewards to guide a
neural-based reinforcement learning summarizer, yielding favorable performance
compared to the state-of-the-art unsupervised summarizers. All source code is
available at https://github.com/yg211/acl20-ref-free-eval.Comment: ACL 202
Text Summarization Techniques: A Brief Survey
In recent years, there has been a explosion in the amount of text data from a
variety of sources. This volume of text is an invaluable source of information
and knowledge which needs to be effectively summarized to be useful. In this
review, the main approaches to automatic text summarization are described. We
review the different processes for summarization and describe the effectiveness
and shortcomings of the different methods.Comment: Some of references format have update
Abstract Meaning Representation for Multi-Document Summarization
Generating an abstract from a collection of documents is a desirable
capability for many real-world applications. However, abstractive approaches to
multi-document summarization have not been thoroughly investigated. This paper
studies the feasibility of using Abstract Meaning Representation (AMR), a
semantic representation of natural language grounded in linguistic theory, as a
form of content representation. Our approach condenses source documents to a
set of summary graphs following the AMR formalism. The summary graphs are then
transformed to a set of summary sentences in a surface realization step. The
framework is fully data-driven and flexible. Each component can be optimized
independently using small-scale, in-domain training data. We perform
experiments on benchmark summarization datasets and report promising results.
We also describe opportunities and challenges for advancing this line of
research.Comment: 13 page
- …