Search CORE

5,255 research outputs found

Multi-Document Summarization via Discriminative Summary Reranking

Author: Cao Ziqiang
Li Sujian
Wan Xiaojun
Wei Furu
Zhou Ming
Publication venue
Publication date: 08/07/2015
Field of study

Existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and even a summarization model with good overall performance may produce low-quality summaries for some document sets. On the contrary, a baseline summarization model may produce high-quality summaries for some document sets. Based on the above observations, we treat the summaries produced by different summarization models as candidate summaries, and then explore discriminative reranking techniques to identify high-quality summaries from the candidates for difference document sets. We propose to extract a set of candidate summaries for each document set based on an ILP framework, and then leverage Ranking SVM for summary reranking. Various useful features have been developed for the reranking process, including word-level features, sentence-level features and summary-level features. Evaluation results on the benchmark DUC datasets validate the efficacy and robustness of our proposed approach

arXiv.org e-Print Archive

CiteSeerX

Generating Aspect-oriented Multi-document Summarization with Event-Aspect Model

Author: GAO Wei
JIANG Jing
LI Peng
WANG Yinglin
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/07/2011
Field of study

In this paper, we propose a novel approach to automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use extended LexRank algorithm to rank the sentences in each cluster. We use Integer Linear Programming for sentence selection. Key features of our method include automatic grouping of semantically related sentences and sentence ranking based on extension of random walk model. Also, we implement a new sentence compression algorithm which use dependency tree instead of parser tree. We compare our method with four baseline methods. Quantitative evaluation based on Rouge metric demonstrates the effectiveness and advantages of our method.

CiteSeerX

Institutional Knowledge at Singapore Management University

A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data

Author: Lan Man
Lang Jun
Li Sheng
Qiu Long
Si Luo
Tian Junfeng
Wang Jingang
Publication venue
Publication date: 05/01/2018
Field of study

It is a challenging and practical research problem to obtain effective compression of lengthy product titles for E-commerce. This is particularly important as more and more users browse mobile E-commerce apps and more merchants make the original product titles redundant and lengthy for Search Engine Optimization. Traditional text summarization approaches often require a large amount of preprocessing costs and do not capture the important issue of conversion rate in E-commerce. This paper proposes a novel multi-task learning approach for improving product title compression with user search log data. In particular, a pointer network-based sequence-to-sequence approach is utilized for title compression with an attentive mechanism as an extractive method and an attentive encoder-decoder approach is utilized for generating user search queries. The encoding parameters (i.e., semantic embedding of original titles) are shared among the two tasks and the attention distributions are jointly optimized. An extensive set of experiments with both human annotated data and online deployment demonstrate the advantage of the proposed research for both compression qualities and online business values.Comment: 8 Pages, accepted at AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Graph-based Neural Multi-Document Summarization

Author: Meelu Kshitijh
Pareek Ayush
Radev Dragomir
Srinivasan Krishnan
Yasunaga Michihiro
Zhang Rui
Publication venue
Publication date: 01/01/2017
Field of study

We propose a neural multi-document summarization (MDS) system that incorporates sentence relation graphs. We employ a Graph Convolutional Network (GCN) on the relation graphs, with sentence embeddings obtained from Recurrent Neural Networks as input node features. Through multiple layer-wise propagation, the GCN generates high-level hidden sentence features for salience estimation. We then use a greedy heuristic to extract salient sentences while avoiding redundancy. In our experiments on DUC 2004, we consider three types of sentence relation graphs and demonstrate the advantage of combining sentence relations in graphs with the representation power of deep neural networks. Our model improves upon traditional graph-based extractive approaches and the vanilla GRU sequence model with no graph, and it achieves competitive results against other state-of-the-art multi-document summarization systems.Comment: In CoNLL 201

arXiv.org e-Print Archive

Crossref