640 research outputs found
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe
Graph-based Neural Multi-Document Summarization
We propose a neural multi-document summarization (MDS) system that
incorporates sentence relation graphs. We employ a Graph Convolutional Network
(GCN) on the relation graphs, with sentence embeddings obtained from Recurrent
Neural Networks as input node features. Through multiple layer-wise
propagation, the GCN generates high-level hidden sentence features for salience
estimation. We then use a greedy heuristic to extract salient sentences while
avoiding redundancy. In our experiments on DUC 2004, we consider three types of
sentence relation graphs and demonstrate the advantage of combining sentence
relations in graphs with the representation power of deep neural networks. Our
model improves upon traditional graph-based extractive approaches and the
vanilla GRU sequence model with no graph, and it achieves competitive results
against other state-of-the-art multi-document summarization systems.Comment: In CoNLL 201
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation
Peer reviewe
Cross-lingual C*ST*RD: English access to Hindi information
We present C*ST*RD, a cross-language information delivery system that supports cross-language information retrieval, information space visualization and navigation, machine translation, and text summarization of single documents and clusters of documents. C*ST*RD was assembled and trained within 1 month, in the context of DARPA’s Surprise Language Exercise, that selected as source a heretofore unstudied language, Hindi. Given the brief time, we could not create deep Hindi capabilities for all the modules, but instead experimented with combining shallow Hindi capabilities, or even English-only modules, into one integrated system. Various possible configurations, with different tradeoffs in processing speed and ease of use, enable the rapid deployment of C*ST*RD to new languages under various conditions
- …