2,021 research outputs found
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe
A Novel ILP Framework for Summarizing Content with High Lexical Variety
Summarizing content contributed by individuals can be challenging, because
people make different lexical choices even when describing the same events.
However, there remains a significant need to summarize such content. Examples
include the student responses to post-class reflective questions, product
reviews, and news articles published by different news agencies related to the
same events. High lexical diversity of these documents hinders the system's
ability to effectively identify salient content and reduce summary redundancy.
In this paper, we overcome this issue by introducing an integer linear
programming-based summarization framework. It incorporates a low-rank
approximation to the sentence-word co-occurrence matrix to intrinsically group
semantically-similar lexical items. We conduct extensive experiments on
datasets of student responses, product reviews, and news documents. Our
approach compares favorably to a number of extractive baselines as well as a
neural abstractive summarization system. The paper finally sheds light on when
and why the proposed framework is effective at summarizing content with high
lexical variety.Comment: Accepted for publication in the journal of Natural Language
Engineering, 201
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of www and digital libraries ; and (iv) evaluation of NLP systems
Better Document-level Sentiment Analysis from RST Discourse Parsing
Discourse structure is the hidden link between surface features and
document-level properties, such as sentiment polarity. We show that the
discourse analyses produced by Rhetorical Structure Theory (RST) parsers can
improve document-level sentiment analysis, via composition of local information
up the discourse tree. First, we show that reweighting discourse units
according to their position in a dependency representation of the rhetorical
structure can yield substantial improvements on lexicon-based sentiment
analysis. Next, we present a recursive neural network over the RST structure,
which offers significant improvements over classification-based methods.Comment: Published at Empirical Methods in Natural Language Processing (EMNLP
2015
- …