Toward Extractive Summarization of Online Forum Discussions via Hierarchical Attention Networks
Forum threads are lengthy and rich in content. Concise thread summaries will
benefit both newcomers seeking information and those who participate in the
discussion. Few studies, however, have examined the task of forum thread
summarization. In this work we make the first attempt to adapt the hierarchical
attention networks for thread summarization. The model draws on the recent
development of neural attention mechanisms to build sentence and thread
representations and use them for summarization. Our results indicate that the
proposed approach can outperform a range of competitive baselines. Further, a
redundancy removal step is crucial for achieving outstanding results. Comment: 5 page
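The attention-based pooling that hierarchical attention networks use to build sentence and thread representations can be sketched as follows. This is a minimal illustration, not the authors' model: the hidden states are random stand-ins for encoder outputs, and the function name `attention_pool`, the dimensions, and the parameters `W`, `b`, `u` are all assumptions made for the example.

```python
import numpy as np

def attention_pool(H, W, b, u):
    """Collapse hidden states H (n x d) into one d-dimensional vector.

    Each row is scored against a context vector u, the scores are
    normalized with a softmax, and the rows are averaged with those weights.
    """
    scores = np.tanh(H @ W + b) @ u          # one score per row, shape (n,)
    scores = scores - scores.max()           # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ H, weights

rng = np.random.default_rng(0)
d, a = 4, 3  # hidden size and attention size (hypothetical values)

# Separate (randomly initialized, untrained) parameters for each level.
word_params = (rng.normal(size=(d, a)), np.zeros(a), rng.normal(size=a))
sent_params = (rng.normal(size=(d, a)), np.zeros(a), rng.normal(size=a))

# Three "sentences" with 5, 2 and 7 word-level hidden states each.
sentences = [rng.normal(size=(n, d)) for n in (5, 2, 7)]

# Word-level attention yields one vector per sentence ...
sentence_vecs = np.stack([attention_pool(S, *word_params)[0] for S in sentences])
# ... and sentence-level attention yields one vector for the thread.
thread_vec, sent_weights = attention_pool(sentence_vecs, *sent_params)
```

In a trained model the sentence-level weights indicate how much each sentence contributes to the thread representation; here they are only meaningful as a shape check, since nothing has been learned.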
A Novel ILP Framework for Summarizing Content with High Lexical Variety
Summarizing content contributed by individuals can be challenging, because
people make different lexical choices even when describing the same events.
However, there remains a significant need to summarize such content. Examples
include the student responses to post-class reflective questions, product
reviews, and news articles published by different news agencies related to the
same events. High lexical diversity of these documents hinders the system's
ability to effectively identify salient content and reduce summary redundancy.
In this paper, we overcome this issue by introducing an integer linear
programming-based summarization framework. It incorporates a low-rank
approximation to the sentence-word co-occurrence matrix to intrinsically group
semantically-similar lexical items. We conduct extensive experiments on
datasets of student responses, product reviews, and news documents. Our
approach compares favorably to a number of extractive baselines as well as a
neural abstractive summarization system. The paper finally sheds light on when
and why the proposed framework is effective at summarizing content with high
lexical variety. Comment: Accepted for publication in the journal of Natural Language
Engineering, 201
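The effect of a low-rank approximation on a sentence-word co-occurrence matrix can be illustrated with a small sketch. The abstract does not say how the approximation is computed, so truncated SVD is used here purely as a stand-in; the toy vocabulary, the sentences, and the function name `low_rank_approx` are assumptions for the example.

```python
import numpy as np

def low_rank_approx(A, k):
    """Best rank-k approximation of A (in the least-squares sense) via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Hypothetical vocabulary and sentences; rows are sentences, columns are words.
vocab = ["battery", "life", "power", "lasts", "screen", "bright"]
A = np.array([
    [1, 1, 0, 1, 0, 0],   # "battery life lasts"
    [0, 0, 1, 1, 0, 0],   # "power lasts"
    [0, 0, 0, 0, 1, 1],   # "screen bright"
    [0, 0, 0, 0, 1, 0],   # "screen"
], dtype=float)

A_hat = low_rank_approx(A, k=2)

# "power lasts" never mentions "battery", yet its entry for "battery"
# becomes positive in A_hat because both sentences share "lasts".
battery = vocab.index("battery")
print(A[1, battery], A_hat[1, battery])
```

The approximation spreads weight from observed words to words that co-occur with them elsewhere, which is one way lexically different sentences about the same topic can end up with similar rows.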
Recurrent Neural Networks Methods For Social Media Challenges: Forum Summarization And Rumor Detection
The overabundance of data on social media poses several challenges to users. First, information overload is a barrier that hinders users from arriving at a conclusion. Summaries of lengthy content can thus help them grasp key ideas; they not only save time and energy but also contribute to an effective decision-making process. Second, social media platforms encourage fast information dissemination, where each user acts as a social sensor that generates and shares content. Yet without sufficient supervision, such rapid information sharing can lead to widespread rumors and fake news. Motivated by these challenges, this study proposes recurrent neural network-based frameworks to address them. For the first focus, Forum Summarization, our study presents summarization models adapted from hierarchical attention networks (HAN) to build representations for predicting summary sentences. Our findings demonstrate that the proposed frameworks significantly improve classification performance, as evaluated by sentence-level scores, and summary quality, as evaluated by ROUGE scores. For the second focus, Rumor Detection, we present an ensemble deep neural network framework that classifies input microblogging events as either valid or containing rumorous information. We propose that, in addition to the text of microblog posts, user-related context is also key to improving performance. This context information is obtained beforehand from each user's historical posts shared or composed about the event. Moreover, to address the shortcomings of CNNs, we present a deep end-to-end neural architecture that leverages capsule networks together with the hierarchical structure of an event to learn effective representations for rumor detection.
Results from extensive experiments conducted on two real-world datasets, Twitter and Weibo, show that the proposed approach can accurately detect events that carry misinformation, outperforming a range of competitive baselines.