
    Toward Extractive Summarization of Online Forum Discussions via Hierarchical Attention Networks

    Forum threads are lengthy and rich in content. Concise thread summaries will benefit both newcomers seeking information and those who participate in the discussion. Few studies, however, have examined the task of forum thread summarization. In this work we make the first attempt to adapt hierarchical attention networks to thread summarization. The model draws on recent developments in neural attention mechanisms to build sentence and thread representations and uses them for summarization. Our results indicate that the proposed approach can outperform a range of competitive baselines. Further, a redundancy removal step is crucial for achieving outstanding results. (Comment: 5 pages)
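
    To make the described approach concrete, the following is a minimal, hypothetical PyTorch sketch of a hierarchical attention extractor: a word-level BiGRU with attention builds sentence vectors, a sentence-level BiGRU with attention builds a thread vector, and each sentence is scored against the thread. All layer sizes, names, and the scoring head are illustrative assumptions, not the authors' implementation.

    # Hypothetical HAN-style scorer for extractive thread summarization.
    # Dimensions and the scoring head are assumptions, not the paper's exact model.
    import torch
    import torch.nn as nn

    class Attention(nn.Module):
        """Additive attention pooling over a sequence of hidden states."""
        def __init__(self, dim):
            super().__init__()
            self.proj = nn.Linear(dim, dim)
            self.context = nn.Linear(dim, 1, bias=False)

        def forward(self, h):                                 # h: (batch, seq, dim)
            scores = self.context(torch.tanh(self.proj(h)))   # (batch, seq, 1)
            weights = torch.softmax(scores, dim=1)
            return (weights * h).sum(dim=1)                   # (batch, dim)

    class HANExtractor(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hid_dim=100):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
            self.word_gru = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
            self.word_attn = Attention(2 * hid_dim)
            self.sent_gru = nn.GRU(2 * hid_dim, hid_dim, bidirectional=True, batch_first=True)
            self.sent_attn = Attention(2 * hid_dim)
            # Score each sentence from its contextual vector plus the thread vector.
            self.scorer = nn.Linear(4 * hid_dim, 1)

        def forward(self, thread):                  # thread: (n_sents, n_words) word ids
            emb = self.embed(thread)                # (n_sents, n_words, emb_dim)
            word_h, _ = self.word_gru(emb)          # (n_sents, n_words, 2*hid)
            sent_vecs = self.word_attn(word_h)      # (n_sents, 2*hid)
            sent_h, _ = self.sent_gru(sent_vecs.unsqueeze(0))   # (1, n_sents, 2*hid)
            thread_vec = self.sent_attn(sent_h)                 # (1, 2*hid)
            thread_rep = thread_vec.expand(sent_h.size(1), -1)  # (n_sents, 2*hid)
            feats = torch.cat([sent_h.squeeze(0), thread_rep], dim=-1)
            return torch.sigmoid(self.scorer(feats)).squeeze(-1)  # per-sentence scores

    # Usage: take the top-k sentences, then drop near-duplicates (redundancy removal).
    model = HANExtractor(vocab_size=5000)
    scores = model(torch.randint(1, 5000, (12, 30)))   # 12 sentences, 30 tokens each
    summary_ids = scores.topk(3).indices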

    A Novel ILP Framework for Summarizing Content with High Lexical Variety

    Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles published by different news agencies about the same events. The high lexical diversity of these documents hinders a system's ability to effectively identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming (ILP)-based summarization framework. It incorporates a low-rank approximation of the sentence-word co-occurrence matrix to intrinsically group semantically similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. The paper finally sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety. (Comment: Accepted for publication in the journal Natural Language Engineering)
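
    As a rough illustration of the recipe the abstract outlines, the sketch below builds a toy sentence-word co-occurrence matrix, applies a truncated SVD as the low-rank step, and selects sentences with a standard concept-coverage ILP solved via PuLP/CBC. The function, objective, and constraints are assumptions for illustration; the paper's exact formulation may differ.

    # Hypothetical ILP summarizer over a low-rank sentence-word matrix.
    import numpy as np
    import pulp

    def summarize(sentences, vocab, max_words=100, rank=20):
        # Sentence-word co-occurrence matrix A (n_sents x n_words), bag-of-words counts.
        A = np.array([[s.split().count(w) for w in vocab] for s in sentences], dtype=float)

        # Low-rank approximation groups semantically similar lexical items.
        U, S, Vt = np.linalg.svd(A, full_matrices=False)
        k = min(rank, len(S))
        A_low = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

        # Word ("concept") weights from the smoothed matrix; clip negatives from the SVD.
        weights = np.maximum(A_low.sum(axis=0), 0.0)
        lengths = [len(s.split()) for s in sentences]

        prob = pulp.LpProblem("summarization", pulp.LpMaximize)
        x = [pulp.LpVariable(f"s{i}", cat="Binary") for i in range(len(sentences))]
        y = [pulp.LpVariable(f"w{j}", cat="Binary") for j in range(len(vocab))]

        # Maximize the total weight of covered words, under a summary length budget.
        prob += pulp.lpSum(weights[j] * y[j] for j in range(len(vocab)))
        prob += pulp.lpSum(lengths[i] * x[i] for i in range(len(sentences))) <= max_words
        # A word counts as covered iff some selected sentence contains it.
        for j in range(len(vocab)):
            prob += y[j] <= pulp.lpSum(x[i] for i in range(len(sentences)) if A[i, j] > 0)
            for i in range(len(sentences)):
                if A[i, j] > 0:
                    prob += y[j] >= x[i]

        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        return [s for i, s in enumerate(sentences) if x[i].value() == 1]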

    Recurrent Neural Networks Methods For Social Media Challenges: Forum Summarization And Rumor Detection

    The overabundance of data on social media has posed several challenges to users. First, information overload becomes a barrier that hinders users from reaching a conclusion. A summary of lengthy content can thus significantly help them grasp key ideas; not only does it save time and energy, but it also contributes to an effective decision-making process. Second, social media platforms encourage fast information dissemination, where each user acts as a social sensor who generates and shares content. Yet without sufficient supervision, such rapid information sharing can lead to widespread rumors and fake news. Motivated by these challenges, this study proposes recurrent neural network-based frameworks to address them. For the first focus, Forum Summarization, our study presents summarization models adapted from hierarchical attention networks (HAN) to build representations used to predict summary sentences. Our findings demonstrate that the proposed frameworks significantly improve classification performance as evaluated by sentence-level scores and summary quality as evaluated by ROUGE scores. For the second focus, Rumor Detection, we present an ensemble deep neural network framework that classifies input microblogging events according to whether they are valid or contain rumorous information. We propose that, in addition to the text of microblog posts, user-related context is also key to improving performance. This context is obtained beforehand from each user's historical posts shared or composed about the event. Furthermore, to address the shortcomings of convolutional neural networks, we present a deep end-to-end neural architecture that leverages capsule networks together with the hierarchical structure of an event to learn effective representations for rumor detection. Results from extensive experiments conducted on two real-world datasets, Twitter and Weibo, show that the proposed approach can accurately detect events that carry misinformation, outperforming a range of competitive baselines.
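
    As a rough sketch of the rumor-detection idea, the code below encodes each post in an event with a GRU, concatenates a per-poster user-context vector (e.g., features derived from the poster's historical posts), and runs an event-level GRU over the fused sequence for binary classification. The ensemble and capsule-network components are omitted; all names and dimensions are illustrative assumptions, not the dissertation's exact architecture.

    # Hypothetical RNN rumor classifier fusing post text with user context.
    import torch
    import torch.nn as nn

    class RumorRNN(nn.Module):
        def __init__(self, vocab_size, emb_dim=100, hid_dim=128, user_dim=32):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
            # Encode each post in the event from its word ids.
            self.post_gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
            # Encode the event as the sequence of (post vector + user context).
            self.event_gru = nn.GRU(hid_dim + user_dim, hid_dim, batch_first=True)
            self.classifier = nn.Linear(hid_dim, 2)          # rumor vs. non-rumor logits

        def forward(self, posts, user_context):
            # posts: (n_posts, n_words) word ids for one event
            # user_context: (n_posts, user_dim) features from each poster's history
            emb = self.embed(posts)                          # (n_posts, n_words, emb)
            _, post_h = self.post_gru(emb)                   # (1, n_posts, hid)
            fused = torch.cat([post_h.squeeze(0), user_context], dim=-1)
            _, event_h = self.event_gru(fused.unsqueeze(0))  # (1, 1, hid)
            return self.classifier(event_h.squeeze(0))       # (1, 2)

    # Usage: an event of 20 posts, 40 tokens each, with 32-dim user-context features.
    model = RumorRNN(vocab_size=8000)
    logits = model(torch.randint(1, 8000, (20, 40)), torch.randn(20, 32))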