95 research outputs found

    DivGraphPointer: A Graph Pointer Network for Extracting Diverse Keyphrases

    Keyphrase extraction from documents is useful to a variety of applications such as information retrieval and document summarization. This paper presents an end-to-end method called DivGraphPointer for extracting a set of diversified keyphrases from a document. DivGraphPointer combines the advantages of traditional graph-based ranking methods and recent neural network-based approaches. Specifically, given a document, a word graph is constructed from the document based on word proximity and is encoded with graph convolutional networks, which effectively capture document-level word salience by modeling long-range dependency between words in the document and aggregating multiple appearances of identical words into one node. Furthermore, we propose a diversified point network to generate a set of diverse keyphrases out of the word graph in the decoding process. Experimental results on five benchmark data sets show that our proposed method significantly outperforms the existing state-of-the-art approaches. Comment: Accepted to SIGIR 201
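The graph-construction step described in this abstract can be illustrated with a minimal sketch. It only builds a proximity-based word graph in which repeated occurrences of a word share one node; it does not implement the graph convolutional encoder or the pointer decoder. The function name `build_word_graph`, the `window` parameter, and the inverse-distance edge weight are assumptions made for illustration, not details taken from the paper.

```python
from collections import defaultdict


def build_word_graph(tokens, window=3):
    """Build an undirected, weighted word graph from a token sequence.

    Every distinct word becomes one node, so repeated occurrences of the
    same word are merged. Two words are linked when they appear within
    `window` positions of each other; closer pairs contribute more weight.
    The 1/distance weighting is an illustrative assumption.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            weight = 1.0 / (j - i)
            graph[word][other] += weight
            graph[other][word] += weight
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {w: dict(neighbours) for w, neighbours in graph.items()}


tokens = "graph networks encode word graph structure".split()
graph = build_word_graph(tokens, window=2)
print(graph["graph"])
```

In this toy input the word "graph" occurs twice, yet the result has a single "graph" node whose edges collect the context of both occurrences, which is the aggregation effect the abstract refers to.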

    Hardware Acceleration for Similarity Measurement in Natural Language Processing

    The continuation of Moore's law scaling, but in the absence of Dennard scaling, motivates an emphasis on energy-efficient accelerator-based designs for future applications. In natural language processing, the conventional approach to automatically analyze vast text collections, using scale-out processing, incurs high energy and hardware costs since the central compute-intensive step of similarity measurement often entails pair-wise, all-to-all comparisons. We propose a custom hardware accelerator for similarity measures that leverages data streaming, memory latency hiding, and parallel computation across variable-length threads. We evaluate our design through a combination of architectural simulation and RTL synthesis. When executing the dominant kernel in a semantic indexing application for documents, we demonstrate throughput gains of up to 42× and 58× lower energy per similarity computation compared to an optimized software implementation, while requiring less than 1.3% of the area of a conventional core.
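To make the cost of pair-wise, all-to-all comparison concrete, here is a minimal software sketch of the kind of kernel such an accelerator would target. The abstract does not name a specific similarity measure; cosine similarity over sparse term-weight vectors is assumed here, and the names `cosine` and `all_pairs_similarity` are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def all_pairs_similarity(docs):
    """Compare every document with every other one: n * (n - 1) / 2 pairs."""
    return {
        (i, j): cosine(docs[i], docs[j])
        for i, j in combinations(range(len(docs)), 2)
    }


docs = [
    {"energy": 1.0, "core": 1.0},
    {"energy": 1.0, "core": 1.0},
    {"tweet": 2.0},
]
sims = all_pairs_similarity(docs)
print(len(sims), sims)
```

The number of comparisons grows quadratically with the collection size, and each comparison walks vectors of different lengths, which is why the abstract emphasises streaming, latency hiding, and variable-length parallel threads.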

    Finding information about mental health in microblogging platforms: a Case study of depression

    Searching for online health information has been well studied in web search, but social media, such as public microblogging services, are well known for different types of tacit information: personal experience and shared information. Finding useful information in public microblogging platforms is an ongoing hard problem. To begin developing a better model of what health information can be found, Twitter posts using the word “depression” were examined as a case study of a search for a prevalent mental health issue. 13,279 public tweets were analysed using a mixed methods approach and compared to a general sample of tweets. First, a linguistic analysis suggested that tweets mentioning depression were typically anxious but not angry, and were less likely to be in the first person, indicating that most were not from individuals discussing their own depression. Second, to understand what types of tweets can be found, an inductive thematic analysis revealed three major themes: 1) disseminating information or links to information, 2) self-disclosing, and 3) the sharing of overall opinion; each had significantly different linguistic patterns. We conclude with a discussion of how different types of posts about mental health may be retrieved from public social media like Twitter.

    Visual overviews for discovering key papers and influences across research fronts

    Gaining a rapid overview of an emerging scientific topic, sometimes called research fronts, is an increasingly common task due to the growing amount of interdisciplinary collaboration. Visual overviews that show temporal patterns of paper publication and citation links among papers can help researchers and analysts to see the rate of growth of topics, identify key papers, and understand influences across subdisciplines. This article applies a novel network-visualization tool based on meaningful layouts of nodes to present research fronts and show citation links that indicate influences across research fronts. To demonstrate the value of two-dimensional layouts with multiple regions and user control of link visibility, we conducted a design-oriented, preliminary case study with 6 domain experts over a 4-month period. The main benefits were being able (a) to easily identify key papers and see the increasing number of papers within a research front, and (b) to quickly see the strength and direction of influence across related research fronts. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/64320/1/21160_ftp.pd

    Rumour verification through recurring information and an inner-attention mechanism

    Verification of online rumours is becoming an increasingly important task with the prevalence of event discussions on social media platforms. This paper proposes an inner-attention-based neural network model that uses frequent, recurring terms from past rumours to classify a newly emerging rumour as true, false or unverified. Unlike other methods proposed in related work, our model uses the source rumour alone without any additional information, such as user replies to the rumour or additional feature engineering. Our method outperforms the current state-of-the-art methods on benchmark datasets (RumourEval2017) by 3% accuracy and 6% F-1, leading to 60.7% accuracy and 61.6% F-1. We also compare our attention-based method to two similar models which, however, do not make use of recurrent terms. The attention-based method guided by frequent recurring terms outperforms these baselines on the same dataset, indicating that the recurring terms injected by the attention mechanism have a high positive impact on distinguishing between true and false rumours. Furthermore, we perform out-of-domain evaluations and show that our model is indeed highly competitive compared to the baselines on a newly released RumourEval2019 dataset and also achieves the best performance on classifying fake and legitimate news headlines.

    Stance detection on social media: State of the art and trends

    Stance detection on social media is an emerging opinion mining paradigm for various social and political applications in which sentiment analysis may be sub-optimal. There has been growing research interest in developing effective stance detection methods across multiple communities, including natural language processing, web science, and social computing. This paper surveys the work on stance detection within those communities and situates its usage within current opinion mining techniques in social media. It presents an exhaustive review of stance detection techniques on social media, including the task definition, different types of targets in stance detection, the feature sets used, and the various machine learning approaches applied. The survey reports state-of-the-art results on the existing benchmark datasets on stance detection, and discusses the most effective approaches. In addition, this study explores the emerging trends and different applications of stance detection on social media. The study concludes by discussing the gaps in the existing research and highlights possible future directions for stance detection on social media. Comment: We request withdrawal of this article sincerely. We will re-edit this paper. Please withdraw this article before we finish the new versio

    Analysing How People Orient to and Spread Rumours in Social Media by Looking at Conversational Threads

    As breaking news unfolds, people increasingly rely on social media to stay abreast of the latest updates. The use of social media in such situations comes with the caveat that new information being released piecemeal may encourage rumours, many of which remain unverified long after their point of release. Little is known, however, about the dynamics of the life cycle of a social media rumour. In this paper we present a methodology that has enabled us to collect, identify and annotate a dataset of 330 rumour threads (4,842 tweets) associated with 9 newsworthy events. We analyse this dataset to understand how users spread, support, or deny rumours that are later proven true or false, by distinguishing two levels of status in a rumour life cycle, i.e., before and after its veracity status is resolved. The identification of rumours associated with each event, as well as the tweet that resolved each rumour as true or false, was performed by journalist members of the research team who tracked the events in real time. Our study shows that rumours that are ultimately proven true tend to be resolved faster than those that turn out to be false. Whilst one can readily see users denying rumours once they have been debunked, users appear to be less capable of distinguishing true from false rumours when their veracity remains in question. In fact, we show that the prevalent tendency for users is to support every unverified rumour. We also analyse the role of different types of users, finding that highly reputable users such as news organisations endeavour to post well-grounded statements, which appear to be certain and accompanied by evidence. Nevertheless, these often prove to be unverified pieces of information that give rise to false rumours. Our study reinforces the need for developing robust machine learning techniques that can provide assistance in real time for assessing the veracity of rumours. The findings of our study provide useful insights for achieving this aim.