
    Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations

    Learning from social-media conversations has gained significant attention recently because of its applications in areas such as rumor detection. In this research, we propose a new way to represent social-media conversations as binarized constituency trees, which allows effective comparison of features in source posts and their replies. Moreover, we propose to use convolution units in Tree LSTMs, which are better at learning patterns in features obtained from the source and reply posts. Our Tree LSTM models employ multi-task (stance + rumor) learning and propagate the useful stance signal up the tree for rumor classification at the root node. The proposed models achieve state-of-the-art performance, outperforming the current best model by 12% and 15% on F1-macro for the rumor-veracity classification and stance classification tasks respectively.
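A minimal sketch of the binarized-tree idea described above, assuming a simple left-branching scheme; the `Node` class and `binarize` helper are illustrative, not the paper's implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    text: Optional[str] = None        # leaf: a post's text; internal node: None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def binarize(source: str, replies: list) -> Node:
    """Left-branching binarized tree: the source post is combined with each
    reply in turn, so every internal node joins exactly two subtrees and a
    tree model can compare source-side and reply-side features pairwise."""
    root = Node(text=source)
    for reply in replies:
        root = Node(left=root, right=Node(text=reply))
    return root

def leaves(node: Node) -> list:
    """Recover the posts in conversation order (left-to-right leaf walk)."""
    if node.text is not None:
        return [node.text]
    return leaves(node.left) + leaves(node.right)

tree = binarize("source post", ["reply 1", "reply 2", "reply 3"])
print(leaves(tree))  # ['source post', 'reply 1', 'reply 2', 'reply 3']
```

In the paper's setting, a recursive cell (their convolution-augmented Tree LSTM) would then be evaluated bottom-up over such a tree, with the root state used for rumor-veracity classification.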

    Discourse-aware rumour stance classification in social media using sequential classifiers

    Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, querying or commenting on an earlier post, is attracting increasing interest among researchers. While most previous work has focused on using individual tweets as classifier inputs, here we report on the performance of sequential classifiers that exploit the discourse features inherent in social media interactions or 'conversational threads'. Testing the effectiveness of four sequential classifiers -- Hawkes Processes, Linear-Chain Conditional Random Fields (Linear CRF), Tree-Structured Conditional Random Fields (Tree CRF) and Long Short-Term Memory networks (LSTM) -- on eight datasets associated with breaking news stories, and looking at different types of local and contextual features, our work sheds new light on the development of accurate stance classifiers. We show that sequential classifiers that exploit the discourse properties of social media conversations, while using only local features, outperform non-sequential classifiers. Furthermore, we show that an LSTM using a reduced set of features can outperform the other sequential classifiers; this performance is consistent across datasets and across types of stances. To conclude, our work also analyses the different features under study, identifying those that best help characterise and distinguish between stances, such as supporting tweets being more likely to be accompanied by evidence than denying tweets. We also set forth a number of directions for future research.
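As a hedged illustration of how a sequential classifier labels a whole thread rather than each post in isolation, the following Viterbi decoder sketches linear-chain decoding over support/deny/query/comment labels; the log-scores and the transition bonus are invented toy values, not features from the paper:

```python
STANCES = ["support", "deny", "query", "comment"]

def viterbi(emission, transition):
    """Most likely stance sequence for a thread.
    emission: per-post dict {stance: log-score} from local features;
    transition: dict {(prev, cur): log-score} rewarding plausible dialogue moves."""
    best = [dict(emission[0])]
    back = [{}]
    for t in range(1, len(emission)):
        best.append({}); back.append({})
        for cur in STANCES:
            prev = max(STANCES, key=lambda p: best[t-1][p] + transition[(p, cur)])
            best[t][cur] = best[t-1][prev] + transition[(prev, cur)] + emission[t][cur]
            back[t][cur] = prev
    path = [max(STANCES, key=lambda s: best[-1][s])]
    for t in range(len(emission) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

def em(scores):  # dense log-score dict with a default penalty for other labels
    return {s: scores.get(s, -2.0) for s in STANCES}

# Toy thread of three posts; the query->deny bonus encodes the intuition
# that denials often follow questioning replies.
emission = [em({"support": 0.0}), em({"query": 0.0}), em({"deny": 0.0})]
transition = {(p, c): 0.0 for p in STANCES for c in STANCES}
transition[("query", "deny")] = 0.5
print(viterbi(emission, transition))  # ['support', 'query', 'deny']
```

A linear-chain CRF learns the emission and transition scores jointly; this sketch only shows the decoding step that makes the classifier "sequential".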

    Debunking rumors on Twitter with tree transformer

    Rumor Stance Classification in Online Social Networks: A Survey on the State-of-the-Art, Prospects, and Future Challenges

    The emergence of the Internet as a ubiquitous technology has facilitated the rapid evolution of social media as the leading virtual platform for communication, content sharing, and information dissemination. Despite revolutionizing the way news is delivered to people, this technology has also brought along inevitable demerits. One such drawback is the spread of rumors on social media platforms, which may provoke doubt and fear among people. Therefore, the need to debunk rumors before they spread widely has become all the more essential. Over the years, many studies have been conducted to develop effective rumor verification systems. One line of such studies focuses on rumor stance classification: the task of utilizing users' viewpoints about a rumorous post to better predict the veracity of the rumor. Relying on users' stances in the rumor verification task has gained great importance, as it has shown significant improvements in model performance. In this paper, we conduct a comprehensive literature review on rumor stance classification in complex social networks. In particular, we present a thorough description of the approaches and mark the top performances. Moreover, we introduce multiple datasets available for this purpose and highlight their limitations. Finally, some challenges and future directions are discussed to stimulate further relevant research efforts.

    Implementing BERT and fine-tuned RoBERTa to detect AI-generated news by ChatGPT

    The abundance of information on social media has increased the necessity of accurate real-time rumour detection. Manual techniques for identifying and verifying fake news generated by AI tools are impracticable and time-consuming given the enormous volume of information generated every day. This has sparked an increase in interest in creating automated systems to find fake news on the Internet. The studies in this research demonstrate that the fine-tuned BERT and RoBERTa models had the best success in detecting AI-generated news. The fine-tuned RoBERTa model in particular showed excellent precision, with a score of 98%. In conclusion, this study has shown that neural networks can be used to identify AI-generated news created by ChatGPT. The excellent performance of the RoBERTa and BERT models indicates that they can play a critical role in the fight against misinformation.
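The recipe these models follow (encode each text, then train a classification head on the encodings) can be sketched without any pretrained model; the `encode` function below is a toy surface-feature stand-in for a BERT/RoBERTa encoder, and the data and perceptron head are illustrative only:

```python
def encode(text):
    """Toy stand-in for a pretrained encoder: two scaled surface features
    (a real system would use BERT/RoBERTa embeddings here)."""
    words = text.lower().split()
    return [sum(map(len, words)) / len(words) / 10, len(words) / 10]

def train_head(X, y, lr=1.0, epochs=50):
    """Perceptron classification head: the 'fine-tuning' step of the recipe."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            if pred != yi:  # mistake-driven update
                w = [wj + (yi - pred) * lr * xj for wj, xj in zip(w, xi)]
                b += (yi - pred) * lr
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

human = ["omg no way lol", "this is so fake"]                              # label 0
ai = ["comprehensive analysis demonstrates significant implications",
      "furthermore the methodology exhibits considerable robustness"]      # label 1
X = [encode(t) for t in human + ai]
y = [0, 0, 1, 1]
w, b = train_head(X, y)
print([predict(w, b, x) for x in X])  # [0, 0, 1, 1]
```

The precision figure the paper reports would then come from evaluating `predict` on held-out data; nothing about the toy features or data above reflects the paper's actual experimental setup.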

    Rumour stance and veracity classification in social media conversations

    Social media platforms are popular as sources of news, often delivering updates faster than traditional news outlets. The absence of verification of the posted information leads to wide proliferation of misinformation. The propagation of such false information can have far-reaching consequences for society. Traditional manual verification by fact-checking professionals is not scalable to the amount of misinformation being spread. Therefore, there is a need for an automated verification tool that would assist the process of rumour resolution. In this thesis we address the problem of rumour verification in social media conversations from a machine learning perspective. Rumours that attract a lot of scepticism in the form of questions and denials among the responses are more likely to be proven false later (Zhao et al., 2015). Thus we explore how crowd wisdom in the form of the stance of responses towards a rumour can contribute to an automated rumour verification system. We study ways of determining the stance of each response in a conversation automatically. We focus on the importance of incorporating conversation structure into stance classification models and also on identifying characteristics of supporting, denying, questioning and commenting posts. We follow by proposing several models for rumour veracity classification that incorporate different feature sets, including the stance of the responses, attempting to find the set that leads to the most accurate models across several datasets. We view the rumour resolution process as a sequence of tasks: rumour detection, tracking, stance classification and, finally, rumour verification. We then study relations between the tasks in the rumour verification pipeline through a joint learning approach, showing its benefits compared to single-task learning. Finally, we address the issue of transparency of model decisions by incorporating uncertainty estimation methods into rumour verification models. We conclude and point to directions for future research.
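A toy sketch of the four-stage rumour resolution pipeline described above; the rule-based stages are illustrative placeholders for the learned models in the thesis, and the tracking stage is assumed to have already grouped the replies belonging to each rumour:

```python
def detect(post):
    """Stage 1 (toy rule): flag posts carrying unverified claims as rumours."""
    t = post["text"].lower()
    return "breaking" in t or "unconfirmed" in t

def classify_stance(reply):
    """Stage 3 (toy rule): label a reply as support/deny/query/comment."""
    t = reply.lower()
    if "?" in t:
        return "query"
    if "false" in t or "fake" in t:
        return "deny"
    if "confirmed" in t or "true" in t:
        return "support"
    return "comment"

def verify(stances):
    """Stage 4 (toy rule): heavy scepticism suggests a false rumour,
    following the Zhao et al. (2015) observation cited above."""
    sceptical = sum(s in ("deny", "query") for s in stances)
    return "likely-false" if sceptical > len(stances) / 2 else "unverified"

def pipeline(post, replies):
    # Stage 2 (tracking) is assumed done: `replies` are already those
    # collected for this rumour.
    if not detect(post):
        return "not-a-rumour"
    return verify([classify_stance(r) for r in replies])

post = {"text": "BREAKING: unconfirmed reports of a city-wide blackout"}
replies = ["Is this real?", "This is fake.", "Saw it too"]
print(pipeline(post, replies))  # likely-false
```

The thesis's joint learning approach replaces this hard hand-off between stages with models trained on the stance and verification tasks together.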

    Evaluating the generalisability of neural rumour verification models

    Research on automated social media rumour verification, the task of identifying the veracity of questionable information circulating on social media, has yielded neural models achieving high performance, with accuracy scores that often exceed 90%. However, none of these studies focus on the real-world generalisability of the proposed approaches, that is, whether the models perform well on datasets other than those on which they were initially trained and tested. In this work we aim to fill this gap by assessing the generalisability of top-performing neural rumour verification models covering a range of different architectures, from the perspectives of both topic and temporal robustness. For a more complete evaluation of generalisability, we collect and release COVID-RV, a novel dataset of Twitter conversations revolving around COVID-19 rumours. Unlike other existing COVID-19 datasets, COVID-RV contains conversations around rumours that follow the format of prominent rumour verification benchmarks, while differing from them in topic and time scale, thus allowing better assessment of the temporal robustness of the models. We evaluate model performance on COVID-RV and three popular rumour verification datasets to understand the limitations and advantages of different model architectures, training datasets and evaluation scenarios. We find a dramatic drop in performance when testing models on a different dataset from that used for training. Further, we evaluate the ability of models to generalise in a few-shot learning setup, as well as when word embeddings are updated with the vocabulary of a new, unseen rumour. Drawing upon our experiments, we discuss challenges and make recommendations for future research directions in addressing this important problem.
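The cross-dataset evaluation protocol can be sketched as a train/test matrix; the dataset contents and the majority-label baseline below are invented toys, used only to show how the off-diagonal cells expose the generalisation gap the paper reports:

```python
def cross_dataset_matrix(datasets, train_fn, eval_fn):
    """Train on each dataset and evaluate on every dataset, including the
    others; diagonal cells are in-domain scores, off-diagonal cells are
    the cross-dataset (generalisation) scores."""
    results = {}
    for train_name, train_data in datasets.items():
        model = train_fn(train_data)
        for test_name, test_data in datasets.items():
            results[(train_name, test_name)] = eval_fn(model, test_data)
    return results

def majority_train(data):
    """Trivial 'model': predict the most frequent training label."""
    labels = [label for _, label in data]
    return max(set(labels), key=labels.count)

def accuracy(model, data):
    return sum(label == model for _, label in data) / len(data)

# Toy stand-ins for real corpora such as PHEME and the paper's COVID-RV.
datasets = {
    "PHEME":    [("t1", "false"), ("t2", "false"), ("t3", "true")],
    "COVID-RV": [("c1", "true"), ("c2", "true"), ("c3", "false")],
}
m = cross_dataset_matrix(datasets, majority_train, accuracy)
print(m[("PHEME", "PHEME")], m[("PHEME", "COVID-RV")])
```

Even this trivial baseline shows the pattern the paper measures with real models: in-domain scores on the diagonal exceed cross-dataset scores off it.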

    Towards Evaluating Veracity of Textual Statements on the Web

    The quality of digital information on the web has been disquieting due to the absence of careful checking. Consequently, a large volume of false textual information is being produced and disseminated with misstatements of facts. The potential negative influence on the public, especially in time-sensitive emergencies, is a growing concern. This concern has motivated this thesis to deal with the problem of veracity evaluation. In this thesis, we set out to develop machine learning models for the veracity evaluation of textual claims based on stance and user engagements. Such evaluation is achieved from three aspects: news stance detection, engaged user replies in social media, and engagement dynamics. First of all, we study stance detection in the context of online news articles, where a claim is predicted to be true if it is supported by the evidential articles. We propose to manifest a hierarchical structure among stance classes: the high level aims at identifying relatedness, while the low level aims at classifying those identified as related into the other three classes, i.e., agree, disagree, and discuss. This model disentangles the semantic difference between related/unrelated and the other three stances and helps address the class imbalance problem. Beyond news articles, user replies on social media platforms also contain stances and can be used to infer claim veracity. Claims and user replies in social media are usually short and can be ambiguous; to deal with semantic ambiguity, we design a deep latent variable model with a latent distribution that allows multimodal semantic distributions. Also, marginalizing the latent distribution enables the model to be more robust on relatively small-sized datasets. Thirdly, we extend the above content-based models by tracking the dynamics of user engagement in misinformation propagation. To capture these dynamics, we formulate user engagements as a dynamic graph and extract its temporal evolution patterns and geometric features based on an attention-modified Temporal Point Process. This allows forecasting the cumulative number of engaged users and can be useful in assessing the threat level of an individual piece of misinformation. The ability to evaluate veracity and forecast the scale growth of engagement networks serves to practically assist in minimizing the negative impacts of online false information.
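The two-level stance hierarchy described for news articles can be sketched as two cascaded classifiers; the word-overlap relatedness test and keyword rules below are illustrative stand-ins for the thesis's learned models:

```python
def related(claim, article):
    """High level (toy): word overlap with the claim decides related vs
    unrelated, filtering out the dominant unrelated class first."""
    c, a = set(claim.lower().split()), set(article.lower().split())
    return len(c & a) / len(c) >= 0.5

def fine_stance(article):
    """Low level (toy): only articles judged related reach this three-way
    agree/disagree/discuss classifier."""
    t = article.lower()
    if "confirm" in t or "indeed" in t:
        return "agree"
    if "deny" in t or "refute" in t:
        return "disagree"
    return "discuss"

def hierarchical_stance(claim, article):
    if not related(claim, article):
        return "unrelated"
    return fine_stance(article)

claim = "the bridge collapsed last night"
for article in ["officials confirm the bridge collapsed",
                "new recipe for chocolate cake",
                "authorities refute reports that the bridge collapsed"]:
    print(hierarchical_stance(claim, article))  # agree / unrelated / disagree
```

Splitting the decision this way mirrors the class-imbalance argument above: the easy, majority unrelated class is handled separately, so the fine-grained classifier trains only on the harder related examples.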