752 research outputs found

    Simple open stance classification for rumour analysis

    Stance classification determines the attitude, or stance, in a (typically short) text. The task has powerful applications, such as the detection of fake news or the automatic extraction of attitudes toward entities or events in the media. This paper describes a surprisingly simple and efficient classification approach to open stance classification in Twitter, for rumour and veracity classification. The approach profits from a novel set of automatically identifiable problem-specific features, which significantly boost classifier accuracy and achieve above state-of-the-art results on recent benchmark datasets. This calls into question the value of using complex, sophisticated models for stance classification without first doing informed feature extraction.

    QMUL-SDS at CheckThat! 2020: Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions

    This paper describes the participation of the QMUL-SDS team in Task 1 of the CLEF 2020 CheckThat! shared task. The purpose of this task is to determine the check-worthiness of tweets about COVID-19 in order to identify and prioritise tweets that need fact-checking. The overarching aim is to further support ongoing efforts to protect the public from fake news and help people find reliable information. We describe and analyse the results of our submissions. We show that a CNN using COVID-Twitter-BERT (CT-BERT) enhanced with numeric expressions can effectively boost performance over baseline results. We also show results of training data augmentation with rumours on other topics. Our best system ranked fourth in the task, with encouraging outcomes showing potential for improved results in the future.
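As a rough illustration of what "numeric expressions" in a tweet might look like as a signal, the sketch below pulls numbers and percentages out of raw text with a regular expression. This is not the QMUL-SDS system: the abstract does not say how numeric expressions are detected or fed to CT-BERT, so the pattern `NUMERIC`, the function `numeric_features`, and the feature names are all hypothetical.

```python
import re

# Hypothetical pattern: digits with optional decimal/thousands groups and an
# optional trailing percent sign. The lookbehind skips digits glued to a word
# or hyphen, so the "19" in "COVID-19" is not counted as a numeric expression.
NUMERIC = re.compile(r"(?<![\w-])\d+(?:[.,]\d+)*%?")


def numeric_features(tweet):
    """Return simple numeric-expression features for one tweet (illustrative only)."""
    matches = NUMERIC.findall(tweet)
    return {
        "numbers": matches,
        "num_count": len(matches),
        "has_percent": any(m.endswith("%") for m in matches),
    }


if __name__ == "__main__":
    example = "Cases rose by 12.5% to 1,200 in 3 days of COVID-19"
    print(numeric_features(example))
```

Features like these could be appended to a learned text representation before classification; whether that matches the paper's design is an assumption.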

    COVID-19 and Arabic Twitter: How can Arab World Governments and Public Health Organizations Learn from Social Media?

    In March 2020, the World Health Organization announced the COVID-19 outbreak as a pandemic. Most previous social media related research on COVID-19 has been on English tweets. In this study, we collect approximately 1 million Arabic tweets related to COVID-19 from the Twitter streaming API. Focussing on outcomes that we believe will be useful for Public Health Organizations, we analyse them in three different ways: identifying the topics discussed during the period, detecting rumours, and predicting the source of the tweets. We use the k-means algorithm for the first goal with k=5. The topics discussed can be grouped as follows: COVID-19 statistics, prayers for God, COVID-19 locations, advice and education for prevention, and advertising. We sample 2000 tweets and label them manually as false information, correct information, or unrelated. Then, we apply three different machine learning algorithms (Logistic Regression, Support Vector Classification, and Naïve Bayes) with two sets of features: a word frequency approach and word embeddings. We find that Machine Learning classifiers are able to correctly identify the rumour-related tweets with 84% accuracy. We also try to predict the source of the rumour-related tweets using our previous model, which classifies tweets into five categories: academic, media, government, health professional, and public. Around 60% of the rumour-related tweets are classified as written by health professionals and academics.
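To make the "Naïve Bayes with word frequency features" combination concrete, here is a minimal pure-Python sketch of a multinomial Naïve Bayes classifier over raw word counts with add-one smoothing. It is an illustration under stated assumptions, not the authors' pipeline: the class name `WordCountNaiveBayes`, the tokenizer, and the six English toy tweets are invented for the example; only the three label names come from the abstract.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word tokens (\\w is Unicode-aware, so Arabic text also works)."""
    return re.findall(r"\w+", text.lower())


class WordCountNaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data; labels follow the abstract's three categories.
TRAIN = [
    ("garlic cures the virus in one day", "false"),
    ("drinking hot water kills the virus", "false"),
    ("wash your hands and wear a mask", "correct"),
    ("the ministry reported new cases today", "correct"),
    ("big discount on phones this weekend", "unrelated"),
    ("watch the football match tonight", "unrelated"),
]

clf = WordCountNaiveBayes().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])

if __name__ == "__main__":
    print(clf.predict("hot garlic water cures the virus"))
```

With real data, the same interface would be trained on the manually labelled sample and evaluated on held-out tweets; the reported 84% accuracy is the paper's result and is not reproduced by this toy example.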

    Rumor Stance Classification in Online Social Networks: A Survey on the State-of-the-Art, Prospects, and Future Challenges

    The emergence of the Internet as a ubiquitous technology has facilitated the rapid evolution of social media as the leading virtual platform for communication, content sharing, and information dissemination. Despite revolutionizing the way news is delivered to people, this technology has also brought inevitable drawbacks. One such drawback is the spread of rumors facilitated by social media platforms, which may provoke doubt and fear among people. Therefore, the need to debunk rumors before they spread widely has become all the more essential. Over the years, many studies have been conducted to develop effective rumor verification systems. One aspect of such studies focuses on rumor stance classification, which concerns the task of utilizing users' viewpoints about a rumorous post to better predict the veracity of a rumor. Relying on users' stances in the rumor verification task has gained great importance, for it has shown significant improvements in model performance. In this paper, we conduct a comprehensive literature review on rumor stance classification in complex social networks. In particular, we present a thorough description of the approaches and mark the top performances. Moreover, we introduce multiple datasets available for this purpose and highlight their limitations. Finally, some challenges and future directions are discussed to stimulate further relevant research efforts. Comment: 13 pages, 2 figures, journa

    Will-they-won't-they: A very large dataset for stance detection on twitter

    We present a new challenging stance detection dataset, called Will-They-Won’t-They (WT-WT), which contains 51,284 tweets in English, making it by far the largest available dataset of the type. All the annotations are carried out by experts; therefore, the dataset constitutes a high-quality and reliable benchmark for future research in stance detection. Our experiments with a wide range of recent state-of-the-art stance detection systems show that the dataset poses a strong challenge to existing models in this domain. Keynes Fund, Cambridg

    Context-Aware Message-Level Rumour Detection with Weak Supervision

    Social media has become the main source of all sorts of information beyond a communication medium. Its intrinsic nature can allow a continuous and massive flow of misinformation to make a severe impact worldwide. In particular, rumours emerge unexpectedly and spread quickly. It is challenging to track down their origins and stop their propagation. An ideal solution is to identify rumour-mongering messages as early as possible, which is commonly referred to as "Early Rumour Detection (ERD)". This dissertation focuses on researching ERD on social media by exploiting weak supervision and contextual information. Weak supervision is a branch of ML where noisy and less precise sources (e.g. data patterns) are leveraged to learn limited high-quality labelled data (Ratner et al., 2017). This is intended to reduce the cost and increase the efficiency of the hand-labelling of large-scale data. This thesis aims to study whether identifying rumours before they go viral is possible and to develop an architecture for ERD at the individual post level. To this end, it first explores major bottlenecks of current ERD. It also uncovers a research gap between system design and its applications in the real world, which has received less attention from the ERD research community. One bottleneck is limited labelled data. Weakly supervised methods to augment limited labelled training data for ERD are introduced. The other bottleneck is enormous amounts of noisy data. A framework unifying burst detection based on temporal signals and burst summarisation is investigated to identify potential rumours (i.e. input to rumour detection models) by filtering out uninformative messages. Finally, a novel method which jointly learns rumour sources and their contexts (i.e. conversational threads) for ERD is proposed. An extensive evaluation setting for ERD systems is also introduced.
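The idea of burst detection on temporal signals can be sketched with a simple sliding-window rule: count messages per time window and flag a window when its count is far above the recent average. This is a generic baseline swapped in for illustration, not the framework from the dissertation; the function `detect_bursts` and its parameters `window`, `history`, and `threshold` are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def detect_bursts(timestamps, window=60, history=5, threshold=3.0):
    """Return start times (in seconds) of windows with unusually many messages.

    A window is flagged when its count exceeds
    mean + threshold * max(std, 1.0) of the `history` windows just before it.
    The floor of 1.0 on the standard deviation avoids flagging tiny
    fluctuations when the recent counts are perfectly flat.
    """
    if not timestamps:
        return []
    # Bucket each timestamp (seconds) into a fixed-size window index.
    counts = Counter(int(t) // window for t in timestamps)
    first, last = min(counts), max(counts)
    series = [counts.get(i, 0) for i in range(first, last + 1)]

    bursts = []
    for i in range(history, len(series)):
        past = series[i - history:i]
        limit = mean(past) + threshold * max(pstdev(past), 1.0)
        if series[i] > limit:
            bursts.append((first + i) * window)
    return bursts


if __name__ == "__main__":
    # Synthetic stream: 2 messages per minute, except 20 messages in minute 6.
    stream = []
    for minute in range(8):
        n = 20 if minute == 6 else 2
        stream += [minute * 60 + k for k in range(n)]
    print(detect_bursts(stream))  # [360]
```

Messages inside flagged windows would then be candidates for summarisation and for a downstream rumour detection model, while traffic outside bursts is filtered out.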

    Stance Classification for Rumour Analysis in Twitter: Exploiting Affective Information and Conversation Structure

    Analysing how people react to rumours associated with news in social media is an important task to prevent the spreading of misinformation, which is nowadays widely recognized as a dangerous tendency. In social media conversations, users show different stances and attitudes towards rumourous stories. Some users take a definite stance, supporting or denying the rumour at issue, while others just comment on it, or ask for additional evidence related to the veracity of the rumour. Along these lines, a new shared task was proposed at SemEval-2017 (Task 8, SubTask A), which is focused on rumour stance classification in English tweets. The goal is to predict user stance towards emerging rumours in Twitter, in terms of supporting, denying, querying, or commenting on the original rumour, looking at the conversation threads originated by the rumour. This paper describes a new approach to this task, where the use of conversation-based and affective-based features, covering different facets of affect, has been explored. Our classification model outperforms the best-performing systems for stance classification at SemEval-2017 Task 8, showing the effectiveness of the feature set proposed. Comment: To appear in Proceedings of the 2nd International Workshop on Rumours and Deception in Social Media (RDSM), co-located with CIKM 2018, Turin, Italy, October 201

    STANDER: An expert-annotated dataset for news stance detection and evidence retrieval
