Context-Aware Message-Level Rumour Detection with Weak Supervision
Social media has become more than a communication medium: it is now the main source of all sorts of information. Its intrinsic nature allows a continuous and massive flow of misinformation to have a severe impact worldwide. In particular, rumours emerge unexpectedly and spread quickly. It is challenging to track down their origins and stop their propagation. An ideal solution is to identify rumour-mongering messages as early as possible, which is commonly referred to as "Early Rumour Detection (ERD)". This dissertation focuses on researching ERD on social media by exploiting weak supervision and contextual information. Weak supervision is a branch of machine learning (ML) in which noisy and less precise sources (e.g. data patterns) are leveraged to learn from limited high-quality labelled data (Ratner et al., 2017). This is intended to reduce the cost and increase the efficiency of the hand-labelling of large-scale data. This thesis aims to study whether identifying rumours before they go viral is possible and to develop an architecture for ERD at the individual post level. To this end, it first explores major bottlenecks of current ERD. It also uncovers a research gap between system design and its applications in the real world, which has received less attention from the ERD research community. One bottleneck is limited labelled data. Weakly supervised methods to augment limited labelled training data for ERD are introduced. The other bottleneck is the enormous amount of noisy data. A framework unifying burst detection based on temporal signals and burst summarisation is investigated to identify potential rumours (i.e. input to rumour detection models) by filtering out uninformative messages. Finally, a novel method which jointly learns rumour sources and their contexts (i.e. conversational threads) for ERD is proposed. An extensive evaluation setting for ERD systems is also introduced.
Rumor Stance Classification in Online Social Networks: A Survey on the State-of-the-Art, Prospects, and Future Challenges
The emergence of the Internet as a ubiquitous technology has facilitated the
rapid evolution of social media as the leading virtual platform for
communication, content sharing, and information dissemination. Although it has
revolutionized the way news is delivered to people, this technology has also
brought inevitable drawbacks. One of them is the spread of rumors facilitated
by social media platforms, which may provoke doubt and fear among people.
Therefore, debunking rumors before they spread widely has become all the more
essential. Over the years, many studies
have been conducted to develop effective rumor verification systems. One aspect
of such studies focuses on rumor stance classification, which concerns the task
of utilizing users' viewpoints about a rumorous post to better predict the
veracity of a rumor. Relying on users' stances in the rumor verification task
has gained great importance, as it has yielded significant improvements in
model performance. In this paper, we conduct a comprehensive literature review on
rumor stance classification in complex social networks. In particular, we
present a thorough description of the approaches and mark the top performances.
Moreover, we introduce multiple datasets available for this purpose and
highlight their limitations. Finally, some challenges and future directions are
discussed to stimulate further relevant research efforts.
Comment: 13 pages, 2 figures, journa
Rumor Detection on Social Media: Datasets, Methods and Opportunities
Social media platforms have been used for information and news gathering, and
they are very valuable in many applications. However, they also lead to the
spreading of rumors and fake news. Many efforts have been made to detect and
debunk rumors on social media by analyzing their content and social context
using machine learning techniques. This paper gives an overview of the recent
studies in the rumor detection field. It provides a comprehensive list of
datasets used for rumor detection, and reviews the important studies based on
what types of information they exploit and the approaches they take. More
importantly, we also present several new directions for future research.
Comment: 10 page
Fake News Detection Through Graph-based Neural Networks: A Survey
The popularity of online social networks has enabled rapid dissemination of
information. People now can share and consume information much more rapidly
than ever before. However, low-quality and/or accidentally/deliberately fake
information can also spread rapidly. This can lead to considerable and negative
impacts on society. Identifying, labelling and debunking online misinformation
as early as possible has become an increasingly urgent problem. Many methods
have been proposed to detect fake news including many deep learning and
graph-based approaches. In recent years, graph-based methods have yielded
strong results, as they can closely model the social context and propagation
process of online news. In this paper, we present a systematic review of fake
news detection studies based on graph-based and deep learning-based techniques.
We classify existing graph-based methods into knowledge-driven methods,
propagation-based methods, and heterogeneous social context-based methods,
depending on how a graph structure is constructed to model news related
information flows. We further discuss the challenges and open problems in
graph-based fake news detection and identify future research directions.
Comment: 18 pages, 3 tables, 7 figure
Towards Evaluating Veracity of Textual Statements on the Web
The quality of digital information on the web has been disquieting due to the absence of careful checking. Consequently, a large volume of false textual information is being produced and disseminated with misstatements of facts. The potential negative influence on the public, especially in time-sensitive emergencies, is a growing concern. This concern has motivated this thesis to deal with the problem of veracity evaluation. In this thesis, we set out to develop machine learning models for the veracity evaluation of textual claims based on stance and user engagements. Such evaluation is approached from three aspects: news stance detection, engaged user replies in social media, and the engagement dynamics. First of all, we study stance detection in the context of online news articles, where a claim is predicted to be true if it is supported by the evidential articles. We propose to make explicit a hierarchical structure among stance classes: the high level aims at identifying relatedness, while the low level aims at classifying those identified as related into the other three classes, i.e., agree, disagree, and discuss. This model disentangles the semantic difference between related/unrelated and the other three stances and helps address the class imbalance problem. Beyond news articles, user replies on social media platforms also contain stances and can help infer claim veracity. Claims and user replies in social media are usually short and can be ambiguous; to deal with semantic ambiguity, we design a deep latent variable model with a latent distribution to allow a multimodal semantic distribution. Also, marginalizing the latent distribution enables the model to be more robust on relatively small-sized datasets. Thirdly, we extend the above content-based models by tracking the dynamics of user engagement in misinformation propagation.
To capture these dynamics, we formulate user engagements as a dynamic graph and extract its temporal evolution patterns and geometric features based on an attention-modified Temporal Point Process. This makes it possible to forecast the cumulative number of engaged users and can be useful in assessing the threat level of an individual piece of misinformation. The ability to evaluate veracity and forecast the scale growth of engagement networks helps, in practice, to minimize the negative impacts of online false information.
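The two-level stance hierarchy described in this abstract (relatedness first, then agree/disagree/discuss for related pairs only) can be illustrated with a minimal Python sketch. This is not the thesis's model: the function names `hierarchical_stance`, `toy_is_related`, and `toy_fine_stance` are hypothetical, and the toy keyword rules merely stand in for trained classifiers.

```python
from typing import Callable

# The three fine-grained stances named in the abstract.
STANCES = ("agree", "disagree", "discuss")


def hierarchical_stance(
    claim: str,
    article: str,
    is_related: Callable[[str, str], bool],
    fine_stance: Callable[[str, str], str],
) -> str:
    """Two-level decision: relatedness first, then a 3-way stance."""
    # High level: filter out unrelated pairs before any stance is assigned.
    if not is_related(claim, article):
        return "unrelated"
    # Low level: only related pairs reach the fine-grained classifier.
    label = fine_stance(claim, article)
    if label not in STANCES:
        raise ValueError(f"unexpected stance: {label}")
    return label


# Hypothetical stand-ins for trained models (assumptions, for illustration only).
def toy_is_related(claim: str, article: str) -> bool:
    # Related if the texts share at least two lowercase tokens.
    shared = set(claim.lower().split()) & set(article.lower().split())
    return len(shared) >= 2


def toy_fine_stance(claim: str, article: str) -> str:
    text = article.lower()
    if "false" in text or "hoax" in text:
        return "disagree"
    if "confirmed" in text:
        return "agree"
    return "discuss"


if __name__ == "__main__":
    claim = "the bridge collapsed"
    for article in (
        "officials confirmed the bridge collapsed",
        "the bridge collapsed story is a hoax",
        "weather will be sunny tomorrow",
    ):
        print(hierarchical_stance(claim, article, toy_is_related, toy_fine_stance))
```

Splitting the decision this way means the fine-grained classifier never sees the (typically dominant) unrelated class, which is one plausible reading of how the hierarchy helps with class imbalance.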
A Weakly Supervised Propagation Model for Rumor Verification and Stance Detection with Multiple Instance Learning
The diffusion of rumors on microblogs generally follows a propagation tree
structure that provides valuable clues on how an original message is
transmitted and responded to by users over time. Recent studies reveal that rumor
detection and stance detection are two different but relevant tasks which can
jointly enhance each other, e.g., rumors can be debunked by cross-checking the
stances conveyed by their relevant microblog posts, and stances are also
conditioned on the nature of the rumor. However, most stance detection methods
require enormous numbers of post-level stance labels for training, which are
labor-intensive to obtain given a large number of posts. Inspired by the
Multiple Instance Learning (MIL) scheme, we first represent the diffusion of claims with
bottom-up and top-down trees, then propose two tree-structured weakly
supervised frameworks to jointly classify rumors and stances, where only the
bag-level labels concerning a claim's veracity are needed. Specifically, we
convert the multi-class problem into multiple MIL-based binary classification
problems where each binary model focuses on differentiating a target stance or
rumor type from other types. Finally, we propose a hierarchical attention
mechanism to aggregate the binary predictions, including (1) a bottom-up or
top-down tree attention layer to aggregate binary stances into binary veracity;
and (2) a discriminative attention layer to aggregate the binary classes into
finer-grained classes. Extensive experiments conducted on three Twitter-based
datasets demonstrate promising performance of our model on both claim-level
rumor detection and post-level stance classification compared with
state-of-the-art methods.
Comment: Accepted by SIGIR 202
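The idea of aggregating post-level (instance) predictions into a claim-level (bag) prediction with attention can be sketched as follows. This is a generic attention-pooling illustration for MIL, not the paper's tree-structured architecture: the function names and the example scores are assumptions, and in a real model the attention scores would be learned rather than supplied by hand.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    instance_probs: Sequence[float],
    attention_scores: Sequence[float],
) -> float:
    """Bag-level probability as an attention-weighted mean of instance probabilities.

    instance_probs: per-post binary probabilities (e.g. "this post denies the claim").
    attention_scores: unnormalised relevance scores, one per post (hypothetical here).
    """
    if not instance_probs or len(instance_probs) != len(attention_scores):
        raise ValueError("need one attention score per instance, and at least one instance")
    weights = softmax(attention_scores)
    return sum(w * p for w, p in zip(weights, instance_probs))


if __name__ == "__main__":
    post_probs = [0.2, 0.8, 0.5]    # hypothetical post-level outputs
    post_scores = [0.0, 2.0, -1.0]  # hypothetical attention scores
    print(attention_pool(post_probs, post_scores))
```

With equal attention scores the pooled value reduces to the plain mean of the instance probabilities; unequal scores let a few informative posts dominate the bag-level prediction, which is what allows training from bag-level labels only.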