3,565 research outputs found
An Exploratory Study of COVID-19 Misinformation on Twitter
During the COVID-19 pandemic, social media has become a breeding ground for
misinformation. To tackle this infodemic, scientific oversight, as well as a
better understanding by practitioners in crisis management, is needed. We have
conducted an exploratory study into the propagation, authors and content of
misinformation on Twitter around the topic of COVID-19 in order to gain early
insights. We have collected all tweets mentioned in the verdicts of
fact-checked claims related to COVID-19 by over 92 professional fact-checking
organisations between January and mid-July 2020 and share this corpus with the
community. This resulted in 1,500 tweets relating to 1,274 false and 276
partially false claims. Exploratory analysis of author accounts revealed that
verified Twitter handles (including organisations and celebrities) are also
involved in either creating (new tweets) or spreading (retweets) the
misinformation. Additionally, we found that false claims propagate faster than
partially false claims. Compared with a background corpus of COVID-19 tweets,
tweets with misinformation are more often concerned with discrediting other
information on social media. Their authors use less tentative language and
appear to be more driven by concerns of potential harm to others. Our results
enable us to identify gaps in the current scientific coverage of the topic as
well as propose actions for authorities and social media users to counter
misinformation.
Comment: 20 pages, nine figures, four tables. Submitted for peer review, revision
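The finding that false claims propagate faster than partially false ones presupposes some measure of propagation speed. As a minimal illustrative sketch (not the authors' method), one could compare retweets per hour across a claim's observed retweet timestamps; the ISO timestamp format below is an assumption:

```python
from datetime import datetime

def spread_rate(timestamps):
    """Toy propagation speed: retweets per hour between the first
    and last observed retweet of a claim (ISO-format timestamps assumed)."""
    ts = sorted(datetime.fromisoformat(t) for t in timestamps)
    hours = (ts[-1] - ts[0]).total_seconds() / 3600 or 1.0  # avoid div-by-zero
    return (len(ts) - 1) / hours
```

Comparing the average rate over tweets tied to false claims against those tied to partially false claims would reproduce the shape, if not the detail, of the comparison described above.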
Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams
Online social media are complementing and in some cases replacing
person-to-person social interaction and redefining the diffusion of
information. In particular, microblogs have become crucial grounds on which
public relations, marketing, and political battles are fought. We introduce an
extensible framework that will enable the real-time analysis of meme diffusion
in social media by mining, visualizing, mapping, classifying, and modeling
massive streams of public microblogging events. We describe a Web service that
leverages this framework to track political memes in Twitter and help detect
astroturfing, smear campaigns, and other misinformation in the context of U.S.
political elections. We present some cases of abusive behaviors uncovered by
our service. Finally, we discuss promising preliminary results on the detection
of suspicious memes via supervised learning based on features extracted from
the topology of the diffusion networks, sentiment analysis, and crowdsourced
annotations.
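The diffusion-network topology features mentioned above can be illustrated with a toy sketch (the specific features here are an assumption of this note, not the paper's feature set): organically spreading memes tend to form deep retweet cascades, while astroturfed ones are often shallow and broad. A minimal pure-Python extractor:

```python
from collections import defaultdict

def cascade_features(edges, root):
    """Toy topological features of a retweet cascade.
    Each edge is (retweeter, source); `root` is the original poster."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)
    # Breadth-first traversal from the root to measure depth and breadth.
    depth, frontier, max_breadth = 0, [root], 1
    seen = {root}
    while frontier:
        nxt = [c for p in frontier for c in children[p] if c not in seen]
        seen.update(nxt)
        if nxt:
            depth += 1
            max_breadth = max(max_breadth, len(nxt))
        frontier = nxt
    return {"size": len(seen), "depth": depth, "max_breadth": max_breadth}
```

A supervised classifier would then consume such feature vectors, alongside the sentiment and crowdsourced-annotation signals the abstract mentions, to flag suspicious memes.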
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
Satirical news is considered entertainment, but it is potentially deceptive
and harmful. Although the genre is embedded in the article, not everyone
recognizes the satirical cues, and some readers therefore believe the news to
be true. We observe that satirical cues are often reflected in certain
paragraphs rather than the whole document. Existing works consider only
document-level features to detect satire, which can be limiting. We consider
paragraph-level linguistic features to unveil the satire by incorporating a
neural network and an attention mechanism. We investigate the difference
between paragraph-level features and document-level features, and analyze them
on a large satirical news dataset. The evaluation shows that the proposed
model detects satirical news effectively and reveals which features are
important at which level.
Comment: EMNLP 2017, 11 pages
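The paragraph-level attention described above can be sketched in miniature: score each paragraph's feature vector against a learned query vector, softmax the scores into weights, and take the weighted sum as the document representation. The tiny two-dimensional vectors below are illustrative assumptions, not the paper's actual features:

```python
import math

def attend(paragraph_vecs, query):
    """Softmax attention over paragraph feature vectors.
    `query` stands in for a learned attention vector (an assumption here)."""
    scores = [sum(p * q for p, q in zip(vec, query)) for vec in paragraph_vecs]
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Document vector: attention-weighted sum of paragraph vectors.
    doc = [sum(w * vec[i] for w, vec in zip(weights, paragraph_vecs))
           for i in range(len(query))]
    return weights, doc
```

Inspecting `weights` is what lets such a model "reveal" which paragraphs carry the satirical cues.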
Multilingual Cross-domain Perspectives on Online Hate Speech
In this report, we present a study of eight corpora of online hate speech,
demonstrating the NLP techniques that we used to collect and analyze jihadist,
extremist, racist, and sexist content. Analysis of the multilingual corpora
shows that the different contexts share certain characteristics in their
hateful rhetoric. To expose the main features, we focused on text
classification, text profiling, keyword and collocation extraction, along with
manual annotation and qualitative study.
Comment: 24 pages
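Collocation extraction, one of the techniques listed above, is commonly done by ranking adjacent word pairs by pointwise mutual information (PMI); the abstract does not specify the method, so the PMI formulation below is an assumption:

```python
import math
from collections import Counter

def top_collocations(tokens, k=3):
    """Rank adjacent word pairs by pointwise mutual information:
    PMI(w1, w2) = log( P(w1, w2) / (P(w1) * P(w2)) )."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = len(tokens), len(tokens) - 1

    def pmi(pair):
        w1, w2 = pair
        return math.log((bigrams[pair] / n_bi) /
                        ((unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)))

    return sorted(bigrams, key=pmi, reverse=True)[:k]
```

Pairs that co-occur far more often than chance predicts, such as fixed rhetorical phrases, surface at the top of this ranking.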
Dataset for multimodal fake news detection and verification tasks
The proliferation of online disinformation and fake news, particularly in the context of breaking news events, demands the development of effective detection mechanisms. While textual content remains the predominant medium for disseminating misleading information, the contribution of other modalities is increasingly emerging within online outlets and social media platforms. However, multimodal datasets, which incorporate diverse modalities such as texts and images, are still uncommon, especially in low-resource languages. This study addresses the gap by releasing a dataset tailored for multimodal fake news detection in the Italian language, originally employed in a shared task. The dataset is divided into two data subsets, each corresponding to a distinct sub-task. In sub-task 1, the goal is to assess the effectiveness of multimodal fake news detection systems. Sub-task 2 aims to delve into the interplay between text and images, specifically analyzing how these modalities mutually influence the interpretation of content when distinguishing between fake and real news. Both sub-tasks were framed as classification problems. The dataset consists of social media posts and news articles. After collection, the data were labeled via crowdsourcing. Annotators were provided with external knowledge about the topic of the news to be labeled, enhancing their ability to discriminate between fake and real news. The data subsets for sub-task 1 and sub-task 2 consist of 913 and 1,350 items, respectively, encompassing newspaper articles and tweets.
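Since both sub-tasks are framed as classification problems, submitted systems would typically be scored with precision/recall-style metrics; the F1 sketch below for a binary fake/real labelling is an assumption of this note, as the abstract does not name the evaluation metric:

```python
def binary_f1(golds, preds, positive="fake"):
    """F1 score for the (assumed) positive class in a fake/real labelling."""
    tp = sum(g == p == positive for g, p in zip(golds, preds))
    fp = sum(p == positive and g != positive for g, p in zip(golds, preds))
    fn = sum(g == positive and p != positive for g, p in zip(golds, preds))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```

Running such a metric separately on the 913-item and 1,350-item subsets would give one score per sub-task.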