A Retrospective Analysis of the Fake News Challenge Stance Detection Task
The 2017 Fake News Challenge Stage 1 (FNC-1) shared task addressed a stance
classification task as a crucial first step towards detecting fake news. To
date, there is no in-depth analysis paper to critically discuss FNC-1's
experimental setup, reproduce the results, and draw conclusions for
next-generation stance classification methods. In this paper, we provide such
an in-depth analysis for the three top-performing systems. We first find that
FNC-1's proposed evaluation metric favors the majority class, which can be
easily classified, and thus overestimates the true discriminative power of the
methods. Therefore, we propose a new F1-based metric yielding a changed system
ranking. Next, we compare the features and architectures used, which leads to a
novel feature-rich stacked LSTM model that performs on par with the best
systems, but is superior in predicting minority classes. To understand the
methods' ability to generalize, we derive a new dataset and perform both
in-domain and cross-domain experiments. Our qualitative and quantitative study
helps interpret the original FNC-1 scores and understand which features improve
performance and why. Our new dataset and all source code used during the
reproduction study are publicly available for future research.
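The metric critique above can be made concrete. The sketch below contrasts the hierarchical FNC-1 score (commonly described as awarding 0.25 for getting the related/unrelated split right and a further 0.75 for the exact stance of related pairs) with a plain macro-averaged F1, which weights every class equally. This is an illustrative reconstruction, not the authors' evaluation code; the exact F1 variant they propose may differ.

```python
# Illustrative comparison of the hierarchical FNC-1 score with macro-F1.
# The 0.25/0.75 weighting reflects the commonly cited FNC-1 scheme; the
# toy data below is invented for demonstration.
RELATED = {"agree", "disagree", "discuss"}

def fnc1_score(gold, pred):
    """Hierarchical FNC-1 score: 0.25 for a correct related/unrelated
    decision, plus 0.75 more for the exact stance of related pairs."""
    score = 0.0
    for g, p in zip(gold, pred):
        if g == "unrelated":
            if p == "unrelated":
                score += 0.25
        elif p in RELATED:
            score += 0.25
            if p == g:
                score += 0.75
    return score

def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1, so minority classes count equally."""
    f1s = []
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == lab and p == lab)
        fp = sum(1 for g, p in zip(gold, pred) if g != lab and p == lab)
        fn = sum(1 for g, p in zip(gold, pred) if g == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

On a skewed toy set, a degenerate classifier that predicts "unrelated" for everything still collects FNC-1 credit from the majority class, while its macro-F1 collapses because three of the four per-class scores are zero. That asymmetry is exactly the overestimation the abstract describes.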
Team QCRI-MIT at SemEval-2019 Task 4: Propaganda Analysis Meets Hyperpartisan News Detection
In this paper, we describe our submission to SemEval-2019 Task 4 on
Hyperpartisan News Detection. Our system relies on a variety of engineered
features originally used to detect propaganda. This is based on the assumption
that biased messages are propagandistic in the sense that they promote a
particular political cause or viewpoint. We trained a logistic regression model
with features ranging from simple bag-of-words to vocabulary richness and text
readability features. Our system achieved 72.9% accuracy on the test data that
is annotated manually and 60.8% on the test data that is annotated with distant
supervision. Additional experiments showed that significant performance
improvements can be achieved with better feature pre-processing.
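A pipeline of the kind described above can be sketched as follows: a bag-of-words representation joined with a couple of hand-crafted style features (vocabulary richness, a crude readability proxy) feeding a logistic regression. The feature choices and sizes here are illustrative assumptions, not the team's actual configuration.

```python
# Minimal sketch: bag-of-words + simple style features -> logistic
# regression. Feature set and hyperparameters are illustrative only.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

def style_features(texts):
    # Per-document type/token ratio (vocabulary richness) and average
    # word length (a rough readability proxy).
    rows = []
    for t in texts:
        tokens = t.lower().split()
        ttr = len(set(tokens)) / len(tokens) if tokens else 0.0
        avg_len = sum(map(len, tokens)) / len(tokens) if tokens else 0.0
        rows.append([ttr, avg_len])
    return np.array(rows)

pipeline = Pipeline([
    ("features", FeatureUnion([
        ("bow", CountVectorizer(max_features=5000)),
        ("style", FunctionTransformer(style_features)),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])
```

The appeal of such a design is that every feature group is inspectable: the learned coefficients indicate which words or style signals drive the hyperpartisan prediction.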
False News On Social Media: A Data-Driven Survey
In the past few years, the research community has dedicated growing interest
to the issue of false news circulating on social networks. The widespread
attention on detecting and characterizing false news has been motivated by
considerable real-world repercussions of this threat. Social media platforms
exhibit peculiar characteristics, compared to traditional news outlets, that
have been particularly favorable to the proliferation of deceptive information.
They also present unique challenges for all kinds of potential interventions
on the subject. As this issue becomes of
global concern, it is also gaining more attention in academia. The aim of this
survey is to offer a comprehensive study on the recent advances in terms of
detection, characterization and mitigation of false news that propagate on
social media, as well as the challenges and the open questions that await
future research on the field. We use a data-driven approach, focusing on a
classification of the features that are used in each study to characterize
false information and on the datasets used for training classification
methods. At the end of the survey, we highlight emerging approaches that look
most promising for addressing false news.
Modeling the fake news challenge as a cross-level stance detection task
The 2017 Fake News Challenge Stage 1, a shared task for stance detection on pairs of news articles and claims, has received a lot of attention in recent years [3]. The provided dataset is highly unbalanced, with a skewed distribution towards unrelated samples, that is, randomly generated pairs of news articles and claims belonging to different topics. This imbalance favored systems that performed particularly well in classifying those noisy samples, a task that does not require deep semantic understanding.
In this paper, we propose a simple architecture based on conditional encoding, carefully designed to model the internal structure of a news article and its relations with a claim. We demonstrate that our model, which only leverages information from word embeddings, can outperform a system based on a large number of hand-engineered features, which replicates one of the winning systems at the Fake News Challenge [6], in the stance detection of the related samples.
TIB's Visual Analytics Group at MediaEval '20: Detecting Fake News on Corona Virus and 5G Conspiracy
Fake news on social media has become a hot topic of research as it negatively
impacts public discourse around real news. Specifically, the ongoing
COVID-19 pandemic has seen a rise of inaccurate and misleading information due
to the surrounding controversies and unknown details at the beginning of the
pandemic. The FakeNews task at MediaEval 2020 tackles this problem by creating
a challenge to automatically detect tweets containing misinformation based on
text and the structure of the Twitter follower network. In this paper, we
present a simple approach that uses BERT embeddings and a shallow neural
network to classify tweets using only text, and discuss our findings and the
limitations of this approach for text-based misinformation detection.
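The "shallow network over BERT embeddings" design can be sketched as a small feed-forward head over precomputed sentence vectors. The 768-dimensional input matches BERT-base; the hidden size, dropout, and the assumption that tweet embeddings are precomputed (e.g. mean-pooled token vectors) are illustrative choices, not the team's reported setup.

```python
# Minimal sketch: shallow feed-forward classifier over precomputed
# BERT tweet embeddings. All hyperparameters are illustrative.
import torch
import torch.nn as nn

class ShallowTweetClassifier(nn.Module):
    def __init__(self, embed_dim=768, hidden_dim=64, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),  # 768-dim BERT-base vectors in
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, n_classes),  # misinformation vs. not
        )

    def forward(self, embeddings):
        # embeddings: (batch, 768) precomputed sentence vectors
        return self.net(embeddings)
```

Keeping the trainable head this small is a common choice when the labeled set is limited, as in a shared-task setting: the heavy lifting stays in the frozen pretrained encoder.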