Stance Detection in Web and Social Media: A Comparative Study
Online forums and social media platforms are increasingly used to discuss
topics of varying polarity, on which different people take different stances.
Several methodologies for automatic stance detection from text have been
proposed in the literature. To our knowledge, there has been no systematic
investigation of their reproducibility or their comparative performance. In
this work, we explore the reproducibility of several existing stance detection
models, including both neural models and classical classifier-based models.
Through experiments on two datasets -- (i) the popular SemEval microblog
dataset, and (ii) a set of health-related online news articles -- we also
perform a detailed comparative analysis of the various methods and explore
their shortcomings. Implementations of all algorithms discussed in this paper
are available at
https://github.com/prajwal1210/Stance-Detection-in-Web-and-Social-Media
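To make the "classical classifier-based models" concrete, here is a minimal sketch of a bag-of-words stance classifier. This is not the paper's implementation (those are in the linked repository); it is a from-scratch multinomial Naive Bayes with add-one smoothing, and the tweets and stance labels below are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and keep purely alphabetic tokens.
    return [t for t in text.lower().split() if t.isalpha()]

class NaiveBayesStance:
    """Multinomial Naive Bayes over bag-of-words features, add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()
        for text, label in zip(texts, labels):
            for w in tokenize(text):
                self.word_counts[label][w] += 1
                self.vocab.add(w)
        self.total = {lab: sum(self.word_counts[lab].values())
                      for lab in self.label_counts}
        return self

    def predict(self, text):
        n = sum(self.label_counts.values())
        v = len(self.vocab)
        best, best_lp = None, float("-inf")
        for lab in self.label_counts:
            lp = math.log(self.label_counts[lab] / n)  # log prior
            for w in tokenize(text):
                # Smoothed log likelihood of each word under this stance.
                lp += math.log((self.word_counts[lab][w] + 1)
                               / (self.total[lab] + v))
            if lp > best_lp:
                best, best_lp = lab, lp
        return best

# Toy SemEval-style examples, invented for this sketch.
train_texts = [
    "climate change is real and we must act now",
    "we need urgent action on global warming",
    "climate change is a hoax invented by alarmists",
    "global warming claims are exaggerated nonsense",
]
train_labels = ["FAVOR", "FAVOR", "AGAINST", "AGAINST"]

clf = NaiveBayesStance().fit(train_texts, train_labels)
print(clf.predict("urgent action on climate change is real"))  # -> FAVOR
```

A real comparison along the lines of the paper would swap in TF-IDF features and stronger classifiers, and evaluate on the actual SemEval stance labels (FAVOR, AGAINST, NONE).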
On Left and Right: Understanding the Discourse of Presidential Election in Social Media Communities
As a promising platform for political discourse, social media has become a battleground for presidential candidates as well as their supporters and opponents. Stance detection is one of the key tasks in understanding political discourse. However, existing methods are dominated by supervised techniques, which require labeled data, and previous work on stance detection has largely been conducted at the post or user level. Although some studies have considered online political communities, they either select only a few communities or assume the stance coherence of those communities, and political party extraction has rarely been addressed explicitly. To address these limitations, we developed an unsupervised learning approach to political party extraction and stance detection from social media discourse. We also analyzed and compared (sub)communities with respect to the characteristics of their political stances and parties, and further explored (sub)communities' shift in political stance after the 2020 US presidential election.
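The abstract does not spell out the unsupervised method, but one building block of unsupervised stance grouping can be sketched: represent each user as a bag-of-words vector over their posts and split users into two stance groups with a deterministic 2-means clustering under cosine similarity. Everything below (the user timelines, the seeding rule) is a hypothetical illustration, not the paper's algorithm.

```python
import math
from collections import Counter

def bow(text):
    # Bag-of-words vector for one user's concatenated posts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def two_means(vectors, iters=10):
    """Deterministic 2-means over bag-of-words vectors.

    Seeds the two clusters with the least similar pair of users, then
    alternates assignment and centroid (summed-counts) updates.
    """
    n = len(vectors)
    i0, j0 = min(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: cosine(vectors[p[0]], vectors[p[1]]))
    centroids = [Counter(vectors[i0]), Counter(vectors[j0])]
    assign = [0] * n
    for _ in range(iters):
        assign = [0 if cosine(v, centroids[0]) >= cosine(v, centroids[1]) else 1
                  for v in vectors]
        centroids = [sum((vectors[i] for i in range(n) if assign[i] == c),
                         Counter())
                     for c in (0, 1)]
    return assign

# Invented user timelines for illustration.
users = [
    "vote blue save democracy vote blue",
    "blue wave coming vote blue",
    "make america great again red wave",
    "red wave again great america",
]
clusters = two_means([bow(u) for u in users])
print(clusters)  # users 0-1 land in one cluster, users 2-3 in the other
```

The actual paper would additionally need to label which cluster corresponds to which stance or party, and to handle communities that are not stance-coherent.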
Combining Text Classification and Fact Checking to Detect Fake News
Due to the widespread use of fake news in social and news media, fake news detection is an emerging research topic gaining attention in today's world. On these platforms, information spreads quickly but often inaccurately, so detection mechanisms must be able to flag news fast enough to combat the spread of fake news, which can otherwise harm individuals and society. Detecting fake news is therefore an important and technically challenging problem. The challenge is to use text classification to combat fake news: determining appropriate text classification methods and evaluating how well these methods distinguish between fake and non-fake news. Machine learning is helpful for building artificial intelligence systems based on tacit knowledge because it can help us solve complex problems from real-world data. For this reason, I propose that integrating text classification with fact checking of check-worthy statements can help detect fake news. I used text processing and three classifiers -- Passive Aggressive, Naïve Bayes, and Support Vector Machine -- to classify the news data. Text classification mainly focuses on extracting various features from texts and then incorporating these features into the classification. A major challenge in this area is the lack of an efficient method to distinguish fake from non-fake news, owing to the scarcity of suitable corpora. I applied the three classifiers to two publicly available datasets, and experimental analysis shows encouraging and improved performance.

Simple classification alone is not accurate enough to detect fake news, because general classification methods are not specialized for the task, so I added a system that checks the news in depth, sentence by sentence. Fact checking is a multi-step process that begins with the extraction of check-worthy statements. Identifying check-worthy statements is a subtask of the fact checking process whose automation would reduce the time and effort required to fact check a statement. In this thesis I propose an approach that classifies statements as check-worthy or not check-worthy while also taking into account the context around each statement. This work shows that including context contributes significantly to classification, while still using fairly general features to capture information from sentences. The results are analyzed by examining which features contribute most to classification and how well the approach performs overall. For this work, a dataset of political debates and speeches was created by consulting different fact checking organizations, and the capability of the approach is evaluated in this domain. The approach first extracts sentence and context features from the sentences and then classifies the sentences based on these features; the feature set and context features were selected after several experiments, based on how well they differentiate check-worthy statements.

Fact checking has received increasing attention since the 2016 United States presidential election, and many efforts have been made to develop a viable automated fact checking system. I introduce a web-based approach to fact checking that compares the full news text and headline with known facts such as names, locations, and places. The goal is an automated application that takes claims directly from mainstream news media websites and fact checks the news after applying the classification and fact checking components. For fact checking, a dataset of 2146 news articles labelled fake, non-fake, and unverified was constructed. I include forty mainstream news media sources to compare the results, as well as Wikipedia for double verification. This work shows that combining text classification and fact checking contributes considerably to the detection of fake news, while also using more general features to capture information from sentences.
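The two-stage design described above (a text classifier followed by a fact checking step that can also return "unverified") can be sketched end to end. This is a toy stand-in, not the thesis system: the keyword cues replace the trained Passive Aggressive / Naïve Bayes / SVM classifiers, and the small dictionary replaces the forty news sources and Wikipedia lookups; all data below is invented.

```python
# Hypothetical two-stage fake news pipeline, for illustration only.

FAKE_CUES = {"shocking", "miracle", "secret", "unbelievable"}

def classify(text):
    """Stage 1: crude stand-in for a trained text classifier."""
    words = set(text.lower().split())
    return "fake" if words & FAKE_CUES else "non-fake"

KNOWN_FACTS = {  # claim key -> verified value (toy "known facts" store)
    "capital of france": "paris",
    "president elected 2020": "biden",
}

def fact_check(claims):
    """Stage 2: compare extracted claims against the known-facts store.

    Returns 'non-fake' if every checkable claim matches, 'fake' if any
    contradicts a known fact, and 'unverified' if nothing can be checked.
    """
    checked = [KNOWN_FACTS[k] == v for k, v in claims.items()
               if k in KNOWN_FACTS]
    if not checked:
        return "unverified"
    return "non-fake" if all(checked) else "fake"

def verdict(text, claims):
    # Combined pipeline: either stage flagging 'fake' decides the label.
    c, f = classify(text), fact_check(claims)
    if c == "fake" or f == "fake":
        return "fake"
    return f  # 'non-fake' or 'unverified'

print(verdict("shocking secret cure discovered", {}))                       # -> fake
print(verdict("election results certified",
              {"president elected 2020": "biden"}))                         # -> non-fake
```

The three labels mirror the thesis dataset (fake, non-fake, unverified); the real system would extract claims from article text automatically rather than receiving them as a dictionary.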