2,085 research outputs found
A Multimodal Approach to Sarcasm Detection on Social Media
In recent times, a major share of human communication takes place online, mainly because of the ease of communication on social networking sites (SNSs). Due to the variety and large number of users, SNSs have drawn the attention of the computer science (CS) community, particularly the affective computing (also known as emotional AI), information retrieval, natural language processing, and data mining groups. Researchers are trying to make computers understand the nuances of human communication, including sentiment and sarcasm. Emotion or sentiment detection requires more insight into the communication than factual information retrieval does. Sarcasm detection is more difficult still than categorizing sentiment because, in sarcasm, the intended meaning of the user's expression is opposite to its literal meaning. Because of this complex nature, it is often difficult even for humans to detect sarcasm without proper context. However, people on social media succeed in detecting sarcasm despite interacting with strangers across the world. That motivates us to investigate the human process of detecting sarcasm on social media, where abundant context information is often unavailable and the users communicating with each other are rarely well acquainted. We conducted a qualitative study to examine the patterns by which users convey sarcasm on social media. Whereas most sarcasm detection systems work on a word-by-word basis, we focused on the holistic sentiment conveyed by the post. We argue that relying on word-level information limits a system's performance to the domain of the dataset used to train it, and that such a system might not perform well for non-English languages. To make our system less dependent on text data, we proposed a multimodal approach to sarcasm detection. We showed the applicability of images and reaction emoticons as additional sources of hints about the sentiment of the post.
Our research showed that a multimodal approach yields superior results compared to a unimodal approach. Multimodal sarcasm detection systems such as the one presented in this research might, with the inclusion of more modes or sources of data, lead to a better sarcasm detection model.
Stance detection on social media: State of the art and trends
Stance detection on social media is an emerging opinion mining paradigm for
various social and political applications in which sentiment analysis may be
sub-optimal. There has been growing research interest in developing
effective stance detection methods across multiple
communities, including natural language processing, web science, and social
computing. This paper surveys the work on stance detection within those
communities and situates its usage within current opinion mining techniques in
social media. It presents an exhaustive review of stance detection techniques
on social media, including the task definition, different types of targets in
stance detection, feature sets used, and various machine learning approaches
applied. The survey reports state-of-the-art results on the existing benchmark
datasets on stance detection, and discusses the most effective approaches. In
addition, this study explores the emerging trends and different applications of
stance detection on social media. The study concludes by discussing the gaps in
existing research and highlights possible future directions for
stance detection on social media. Comment: We request withdrawal of this article sincerely. We will re-edit this
paper. Please withdraw this article before we finish the new version.
Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021
Automatic detection of fake news is a highly important task in the
contemporary world. This study reports on the second shared task,
UrduFake@FIRE2021, on fake news detection in Urdu. The goal of the
shared task is to motivate the community to come up with efficient methods for
solving this vital problem, particularly for the Urdu language. The task is
posed as a binary classification problem to label a given news article as a
real or a fake news article. The organizers provide a dataset comprising news
in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and
(v) Business, split into training and testing sets. The training set contains
1300 annotated news articles -- 750 real news, 550 fake news, while the testing
set contains 300 news articles -- 200 real, 100 fake news. 34 teams from 7
different countries (China, Egypt, Israel, India, Mexico, Pakistan, and UAE)
registered to participate in the UrduFake@FIRE2021 shared task. Out of those,
18 teams submitted their experimental results, and 11 of those submitted their
technical reports, which is substantially higher compared to the UrduFake
shared task in 2020 when only 6 teams submitted their technical reports. The
technical reports submitted by the participants demonstrated different data
representation techniques ranging from count-based BoW features to word vector
embeddings as well as the use of numerous machine learning algorithms ranging
from traditional SVM to various neural network architectures including
Transformers such as BERT and RoBERTa. In this year's competition, the best
performing system obtained an F1-macro score of 0.679, which is lower than the
past year's best result of 0.907 F1-macro. Admittedly, while training sets from
the past and the current years overlap to a large extent, the testing set
provided this year is completely different.
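As an illustration of the F1-macro metric reported in this abstract, the following is a minimal sketch in plain Python. The function names, the label strings "real" and "fake", and the majority-class predictions are assumptions made for illustration; only the 200 real / 100 fake test split comes from the abstract. It is not the organizers' evaluation script.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=("real", "fake")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical test set with the class split stated in the abstract.
y_true = ["real"] * 200 + ["fake"] * 100

# Hypothetical classifier that always predicts the majority class.
majority_pred = ["real"] * 300

print(macro_f1(y_true, majority_pred))
print(macro_f1(y_true, y_true))
```

Because the macro average weights both classes equally, a classifier that ignores the minority "fake" class is penalized heavily even though two thirds of its predictions are correct.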
Social media bot detection with deep learning methods: a systematic review
Social bots are automated social media accounts governed by software and controlled by humans at the backend. Some bots serve good purposes, such as automatically posting news or providing help during emergencies. Nevertheless, bots have also been used for malicious purposes, such as posting fake news, spreading rumours, or manipulating political campaigns. Existing mechanisms allow malicious bots to be detected and removed automatically. However, the bot landscape changes as bot creators use more sophisticated methods to avoid detection. Therefore, new mechanisms for discerning between legitimate and bot accounts are much needed. Over the past few years, a few review studies have contributed to social media bot detection research by presenting comprehensive surveys of various detection methods, including cutting-edge solutions such as machine learning (ML) and deep learning (DL) techniques. This paper, to the best of our knowledge, is the first to focus solely on DL techniques and to compare their motivation and effectiveness among themselves and against other methods, especially traditional ML ones. We present a refined taxonomy of the features used in DL studies and details about the associated pre-processing strategies required to prepare suitable training data for a DL model. We summarize the gaps addressed by the review papers that mention DL/ML studies to provide future directions in this field. Overall, DL techniques turn out to be computation- and time-efficient techniques for social bot detection, with performance better than or comparable to that of traditional ML techniques.
Mapping (Dis-)Information Flow about the MH17 Plane Crash
Digital media enables not only fast sharing of information, but also
disinformation. One prominent case of an event leading to circulation of
disinformation on social media is the MH17 plane crash. Studies analysing the
spread of information about this event on Twitter have focused on small,
manually annotated datasets, or used proxies for data annotation. In this work,
we examine to what extent text classifiers can be used to label data for
subsequent content analysis. In particular, we focus on predicting pro-Russian
and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though
we find that a neural classifier improves over a hashtag-based baseline,
labeling pro-Russian and pro-Ukrainian content with high precision remains a
challenging problem. We provide an error analysis underlining the difficulty of
the task and identify factors that might help improve classification in future
work. Finally, we show how the classifier can facilitate the annotation task
for human annotators.
Rumor Stance Classification in Online Social Networks: A Survey on the State-of-the-Art, Prospects, and Future Challenges
The emergence of the Internet as a ubiquitous technology has facilitated the
rapid evolution of social media as the leading virtual platform for
communication, content sharing, and information dissemination. Despite
revolutionizing the way news is delivered to people, this technology
has also brought inevitable drawbacks. One such drawback is
the spread of rumors facilitated by social media platforms, which may provoke
doubt and fear among people. Therefore, debunking rumors before they
spread widely has become all the more essential. Over the years, many studies
have been conducted to develop effective rumor verification systems. One aspect
of such studies focuses on rumor stance classification, which concerns the task
of utilizing users' viewpoints about a rumorous post to better predict the
veracity of a rumor. Relying on users' stances in the rumor verification task has
gained great importance, for it has yielded significant improvements in model
performance. In this paper, we conduct a comprehensive literature review on
rumor stance classification in complex social networks. In particular, we
present a thorough description of the approaches and mark the top performances.
Moreover, we introduce multiple datasets available for this purpose and
highlight their limitations. Finally, some challenges and future directions are
discussed to stimulate further relevant research efforts. Comment: 13 pages, 2 figures, journal
Psychographic Traits Identification based on political ideology: An author analysis study on Spanish politicians' tweets posted in 2020
In general, people are usually more reluctant to follow advice and directions from politicians who do not share their ideology. In extreme cases, people can be heavily biased in favour of one political party while sharply disagreeing with others, which may lead to irrational decision making and can put people's lives at risk when certain recommendations from the authorities are ignored. Therefore, considering political ideology as a psychographic trait can improve political micro-targeting by helping public authorities and local governments adopt better communication policies during crises. In this work, we explore the reliability of determining psychographic traits concerning political ideology. Our contribution is twofold. On the one hand, we release PoliCorpus-2020, a dataset composed of Spanish politicians' tweets posted in 2020. On the other hand, we conduct two authorship analysis tasks with this dataset: an author profiling task to extract demographic and psychographic traits, and an authorship attribution task to determine the author of an anonymous text in the political domain. Both experiments are evaluated with several neural network architectures grounded on explainable linguistic features, statistical features, and state-of-the-art transformers. In addition, we test whether the neural network models can be transferred to detect the political ideology of citizens. Our results indicate that linguistic features are good indicators for identifying fine-grained political affiliation, that they boost the performance of neural network models when combined with embedding-based features, and that they preserve relevant information when the models are tested on ordinary citizens. In addition, we found that lexical and morphosyntactic features are more effective for author profiling, whereas stylometric features are more effective for authorship attribution.
Investigating Online Financial Misinformation and Its Consequences: A Computational Perspective
The rapid dissemination of information through digital platforms has
revolutionized the way we access and consume news and information, particularly
in the realm of finance. However, this digital age has also given rise to an
alarming proliferation of financial misinformation, which can have detrimental
effects on individuals, markets, and the overall economy. This research paper
aims to provide a comprehensive survey of online financial misinformation,
including its types, sources, and impacts. We first discuss the characteristics
and manifestations of financial misinformation, encompassing false claims and
misleading content. We explore various case studies that illustrate the
detrimental consequences of financial misinformation on the economy. Finally,
we highlight the potential impact and implications of detecting financial
misinformation. Early detection and mitigation strategies can help protect
investors, enhance market transparency, and preserve financial stability. We
emphasize the importance of greater awareness, education, and regulation to
address the issue of online financial misinformation and safeguard individuals
and businesses from its harmful effects. In conclusion, this research paper
sheds light on the pervasive issue of online financial misinformation and its
wide-ranging consequences. By understanding the types, sources, and impacts of
misinformation, stakeholders can work towards implementing effective detection
and prevention measures to foster a more informed and resilient financial
ecosystem. Comment: 32 pages, 2 figures