498 research outputs found
Semantic Sentiment Analysis of Twitter Data
Internet and the proliferation of smart mobile devices have changed the way
information is created, shared, and spreads, e.g., microblogs such as Twitter,
weblogs such as LiveJournal, social networks such as Facebook, and instant
messengers such as Skype and WhatsApp are now commonly used to share thoughts
and opinions about anything in the surrounding world. This has resulted in the
proliferation of social media content, thus creating new opportunities to study
public opinion at a scale that was never possible before. Naturally, this
abundance of data has quickly attracted business and research interest from
various fields including marketing, political science, and social studies,
among many others, which are interested in questions like these: Do people like
the new Apple Watch? Do Americans support ObamaCare? How do Scottish feel about
the Brexit? Answering these questions requires studying the sentiment of
opinions people express in social media, which has given rise to the fast
growth of the field of sentiment analysis in social media, with Twitter being
especially popular for research due to its scale, representativeness, variety
of topics discussed, as well as ease of public access to its messages. Here we
present an overview of work on sentiment analysis on Twitter.Comment: Microblog sentiment analysis; Twitter opinion mining; In the
Encyclopedia on Social Network Analysis and Mining (ESNAM), Second edition.
201
A Deep Learning Approach for Robust Detection of Bots in Twitter Using Transformers
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other worksDuring the last decades, the volume of multimedia content posted in social networks has grown exponentially and such information is immediately propagated and consumed by a significant number of users. In this scenario, the disruption of fake news providers and bot accounts for spreading propaganda information as well as sensitive content throughout the network has fostered applied researh to automatically measure the reliability of social networks accounts via Artificial Intelligence (AI). In this paper, we present a multilingual approach for addressing the bot identification task in Twitter via Deep learning (DL) approaches to support end-users when checking the credibility of a certain Twitter account. To do so, several experiments were conducted using state-of-the-art Multilingual Language Models to generate an encoding of the text-based features of the user account that are later on concatenated with the rest of the metadata to build a potential input vector on top of a Dense Network denoted as Bot-DenseNet. Consequently, this paper assesses the language constraint from previous studies where the encoding of the user account only considered either the metadatainformation or the metadata information together with some basic semantic text features. Moreover, the Bot-DenseNet produces a low-dimensional representation of the user account which can be used for any application within the Information Retrieval (IR) framewor
PREDICTING COLLECTIVE VIOLENCE FROM COORDINATED HOSTILE INFORMATION CAMPAIGNS IN SOCIAL MEDIA
The ability to predict conflicts prior to their occurrence can help deter the outbreak of collective violence and avoid human suffering. Existing approaches use statistical and machine learning models, and even social network analysis techniques; however, they are generally confined to long-range predictions in specific regions and are based on only a few languages. Understanding collective violence from signals in multiple or mixed languages in social media remains understudied. In this work, we construct a multilingual language model (MLLM) that can accept input from any language in social media, a model that is language-agnostic in nature. The purpose of this study is twofold. First, it aims to collect a multilingual violence corpus from archived Twitter data using a proposed set of heuristics that account for spatial-temporal features around past and future violent events. And second, it attempts to compare the performance of traditional machine learning classifiers against deep learning MLLMs for predicting message classes linked to past and future occurrences of violent events. Our findings suggest that MLLMs substantially outperform traditional ML models in predictive accuracy. One major contribution of our work is that military commands now have a tool to evaluate and learn the language of violence across all human languages. Finally, we made the data, code, and models publicly available.Outstanding ThesisCommander, Ecuadorian NavyApproved for public release. Distribution is unlimited
Exploiting Emotions via Composite Pretrained Embedding and Ensemble Language Model
Decisions in the modern era are based on more than just the available data; they also incorporate feedback from online sources. Processing reviews known as Sentiment analysis (SA) or Emotion analysis. Understanding the user's perspective and routines is crucial now-a-days for multiple reasons. It is used by both businesses and governments to make strategic decisions. Various architectural and vector embedding strategies have been developed for SA processing. Accurate representation of text is crucial for automatic SA. Due to the large number of languages spoken and written, polysemy and syntactic or semantic issues were common. To get around these problems, we developed effective composite embedding (ECE), a method that combines the advantages of vector embedding techniques that are either context-independent (like glove & fasttext) or context-aware (like XLNet) to effectively represent the features needed for processing. To improve the performace towards emotion or sentiment we proposed stacked ensemble model of deep lanugae models.ECE with Ensembled model is evaluated on balanced dataset to prove that it is a reliable embedding technique and a generalised model for SA.In order to evaluate ECE, cutting-edge ML and Deep net language models are deployed and comapared. The model is evaluated using benchmark datset such as MR, Kindle along with realtime tweet dataset of user complaints . LIME is used to verify the model's predictions and to provide statistical results for sentence.The model with ECE embedding provides state-of-art results with real time dataset as well
BERT-Deep CNN: State-of-the-Art for Sentiment Analysis of COVID-19 Tweets
The free flow of information has been accelerated by the rapid development of
social media technology. There has been a significant social and psychological
impact on the population due to the outbreak of Coronavirus disease (COVID-19).
The COVID-19 pandemic is one of the current events being discussed on social
media platforms. In order to safeguard societies from this pandemic, studying
people's emotions on social media is crucial. As a result of their particular
characteristics, sentiment analysis of texts like tweets remains challenging.
Sentiment analysis is a powerful text analysis tool. It automatically detects
and analyzes opinions and emotions from unstructured data. Texts from a wide
range of sources are examined by a sentiment analysis tool, which extracts
meaning from them, including emails, surveys, reviews, social media posts, and
web articles. To evaluate sentiments, natural language processing (NLP) and
machine learning techniques are used, which assign weights to entities, topics,
themes, and categories in sentences or phrases. Machine learning tools learn
how to detect sentiment without human intervention by examining examples of
emotions in text. In a pandemic situation, analyzing social media texts to
uncover sentimental trends can be very helpful in gaining a better
understanding of society's needs and predicting future trends. We intend to
study society's perception of the COVID-19 pandemic through social media using
state-of-the-art BERT and Deep CNN models. The superiority of BERT models over
other deep models in sentiment analysis is evident and can be concluded from
the comparison of the various research studies mentioned in this article.Comment: 20 pages, 5 figure
Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation
Data augmentation is proven to be effective in many NLU tasks, especially for
those suffering from data scarcity. In this paper, we present a powerful and
easy to deploy text augmentation framework, Data Boost, which augments data
through reinforcement learning guided conditional generation. We evaluate Data
Boost on three diverse text classification tasks under five different
classifier architectures. The result shows that Data Boost can boost the
performance of classifiers especially in low-resource data scenarios. For
instance, Data Boost improves F1 for the three tasks by 8.7% on average when
given only 10% of the whole data for training. We also compare Data Boost with
six prior text augmentation methods. Through human evaluations (N=178), we
confirm that Data Boost augmentation has comparable quality as the original
data with respect to readability and class consistency.Comment: In proceedings of the 2020 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2020). Onlin
- …