158 research outputs found
Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM
Sentiment analysis on large-scale social media data is important to bridge
the gaps between social media contents and real world activities including
political election prediction, individual and public emotional status
monitoring and analysis, and so on. Although textual sentiment analysis has
been well studied on platforms such as Twitter and Instagram, analysis of
the role of extensive emoji use in sentiment analysis remains limited. In this
paper, we propose a novel scheme for Twitter sentiment analysis with extra
attention on emojis. We first learn bi-sense emoji embeddings under positive
and negative sentimental tweets individually, and then train a sentiment
classifier by attending on these bi-sense emoji embeddings with an
attention-based long short-term memory network (LSTM). Our experiments show
that the bi-sense embedding is effective for extracting sentiment-aware
embeddings of emojis and outperforms the state-of-the-art models. We also
visualize the attentions to show that the bi-sense emoji embedding provides
better guidance on the attention mechanism to obtain a more robust
understanding of the semantics and sentiments.
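Classification of Responses to COVID-19 Vaccination Based on Tweets Using Attention-Based Long Short-Term Memory (Klasifikasi Respons Terhadap Vaksinasi Covid-19 Berdasarkan Tweets Menggunakan Attention-Based Long Short Term Memory)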
Social media makes it easy for people to obtain information and to voice opinions, suggestions, or criticism about particular events. COVID-19 vaccination in Indonesia, which is being widely discussed and has drawn varied responses from the public, both for and against, offers an opportunity to analyze those responses. To support this analysis, responses from the Indonesian public to COVID-19 vaccination are classified into three classes: negative, neutral, and positive. For the classification, the Attention-based Long Short-Term Memory (A-LSTM) method is implemented. In addition, this study uses Bidirectional Encoder Representations from Transformers (BERT) in the tokenization step to obtain feature representations of the tweet data, which supports the A-LSTM training process. Evaluation is carried out on a dataset of Indonesian-language tweets from Twitter, collected from the time the COVID-19 vaccination issue was raised in Indonesia. The method performs well, with an accuracy of 82%.
A Survey of Sentiment Analysis and Sarcasm Detection: Challenges, Techniques, and Trends
In recent years, more people have been using the internet and social media to express their opinions on various subjects, such as institutions, services, or specific ideas. This increase highlights the importance of developing automated tools for accurate sentiment analysis. Moreover, addressing sarcasm in text is crucial, as it can significantly impact the efficacy of sentiment analysis models. This paper aims to provide a comprehensive overview of the conducted research on sentiment analysis and sarcasm detection, focusing on the time from 2018 to 2023. It explores the challenges faced and the methods used to address them. It conducts a comparison of these methods. It also aims to identify emerging trends that will likely influence the future of sentiment analysis and sarcasm detection, ensuring their continued effectiveness. This paper enhances the existing knowledge by offering a comprehensive analysis of 40 research works, evaluating performance, addressing multilingual challenges, and highlighting future trends in sarcasm detection and sentiment analysis. It is a valuable resource for researchers and experts interested in the field, facilitating further advancements in sentiment analysis techniques and applications. It categorizes sentiment analysis methods into ML, lexical, and hybrid approaches, highlighting deep learning, especially Recurrent Neural Networks (RNNs), for effective textual classification with labeled or unlabeled data
Understanding Emojis for Financial Sentiment Analysis
Social media content has been widely used for financial forecasting and sentiment analysis. However, emojis, as a new “lingua franca” on social media, are often omitted during standard data pre-processing; we thus speculate that they may carry additional useful information. In this research, we study the effect of emojis in facilitating financial sentiment analysis and explore the most effective way to handle them during model training. Experiments are conducted on two datasets from stock and crypto markets. Various machine learning models, deep learning models, and the state-of-the-art GPT-based model are used, and we compare their performance across different emoji encodings. Results show a consistent increase in model performance when emojis are converted to their descriptive phrases, and significant enhancements after refining the descriptive terms of the most important emojis before feeding them into the models. Our research shows that emojis are a valuable source for better understanding financial social media texts and should not be omitted.
Constructing Colloquial Dataset for Persian Sentiment Analysis of Social Microblogs
Introduction: Microblogging websites have amassed rich data sources for
sentiment analysis and opinion mining. In this regard, sentiment classification
has frequently proven inefficient because microblog posts typically lack
syntactically consistent terms and representatives since users on these social
networks do not like to write lengthy statements. Also, there are some
limitations to low-resource languages. The Persian language has exceptional
characteristics and demands unique annotated data and models for the sentiment
analysis task, which are distinct from the text features of the English
language. Method: This paper first constructs a user opinion dataset called
ITRC-Opinion in a collaborative environment and in an insourced manner. Our dataset
contains 60,000 informal and colloquial Persian texts from social microblogs
such as Twitter and Instagram. Second, this study proposes a new deep
convolutional neural network (CNN) model for more effective sentiment analysis
of colloquial text in social microblog posts. The constructed datasets are used
to evaluate the presented model. Furthermore, some models, such as LSTM,
CNN-RNN, BiLSTM, and BiGRU with different word embeddings, including fastText,
GloVe, and Word2vec, were applied to our dataset and their results evaluated.
Results: The results demonstrate the benefit of our dataset and the proposed
model (72% accuracy), displaying meaningful improvement in sentiment
classification performance
Computational Sarcasm Analysis on Social Media: A Systematic Review
Sarcasm can be defined as saying or writing the opposite of what one truly
wants to express, usually to insult, irritate, or amuse someone. Because of the
obscure nature of sarcasm in textual data, detecting it is difficult and of
great interest to the sentiment analysis research community. Though the
research in sarcasm detection spans more than a decade, some significant
advancements have been made recently, including employing unsupervised
pre-trained transformers in multimodal environments and integrating context to
identify sarcasm. In this study, we aim to provide a brief overview of recent
advancements and trends in computational sarcasm research for the English
language. We describe relevant datasets, methodologies, trends, issues,
challenges, and tasks relating to sarcasm that are beyond detection. Our study
provides well-summarized tables of sarcasm datasets, sarcastic features and
their extraction methods, and performance analysis of various approaches which
can help researchers in related domains understand current state-of-the-art
practices in sarcasm detection. Comment: 50 pages, 3 tables, Submitted to 'Data Mining and Knowledge
Discovery' for possible publication