107 research outputs found
Arabic Sentiment Analysis with Noisy Deep Explainable Model
Sentiment Analysis (SA) is an indispensable task for many real-world
applications. Most SA research is conducted for high-resource languages (e.g.,
English, Chinese) rather than low-resource languages (e.g., Arabic, Bengali).
Moreover, the reasons behind the predictions of Arabic sentiment analysis
methods that exploit advanced artificial intelligence (AI)-based approaches
are a black box and quite difficult to understand. This
paper proposes an explainable sentiment classification framework for the Arabic
language by introducing a noise layer on Bi-Directional Long Short-Term Memory
(BiLSTM) and Convolutional Neural Networks (CNN)-BiLSTM models to overcome
the over-fitting problem. The proposed framework can explain specific predictions
by training a local surrogate explainable model to understand why a particular
sentiment (positive or negative) is being predicted. We carried out experiments
on public benchmark Arabic SA datasets. The results show that adding noise
layers improves performance in sentiment analysis for the Arabic language
by reducing overfitting, and our method outperformed some known state-of-the-art
methods. In addition, the introduced explainability with the noise layer could make
the model more transparent and accountable and hence help the adoption of AI-enabled
systems in practice.
Comment: This is the pre-print version of our accepted paper at the 7th
International Conference on Natural Language Processing and Information
Retrieval (ACM NLPIR'2023)
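The abstract does not say what kind of noise the layer injects or where it sits in the network. As a minimal sketch, assuming zero-mean Gaussian noise added to activations during training only (a common regularisation choice), the idea might look like the following. The function name `gaussian_noise_layer` and the `stddev` value are hypothetical and not taken from the paper.

```python
import random


def gaussian_noise_layer(activations, stddev, training, rng=None):
    """Add zero-mean Gaussian noise to each activation, during training only.

    Assumption: Gaussian noise; the paper's abstract does not name the
    distribution or the layer's position in the network.
    """
    if not training or stddev == 0.0:
        # At inference time the layer acts as the identity.
        return list(activations)
    rng = rng or random.Random()
    return [a + rng.gauss(0.0, stddev) for a in activations]


# Toy usage: the same hidden vector with and without noise.
hidden = [0.2, -1.0, 0.7]
noisy = gaussian_noise_layer(hidden, stddev=0.1, training=True, rng=random.Random(0))
clean = gaussian_noise_layer(hidden, stddev=0.1, training=False)
```

Because the perturbation is only active in training mode, predictions at inference time stay deterministic while the model is discouraged from memorising exact activation values.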
Sentiment analysis of the Saudi Digital Library (SDL) tweets interactions
In July 2011, the Saudi Digital Library (SDL) created a Twitter account to serve as a primary means for customer interaction, support, and a Q&A page. The SDL account actively tweets about SDL news, recently added databases, and training venues, dates, and times. It is interesting to see SDL users interact with the SDL account on Twitter, but how beneficial is it? This study investigates the reactions of people who use the SDL to SDL tweets via Twitter, using a manual sentiment content analysis approach to analyze the interactions. The content analysis consists of counting the number of likes and retweets, checking whether the questions posted receive answers, and lastly categorizing the sentiment expressed in tweets as 'positive,' 'negative,' or 'neutral.' The students' interaction with SDL through Twitter ranges from positive to neutral. Students seem to like tweets about news and instructions about the SDL. However, students do not seem to find solutions to the problems they are having; instead, they are directed elsewhere to find help.
Arabic Opinion Mining Using a Hybrid Recommender System Approach
Recommender systems nowadays are playing an important role in the delivery of
services and information to users. Sentiment analysis (also known as opinion
mining) is the process of determining the attitude of textual opinions, whether
they are positive, negative or neutral. Data sparsity is a major
issue for recommender systems because of insufficient user ratings or the
absence of data about users or items. This research proposes a hybrid approach
combining sentiment analysis and recommender systems to tackle the
data sparsity problem by predicting product ratings from user reviews
using text mining and NLP techniques. This research focuses especially on
Arabic reviews, where the model is evaluated using the Opinion Corpus for Arabic
(OCA) dataset. Our system was efficient and achieved an accuracy of
nearly 85 percent in predicting ratings from reviews.
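The abstract does not describe how a review's sentiment is turned into a rating. A minimal sketch, assuming a sentiment score in [-1, 1] is already available for each review and is mapped linearly onto a 1 to 5 scale, shows how predicted ratings could fill gaps in a sparse rating table. The names `sentiment_to_rating` and `fill_missing_ratings` and the linear mapping are hypothetical, not the authors' method.

```python
def sentiment_to_rating(score, low=1, high=5):
    """Map a sentiment score in [-1, 1] linearly onto an integer rating scale.

    Assumption: a linear mapping; the abstract does not specify one.
    """
    score = max(-1.0, min(1.0, score))  # clamp out-of-range scores
    return round(low + (score + 1.0) / 2.0 * (high - low))


def fill_missing_ratings(ratings, review_scores):
    """Return a copy of `ratings` with gaps filled from review sentiment.

    `ratings` maps (user, item) to an explicit rating; `review_scores` maps
    (user, item) to a sentiment score. Explicit ratings are never overwritten.
    """
    filled = dict(ratings)
    for key, score in review_scores.items():
        if key not in filled:
            filled[key] = sentiment_to_rating(score)
    return filled


# Toy usage with made-up users and items.
ratings = {("u1", "i1"): 5}
review_scores = {("u1", "i1"): -1.0, ("u2", "i1"): 0.5}
filled = fill_missing_ratings(ratings, review_scores)
```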
A review on corpus annotation for Arabic sentiment analysis
Mining publicly available data for meaning and value is an important
research direction within social media analysis. Before a machine learning algorithm can classify collected textual data automatically, manual effort is needed to annotate the text, that is, to add a label to each data entry. Arabic is one of the languages in which sentiment analysis research is growing rapidly, despite limited resources and scarce annotated corpora. In this paper, we review the annotation processes carried out in that research. A total of 27 papers from the years 2010 to 2016 were reviewed.
Sentiment Analysis for micro-blogging platforms in Arabic
Sentiment Analysis (SA) concerns the automatic extraction and classification of
sentiments conveyed in a given text, i.e. labelling a text instance as positive, negative
or neutral. SA research has attracted increasing interest in the past few years due
to its numerous real-world applications. The recent interest in SA is also fuelled
by the growing popularity of social media platforms (e.g. Twitter), as they provide
large amounts of freely available and highly subjective content that can be readily
crawled.
Most previous SA work has focused on English with considerable success. In
this work, we focus on studying SA in Arabic, as a less-resourced language. This
work reports on a wide set of investigations for SA in Arabic tweets, systematically
comparing three existing approaches that have been shown to be successful in English.
Specifically, we report experiments evaluating fully-supervised-based (SL),
distant-supervision-based (DS), and machine-translation-based (MT) approaches for SA.
The investigations cover training SA models on manually-labelled (i.e. in SL methods)
and automatically-labelled (i.e. in DS methods) data-sets. In addition, we
explored an MT-based approach that utilises existing off-the-shelf SA systems for
English with no need for training data, assessing the impact of translation errors on
the performance of SA models, which has not been previously addressed for Arabic
tweets. Unlike previous work, we benchmark the trained models against an independent
test-set of >3.5k instances collected at different points in time to account
for topic-shift issues in the Twitter stream. Despite the challenging noisy medium
of Twitter and the mixed use of Dialectal and Standard forms of Arabic, we show
that our SA systems are able to attain performance scores on Arabic tweets that
are comparable to the state-of-the-art SA systems for English tweets.
The thesis also investigates the role of a wide set of features, including syntactic,
semantic, morphological, language-style and Twitter-specific features. We introduce
a set of affective-cues/social-signals features that capture information about the
presence of contextual cues (e.g. prayers, laughter, etc.) to correlate them with the
sentiment conveyed in an instance. Our investigations reveal a generally positive
impact for utilising these features for SA in Arabic. Specifically, we show that a rich
set of morphological features, which has not been previously used, extracted using
a publicly-available morphological analyser for Arabic can significantly improve the
performance of SA classifiers. We also demonstrate the usefulness of language-independent
features (e.g. Twitter-specific) for SA. Our feature-sets outperform
results reported in previous work on a previously built data-set.
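The abstract says the DS models are trained on automatically-labelled data but does not say which signals produce the labels. A minimal sketch, assuming emoticons and emoji serve as noisy labels (a common choice in distant supervision for tweets), might look like this. The marker sets and the function name `distant_label` are hypothetical.

```python
# Hypothetical marker sets; a real system would use a much larger inventory.
POSITIVE_MARKERS = {":)", ":-)", "😊", "😍"}
NEGATIVE_MARKERS = {":(", ":-(", "😢", "😡"}
ALL_MARKERS = POSITIVE_MARKERS | NEGATIVE_MARKERS


def distant_label(tweet):
    """Return (text_without_markers, label), or None if no usable signal.

    Only whitespace-separated markers are detected in this sketch.
    """
    tokens = tweet.split()
    has_pos = any(t in POSITIVE_MARKERS for t in tokens)
    has_neg = any(t in NEGATIVE_MARKERS for t in tokens)
    if has_pos == has_neg:
        # Either no marker at all or conflicting markers: skip the tweet.
        return None
    label = "positive" if has_pos else "negative"
    # Strip the markers so a classifier cannot simply learn the labelling rule.
    text = " ".join(t for t in tokens if t not in ALL_MARKERS)
    return text, label
```

Tweets without a marker, or with conflicting markers, are dropped rather than guessed, which trades coverage for cleaner labels.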
Sentiment analysis of dialectical Arabic social media content using a hybrid linguistic-machine learning approach
Despite the enormous increase in the number of Arabic posts on social networks, sentiment analysis research into extracting opinions from these posts lags behind that for the English language. This is largely attributed to the challenges in processing the morphologically complex Arabic language and the scarcity of Arabic NLP tools and resources. The task is further complicated when analysing dialectal Arabic, which does not abide by the formal grammatical structure. Based on the semantic modelling of the target domain’s knowledge and multi-factor lexicon-based sentiment analysis, the intent of this research is to use a hybrid approach, integrating linguistic and machine learning methods for sentiment analysis classification of dialectal Arabic. First, a dataset of dialectal Arabic tweets focusing on the unemployment domain was collected and annotated manually. The tweets cover different Arabic dialects in Saudi Arabia, for which a comprehensive Arabic sentiment lexicon was constructed. This approach to sentiment analysis also integrated a novel light stemming mechanism towards improved Saudi dialectal Arabic stemming. Subsequently, a novel multi-factor lexicon-based sentiment analysis algorithm was developed for domain-specific social media posts written in dialectal Arabic. The algorithm considers several factors (emoji, intensifiers, negations, supplications) to improve the accuracy of the classifications. These techniques were applied to a central problem of sentiment analysis in dialectal Arabic in order to assess analytical performance across social media channels, which are vulnerable to semantic and colloquial variations.
Finally, this study presented a new hybrid approach to sentiment analysis in which domain knowledge is utilised in two ways to combine computational linguistics and machine learning: the first method integrates the problem domain semantic knowledge base into the machine learning training feature set, while the second uses the outcome of the lexicon-based sentiment classification in the training of the machine learning methods. By integrating these techniques into a single, hybridised solution, a greater degree of accuracy and consistency was achieved than by applying each approach independently, confirming a pragmatic solution to sentiment classification in dialectal Arabic text.
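The abstract lists the factors the lexicon-based algorithm considers but not how they are combined. A minimal sketch, assuming negations flip and intensifiers scale the next sentiment word while emoji contribute directly, illustrates the general idea. The toy English lexicon, the weights, and the function names are hypothetical, and supplications are left out for brevity.

```python
# Hypothetical toy resources; the study built a dialectal Arabic lexicon instead.
LEXICON = {"good": 1.0, "happy": 1.0, "bad": -1.0, "sad": -1.0}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
NEGATIONS = {"not", "never"}
EMOJI = {"😊": 1.0, "😡": -1.0}


def lexicon_score(tokens):
    """Sum word polarities, letting modifiers act on the next sentiment word."""
    score = 0.0
    weight, negate = 1.0, False  # modifiers waiting for a sentiment word
    for tok in tokens:
        if tok in NEGATIONS:
            negate = not negate
        elif tok in INTENSIFIERS:
            weight *= INTENSIFIERS[tok]
        elif tok in EMOJI:
            # Emoji contribute directly and are not affected by modifiers.
            score += EMOJI[tok]
        elif tok in LEXICON:
            polarity = LEXICON[tok] * weight
            score += -polarity if negate else polarity
            weight, negate = 1.0, False  # modifiers are consumed
    return score


def classify(tokens):
    """Turn the aggregate score into a three-way sentiment label."""
    score = lexicon_score(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In the spirit of the second hybrid method described above, a score like this could also be appended to the feature vector of a machine learning classifier rather than used as the final decision.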