Combining Sentiment Lexicons of Arabic Terms
Lexicons are dictionaries of sentiment words and their matching polarity. Some comprise words that are numerically scored based on the degree of positivity/negativity of the underlying sentiments. The ranges of scores differ since each lexicon has its own scoring process. Others use labelled words instead of scores with polarity tags (i.e., positive/negative/neutral). Lexicons are important in text mining and sentiment analysis which compels researchers to develop and publish them. Larger lexicons better train sentiment models thereby classifying sentiments in text more accurately. Hence, it is useful to combine the various available lexicons. Nevertheless, there exist many duplicates, overlaps and contradictions between these lexicons. In this paper, we define a method to combine different lexicons. We used the method to normalize and unify lexicon items and merge duplicated lexicon items from twelve lexicons for (in)formal Arabic. This resulted in a coherent Arabic sentiment lexicon with the largest number of terms
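The abstract does not spell out the combination method, so the following is only a minimal sketch of the general idea it describes: rescale scored lexicons to a common range, map polarity tags to numbers, merge duplicate terms, and flag contradictions. The linear rescaling to [-1, 1], the averaging of duplicates, the function names, and the three toy lexicons are all assumptions made for illustration, not the authors' procedure or data.

```python
from collections import defaultdict

# Assumed mapping from polarity tags to the common numeric scale.
LABEL_TO_SCORE = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def normalize(value, lo, hi):
    """Linearly rescale a score from the lexicon's range [lo, hi] to [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0


def combine(scored_lexicons, labelled_lexicons):
    """Merge lexicons into one term -> score mapping.

    scored_lexicons: list of (entries, lo, hi), entries being term -> raw score.
    labelled_lexicons: list of dicts mapping term -> polarity tag.
    Returns (merged, contradictions); duplicates are averaged (an assumption),
    and a term is a contradiction if its sources disagree in sign.
    """
    collected = defaultdict(list)
    for entries, lo, hi in scored_lexicons:
        for term, value in entries.items():
            collected[term].append(normalize(value, lo, hi))
    for entries in labelled_lexicons:
        for term, label in entries.items():
            collected[term].append(LABEL_TO_SCORE[label])

    merged, contradictions = {}, set()
    for term, scores in collected.items():
        merged[term] = sum(scores) / len(scores)
        if min(scores) < 0 < max(scores):
            contradictions.add(term)
    return merged, contradictions


# Hypothetical toy lexicons with different conventions.
lex_a = ({"جميل": 4.0, "سيء": 1.0}, 1.0, 5.0)    # scored on a 1..5 scale
lex_b = ({"جميل": 0.5, "ممل": -0.5}, -1.0, 1.0)  # scored on a -1..1 scale
lex_c = {"سيء": "negative", "ممل": "positive"}   # polarity tags only

merged, contradictions = combine([lex_a, lex_b], [lex_c])
```

In this toy run the term tagged positive in one source and scored negative in another ends up in `contradictions`, which is the kind of conflict a real merge would need an explicit policy for.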
What's in a letter?
Sentiment analysis is a burgeoning field in natural language processing used to extract and categorize opinion in evaluative documents. We look at recommendation letters, which pose unique challenges to standard sentiment analysis systems. Our dataset is eighteen letters from applications to UMass Worcester Memorial Medical Center’s residency program in Obstetrics and Gynecology. Given a small dataset, we develop a method intended for use by domain experts to systematically explore their intuitions about the topical make-up of documents on which they make critical decisions. By leveraging WordNet and the WordNet Propagation algorithm, the method allows a user to develop topic seed sets from real data and propagate them into robust lexicons for use on new data. We show how one pass through the method yields useful feedback to our beliefs about the make-up of recommendation letters. At the end, future directions are outlined which assume a fuller dataset
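As a rough illustration of growing a topic lexicon from a seed set, the sketch below spreads seed weights across a synonym graph with a decay per hop. It is a simplified stand-in, not the WordNet Propagation algorithm itself: the hand-written `TOY_SYNONYMS` graph replaces real WordNet lookups, and the `max_hops` and `decay` parameters are assumptions chosen for the example.

```python
from collections import deque


def propagate(seeds, synonyms, max_hops=2, decay=0.5):
    """Breadth-first expansion of a seed set over a synonym graph.

    Seeds get weight 1.0; each newly reached term gets its parent's weight
    multiplied by `decay`. Terms more than `max_hops` away are not added.
    """
    weights = {term: 1.0 for term in seeds}
    queue = deque((term, 0) for term in seeds)
    while queue:
        term, hops = queue.popleft()
        if hops == max_hops:
            continue
        for neighbor in synonyms.get(term, ()):
            if neighbor not in weights:
                weights[neighbor] = weights[term] * decay
                queue.append((neighbor, hops + 1))
    return weights


# Hypothetical synonym graph standing in for WordNet relations.
TOY_SYNONYMS = {
    "diligent": ["hardworking", "industrious"],
    "hardworking": ["tireless"],
    "tireless": ["unflagging"],
}

lexicon = propagate({"diligent"}, TOY_SYNONYMS)
```

With two hops allowed, "tireless" is reached at reduced weight while "unflagging", three hops from the seed, is left out.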
An analysis of customer perception using lexicon-based sentiment analysis of Arabic Texts framework.
Sentiment Analysis (SA) employing Natural Language Processing (NLP) is pivotal in determining the positivity and negativity of customer feedback. Although significant research in SA is focused on English texts, there is a growing demand for SA in other widely spoken languages, such as Arabic. This is predominantly due to the global reach of social media, which enables users to express opinions on products in any language and, in turn, necessitates a thorough understanding of customers' perceptions of new products based on social media conversations. However, existing studies offer little text analysis of how Arabic-speaking customers perceive coffee and coffee products. Therefore, this study proposes a comprehensive Lexicon-based Sentiment Analysis on Arabic Texts (LSAnArTe) framework applied to social media data, to understand customer perceptions of coffee, a widely consumed product in the Arabic-speaking world. The LSAnArTe framework incorporates the existing AraSenTi dictionary, an Arabic database of sentiment scores for Arabic words, and lemmatizes unknown words using the Qalasadi open platform. It classifies each word as positive, negative or neutral before conducting sentence-level sentiment classification. Data collected from X (formerly known as Twitter), yielding a cleaned dataset of 10,769 tweets, are used to validate the proposed framework, which is then compared with Amazon Comprehend. The dataset was annotated manually to ensure maximum accuracy and reliability in validating the proposed LSAnArTe framework. The results revealed that the proposed LSAnArTe framework, with an accuracy score of 93.79 %, outperformed the Amazon Comprehend tool, which had an accuracy of 51.90 %.
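The pipeline described in this abstract (dictionary lookup, lemmatizing unknown words, word-level polarity, then sentence-level classification) can be sketched roughly as follows. Everything here is a stand-in: `TOY_LEXICON` is a hypothetical four-word dictionary rather than AraSenTi, `toy_lemmatize` only strips a leading definite article and is not the Qalasadi platform, and summing word polarities is an assumed aggregation rule, since the abstract does not state how word labels are combined.

```python
# Hypothetical toy dictionary: term -> polarity (+1 positive, -1 negative).
TOY_LEXICON = {"لذيذ": 1, "رائع": 1, "سيء": -1, "مر": -1}


def toy_lemmatize(word):
    """Placeholder lemmatizer: strips a leading definite article only."""
    if word.startswith("ال") and len(word) > 3:
        return word[2:]
    return word


def word_polarity(word, lexicon):
    """Look the word up directly; fall back to its lemma; default to neutral (0)."""
    if word in lexicon:
        return lexicon[word]
    return lexicon.get(toy_lemmatize(word), 0)


def classify_sentence(sentence, lexicon=TOY_LEXICON):
    """Label a sentence by the sign of the summed word polarities (assumed rule)."""
    total = sum(word_polarity(word, lexicon) for word in sentence.split())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


positive_example = classify_sentence("هذا البن لذيذ")
negative_example = classify_sentence("الطعم سيء")
neutral_example = classify_sentence("هذا البن")
lemma_example = classify_sentence("الرائع")
```

The last call shows the fallback path: the surface form is not in the dictionary, but its lemma is, so the word still contributes a positive polarity.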
MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction
Moral rhetoric plays a fundamental role in how we perceive and interpret the
information we receive, greatly influencing our decision-making process.
Especially when it comes to controversial social and political issues, our
opinions and attitudes are hardly ever based on evidence alone. The Moral
Foundations Dictionary (MFD) was developed to operationalize moral values in
the text. In this study, we present MoralStrength, a lexicon of approximately
1,000 lemmas, obtained as an extension of the Moral Foundations Dictionary,
based on WordNet synsets. Moreover, for each lemma it provides a
crowdsourced numeric assessment of Moral Valence, indicating the strength with
which a lemma expresses the specific value. We evaluated the predictive
potential of this moral lexicon, defining three utilization approaches of
increasing complexity, ranging from lemmas' statistical properties to a deep
learning approach of word embeddings based on semantic similarity. Logistic
regression models trained on the features extracted from MoralStrength
significantly outperformed the current state-of-the-art, reaching an F1-score
of 87.6% over the previous 62.4% (p-value<0.01), and an average F1-Score of
86.25% over six different datasets. Such findings pave the way for further
research, allowing for an in-depth understanding of moral narratives in text
for a wide range of social issues
Sentiment analytics: Lexicons construction and analysis
With the increasing amount of text data, sentiment analysis (SA) is becoming more and more important. An automated approach is needed to parse online reviews and comments and analyze their sentiments. Since the lexicon is the most important component in SA, enhancing the quality of lexicons will improve the efficiency and accuracy of sentiment analysis. This research examines the effect of coupling a general lexicon with a specialized lexicon (for a specific domain) and its impact on sentiment analysis. Two special domains and one general domain were studied. The two special domains are the petroleum domain and the biology domain. The general domain is the social network domain. The specialized lexicon for the petroleum domain was created as part of this research. The results, as expected, show that coupling a general lexicon with a specialized lexicon improves the sentiment analysis. However, coupling a general lexicon with another general lexicon does not improve the sentiment analysis.
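One simple way to picture the coupling of a general lexicon with a specialized one is shown below. The abstract does not say how the two are combined, so letting domain entries override general ones is an assumption, and the `GENERAL` and `PETROLEUM` dictionaries are hypothetical toy data, not the lexicons built in the research.

```python
def couple(general, specialized):
    """Combine two lexicons; specialized entries take precedence (assumed rule)."""
    coupled = dict(general)
    coupled.update(specialized)
    return coupled


def score(text, lexicon):
    """Sum the polarity of each whitespace-separated token; unknown tokens count as 0."""
    return sum(lexicon.get(token, 0.0) for token in text.lower().split())


# Hypothetical toy lexicons for illustration only.
GENERAL = {"good": 1.0, "bad": -1.0, "crude": -1.0}
PETROLEUM = {"crude": 0.0, "spill": -1.0}
COUPLED = couple(GENERAL, PETROLEUM)

general_score = score("Crude output was good", GENERAL)
coupled_score = score("Crude output was good", COUPLED)
```

In this toy case the general lexicon treats "crude" as negative and cancels out "good", while the coupled lexicon treats it as a neutral domain term, so the sentence scores as positive.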