1,548 research outputs found

    Lexicons in Sentiment Analytics

    Combining Sentiment Lexicons of Arabic Terms

    Lexicons are dictionaries of sentiment words and their matching polarity. Some comprise words that are numerically scored based on the degree of positivity/negativity of the underlying sentiments. The ranges of scores differ since each lexicon has its own scoring process. Others use words labelled with polarity tags (i.e., positive/negative/neutral) instead of scores. Lexicons are important in text mining and sentiment analysis, which compels researchers to develop and publish them. Larger lexicons better train sentiment models, thereby classifying sentiments in text more accurately. Hence, it is useful to combine the various available lexicons. Nevertheless, there exist many duplicates, overlaps and contradictions between these lexicons. In this paper, we define a method to combine different lexicons. We used the method to normalize and unify lexicon items and merge duplicated lexicon items from twelve lexicons for (in)formal Arabic. This resulted in a coherent Arabic sentiment lexicon with the largest number of terms.
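
    The abstract above mentions normalizing score ranges, unifying labelled and scored entries, and merging duplicates. A minimal sketch of that general idea follows; it is not the paper's method. The function names, the common [-1, 1] scale, the averaging rule, the sign-based conflict check, and the English placeholder terms are all assumptions made for illustration. Real Arabic term normalization would need more than trimming whitespace.

```python
from collections import defaultdict

# Assumed mapping from polarity tags to a common numeric scale.
LABEL_SCORES = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def normalize_scored(lexicon, low, high):
    """Rescale numeric scores from the lexicon's own range [low, high] to [-1, 1]."""
    span = high - low
    return {term: 2.0 * (score - low) / span - 1.0 for term, score in lexicon.items()}


def normalize_labelled(lexicon):
    """Convert polarity tags to numeric scores on the same [-1, 1] scale."""
    return {term: LABEL_SCORES[label] for term, label in lexicon.items()}


def merge_lexicons(lexicons):
    """Average the scores of duplicated terms and report sign contradictions."""
    collected = defaultdict(list)
    for lexicon in lexicons:
        for term, score in lexicon.items():
            # Placeholder term normalization: trim surrounding whitespace only.
            collected[term.strip()].append(score)

    merged, conflicts = {}, set()
    for term, scores in collected.items():
        merged[term] = sum(scores) / len(scores)
        # A term is contradictory if one source is negative and another positive.
        if min(scores) < 0 < max(scores):
            conflicts.add(term)
    return merged, conflicts


# Hypothetical inputs: one lexicon scored on a 1..5 scale, one with labels.
scored = normalize_scored({"good": 5, "bad": 1, "cheap": 4}, low=1, high=5)
labelled = normalize_labelled({"good": "positive", "cheap": "negative", "awful": "negative"})
merged, conflicts = merge_lexicons([scored, labelled])
```

    Averaging is only one possible resolution rule. Flagging contradictory terms separately keeps the decision about them (drop, re-annotate, or keep the average) explicit.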

    An analysis of customer perception using lexicon-based sentiment analysis of Arabic Texts framework.

    Sentiment Analysis (SA) employing Natural Language Processing (NLP) is pivotal in determining the positivity and negativity of customer feedback. Although significant research in SA is focused on English texts, there is a growing demand for SA in other widely spoken languages, such as Arabic. This is predominantly due to the global reach of social media, which enables users to express opinions on products in any language and, in turn, necessitates a thorough understanding of customers' perceptions of new products based on social media conversations. However, current research studies do not provide adequate text analysis for understanding the perceptions of Arabic customers towards coffee and coffee products. Therefore, this study proposes a comprehensive Lexicon-based Sentiment Analysis on Arabic Texts (LSAnArTe) framework applied to social media data, to understand customer perceptions of coffee, a widely consumed product in the Arabic-speaking world. The LSAnArTe Framework incorporates the existing AraSenTi dictionary, an Arabic database of sentiment scores for Arabic words, and lemmatizes unknown words using the Qalasadi open platform. It classifies each word as positive, negative or neutral before conducting sentence-level sentiment classification. Data collected from X (formerly known as Twitter), resulting in a cleaned dataset of 10,769 tweets, are used to validate the proposed framework, which is then compared with Amazon Comprehend. The dataset was annotated manually to ensure maximum accuracy and reliability in validating the proposed LSAnArTe Framework. The results revealed that the proposed LSAnArTe Framework, with an accuracy score of 93.79%, outperformed the Amazon Comprehend tool, which had an accuracy of 51.90%.
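
    The pipeline described above (look up each word, lemmatize words the lexicon does not know, label each word, then label the sentence) can be sketched roughly as follows. This is an illustration under stated assumptions, not the LSAnArTe implementation: the toy lexicon, the stand-in lemmatizer, and the majority-count rule for sentences are all hypothetical, and no AraSenTi data or lemmatizer library is actually used.

```python
def classify_word(word, lexicon, lemmatize):
    """Label one word using the lexicon, falling back to its lemma if unknown."""
    score = lexicon.get(word)
    if score is None:
        # Unknown surface form: retry with the lemma, defaulting to neutral.
        score = lexicon.get(lemmatize(word), 0.0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_sentence(sentence, lexicon, lemmatize):
    """Label a sentence by comparing counts of positive and negative words."""
    labels = [classify_word(w, lexicon, lemmatize) for w in sentence.split()]
    balance = labels.count("positive") - labels.count("negative")
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


# Hypothetical stand-ins: English toy lexicon and a naive "strip final s" lemmatizer.
toy_lexicon = {"love": 1.0, "bitter": -0.5, "coffee": 0.0}


def toy_lemmatize(word):
    return word[:-1] if word.endswith("s") else word


result_positive = classify_sentence("she loves coffee", toy_lexicon, toy_lemmatize)
result_negative = classify_sentence("bitter coffee", toy_lexicon, toy_lemmatize)
```

    Passing the lemmatizer in as a function keeps the sketch independent of any particular lemmatization tool.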

    MoralStrength: Exploiting a Moral Lexicon and Embedding Similarity for Moral Foundations Prediction

    Moral rhetoric plays a fundamental role in how we perceive and interpret the information we receive, greatly influencing our decision-making process. Especially when it comes to controversial social and political issues, our opinions and attitudes are hardly ever based on evidence alone. The Moral Foundations Dictionary (MFD) was developed to operationalize moral values in text. In this study, we present MoralStrength, a lexicon of approximately 1,000 lemmas, obtained as an extension of the Moral Foundations Dictionary, based on WordNet synsets. Moreover, for each lemma it provides a crowdsourced numeric assessment of Moral Valence, indicating the strength with which a lemma expresses the specific value. We evaluated the predictive potential of this moral lexicon, defining three utilization approaches of increasing complexity, ranging from lemmas' statistical properties to a deep learning approach of word embeddings based on semantic similarity. Logistic regression models trained on the features extracted from MoralStrength significantly outperformed the current state-of-the-art, reaching an F1-score of 87.6% over the previous 62.4% (p-value<0.01), and an average F1-score of 86.25% over six different datasets. Such findings pave the way for further research, allowing for an in-depth understanding of moral narratives in text for a wide range of social issues.

    Sentiment analytics: Lexicons construction and analysis

    With the increasing amount of text data, sentiment analysis (SA) is becoming more and more important. An automated approach is needed to parse online reviews and comments and analyze their sentiments. Since the lexicon is the most important component in SA, enhancing the quality of lexicons will improve the efficiency and accuracy of sentiment analysis. This research presents the effect on sentiment analysis of coupling a general lexicon with a specialized lexicon (for a specific domain). Two special domains and one general domain were studied. The two special domains are the petroleum domain and the biology domain. The general domain is the social network domain. The specialized lexicon for the petroleum domain was created as part of this research. The results, as expected, show that coupling a general lexicon with a specialized lexicon improves the sentiment analysis. However, coupling a general lexicon with another general lexicon does not improve the sentiment analysis.