1,054 research outputs found

    Analyzing the Effect of Negation in Sentiment Polarity of Facebook Dialectal Arabic Text

    With the increase in the number of users on social networks, sentiment analysis has been gaining attention. Sentiment analysis aggregates users' opinions to inform researchers about attitudes towards products or topics. Social network data commonly contain authors’ opinions about specific subjects, such as people’s opinions towards steps taken to manage the COVID-19 pandemic. People usually write in dialectal language in their posts on social networks. Dialectal language poses obstacles that make opinion analysis more challenging than working with standard language. For the Arabic language, tools built for Modern Standard Arabic (MSA) cannot be employed with social network data that contain dialectal language. Another challenge of dialectal Arabic is that the polarity of opinionated words is affected by inverters, such as negation, which tend to change a word’s polarity from positive to negative and vice versa. This work analyzes the effect of inverters on sentiment analysis of social network dialectal Arabic posts. It discusses the different reasons that hinder the trivial resolution of inverters. An experiment is conducted on a corpus of data collected from Facebook, although the same approach can be applied to posts from other social networks. The results show the impact that resolving negation may have on classification accuracy: the F1 score increases by 20% if negation is treated in the text.
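    To make the effect of inverters concrete, the following is a minimal sketch of lexicon-based scoring in which a negation particle flips the polarity of the words that follow it. It is not the method of the paper: the tiny lexicon, the list of negation particles, the fixed two-token scope, and the function name `score` are all illustrative assumptions.

```python
# Toy resources: illustrative assumptions, not the lexicon or rules used in the paper.
POLARITY = {"حلو": 1, "جميل": 1, "وحش": -1, "سيء": -1}
NEGATORS = {"مش", "مو", "ما"}


def score(tokens, handle_negation=True, window=2):
    """Sum word polarities; optionally flip those inside a negation scope."""
    total = 0
    flip_left = 0  # how many upcoming tokens are still inside a negation scope
    for tok in tokens:
        if handle_negation and tok in NEGATORS:
            flip_left = window  # open a new scope after the particle
            continue
        value = POLARITY.get(tok, 0)
        if flip_left > 0:
            value = -value  # an inverter reverses the word's polarity
            flip_left -= 1
        total += value
    return total


post = "الفيلم مش حلو".split()
print(score(post, handle_negation=False))  # 1: the negation is ignored
print(score(post))                         # -1: the negation flips the positive word
```

    A fixed-width scope is the simplest possible choice. The abstract notes that resolving inverters is not trivial in dialectal text, so a real system would need more than this rule.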

    Semantic Classification of Multidialectal Arabic Social Media

    Arabic is one of the most widely used languages in the world, but due in part to its morphological and syntactic richness, resources for automated processing of Arabic are relatively rare. Arabic takes three primary forms: Classical Arabic as seen in the Qur’an and other classical texts; Modern Standard Arabic (MSA) as seen in newspapers, formal documents, and other written text intended for widespread distribution; and dialectal Arabic as used in common speech and informal communication. Social media posts are often written in informal language and may include non-standard spellings, abbreviations, emoticons, hashtags, and emojis. Dialectal Arabic is commonly used in social media. Semantic classification is the task of assigning a label to a text based on its primary semantic content. Given the increased use of dialectal Arabic on social media platforms in recent years, there is an urgent need for semantic classification of dialectal Arabic. Even compared to MSA, there are few resources for automated processing of dialectal Arabic. Prior work on automated processing of dialectal Arabic is limited to only one or two dialects. One of the major obstacles to semantic classification of multi-dialectal Arabic is the lack of a large, multi-dialectal, tagged corpus. To the best of our knowledge, there are no automated processes for semantic classification of multi-dialectal Arabic social media texts. We gather a data set of more than one million tweets collected from 449 accounts located in 12 Arabic-speaking countries. We group those tweets into 21,791 documents by country, account, and month. We first construct a query to represent a particular semantic concept. Then, using Latent Semantic Analysis (LSA), we rank the documents by semantic similarity to the query. Next, we use that ranking to train a deep neural network classifier to identify documents whose text is semantically similar to the query. Experiments demonstrate that this approach to semantic classification of multi-dialectal Arabic achieves an overall accuracy of 98.075% and a positive accuracy of 88.178%. The source code and the data set are provided on GitHub (https://github.com/therishel/ArabLeader).
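    The LSA ranking step described in this abstract can be sketched roughly as follows. This is a minimal illustration with NumPy, not the authors' code: the function name `lsa_rank`, the use of raw term counts instead of a weighted term-document matrix, the choice of `k`, and the English toy documents are all assumptions. The sketch stops at the ranking and does not train the classifier.

```python
import numpy as np


def lsa_rank(docs, query, k=2):
    """Rank documents by cosine similarity to a query in a k-dimensional LSA space."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}

    def bow(text):
        # Raw term counts (an assumption; tf-idf weighting is a common alternative).
        v = np.zeros(len(vocab))
        for w in text.split():
            if w in index:
                v[index[w]] += 1.0
        return v

    A = np.stack([bow(d) for d in docs], axis=1)      # terms x documents
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # A = U @ diag(s) @ Vt
    k = min(k, len(s))
    Uk = U[:, :k]

    doc_vecs = A.T @ Uk        # each document projected onto the top-k latent axes
    q_vec = bow(query) @ Uk    # the query folded into the same latent space

    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    sims = (doc_vecs @ q_vec) / (norms + 1e-12)  # small epsilon avoids division by zero
    order = np.argsort(-sims)                    # highest similarity first
    return [(int(i), float(sims[i])) for i in order]


docs = [
    "election vote president",
    "vote election campaign",
    "football match goal",
    "goal football league",
]
for doc_id, sim in lsa_rank(docs, "election vote"):
    print(doc_id, round(sim, 3))
```

    On this toy input the two election-related documents rank above the two football documents. In the pipeline the abstract describes, a ranking of this kind supplies the training signal for the classifier.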