1,241 research outputs found

    Clustering Arabic Tweets for Sentiment Analysis

    The focus of this study is to evaluate the impact of linguistic preprocessing and similarity functions on clustering Arabic tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets to positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity of 0.764, while the second-best purity was 0.719. These results are important because they contrast with findings for normal-sized documents, where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used.
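
    The purity figures quoted above can be read as the fraction of tweets that carry the majority label of the cluster they were assigned to. The sketch below illustrates that standard definition only; the function name `purity` and the toy data are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter


def purity(cluster_ids, true_labels):
    """Return the share of items whose label is the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how often each true label occurs inside each cluster.
    label_counts = {}
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts.setdefault(cluster, Counter())[label] += 1

    # For every cluster, credit only the items that carry its majority label.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(true_labels)


# Hypothetical toy data: eight tweets, two clusters, two sentiment labels.
clusters = [0, 0, 0, 1, 1, 1, 1, 0]
labels = ["pos", "pos", "neg", "neg", "neg", "pos", "neg", "pos"]
score = purity(clusters, labels)
```

    A purity of 1.0 means every cluster contains a single sentiment class, so a value such as 0.764 indicates that roughly three quarters of the tweets share their cluster's majority label.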

    The Creation of an Arabic Emotion Ontology Based on E-Motive

    © 2017 The Authors. Published by Elsevier B.V. There is an increased interest in social media monitoring to analyse massive, free form, short user-generated text from multiple social media sites such as Facebook, WhatsApp and Twitter. Companies are interested in sentiment analysis to understand customers' opinions about their products/services. Governments and law enforcement agencies are interested in identifying threats to safeguard their country's national security. They are actively seeking ways to monitor and analyse the public's responses to various services, activities and events, especially since social media has become a valuable real-time resource of information. This study builds on prior work that focused on sentiment classification (i.e., positive, negative). This study primarily aims to design and develop a social sentiment-parsing algorithm for capturing and monitoring an extensive and comprehensive range of emotions from Arabic social media text. The study contributes to the field of sentiment analysis (opinion mining) and can subsequently be used for web mining, cleansing and analytics.

    Classification Arabic Twitter User’s Insights Using Rough Set Theory

    Nowadays, people around the world use social media to share their daily affairs. Arabic Twitter, for example, is a platform where users read, reply to and post messages known as 'tweets'. Users exchange opinions on different trends, which are not equal in importance and which differ according to their power and interest. Tweets can provide rich information for decision making. The main objective of this paper is to present a framework for making valuable decisions by analyzing social media users' insights based on their proximity to a particular trend, while highlighting their power in that trend. Tweets are highly unstructured, which makes them difficult to analyze. Our proposed model differs from previous research in this field in that it combines supervised and unsupervised machine learning algorithms. The work proceeds as follows: users are classified by their degree of closeness/interest using Mendelow's power/interest matrix, and rough set theory is used to eliminate features found in user profiles in order to find minimal sets of data. The proposed model applies two attribute reduction algorithms to our dataset to determine the optimal number of reducts for improving decision making from the user replies. In addition, unsupervised machine learning groups the replies into subcategories such as positive, negative, or neutral. The experimental evaluation shows that the Johnson algorithm reduced the user attributes by 71% relative to the genetic algorithm used in the classification model.