17 research outputs found

    Framework for sentiment analysis of Arabic text

    Generating Javanese Stopwords List using K-means Clustering Algorithm

    Stopword removal is necessary in Information Retrieval. It removes frequently occurring, general words to reduce memory storage. The algorithm eliminates each word that exactly matches a word in the stopword list. However, generating the list can be time-consuming: the words in a specific language and domain must be collected and validated by specialists. This research aims to develop a new way to generate a stopword list using the K-means clustering method. The proposed approach groups words based on their frequency. A confusion matrix is used to compare the findings against a valid stopword list created by a Javanese linguist. The accuracy of the proposed method is 78.28% (K=7). The result shows that the generation of Javanese stopword lists using a clustering method is reliable.
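The frequency-based clustering described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`kmeans_1d`, `candidate_stopwords`, `evaluate`), the deterministic centroid initialisation, and the rule that the highest-frequency cluster is taken as the stopword candidates are all hypothetical choices, and the sample text is a toy English sentence, not Javanese data.

```python
from collections import Counter


def kmeans_1d(values, k, iterations=100):
    """Plain 1-D k-means (Lloyd's algorithm) on a list of numbers."""
    distinct = sorted(set(values))
    if k < 1 or k > len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumption: deterministic start, centroids spread over the sorted values.
    if k == 1:
        centroids = [distinct[0]]
    else:
        centroids = [distinct[i * (len(distinct) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def candidate_stopwords(tokens, k=2):
    """Cluster words by frequency; return the words in the highest-frequency cluster."""
    counts = Counter(tokens)
    words = list(counts)
    freqs = [float(counts[w]) for w in words]
    labels, centroids = kmeans_1d(freqs, k)
    # Assumption: only the cluster with the largest centroid is treated as stopwords.
    top = max(range(k), key=lambda c: centroids[c])
    return {w for w, lab in zip(words, labels) if lab == top}


def evaluate(candidates, reference, vocabulary):
    """Confusion-matrix counts and accuracy against a reference stopword list."""
    tp = len(candidates & reference)
    fp = len(candidates - reference)
    fn = len((reference & vocabulary) - candidates)
    tn = len(vocabulary) - tp - fp - fn
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": (tp + tn) / len(vocabulary)}


if __name__ == "__main__":
    tokens = "the cat and the dog and the bird and the fox and the owl".split()
    print(sorted(candidate_stopwords(tokens, k=2)))  # ['and', 'the']
```

With a larger K, such as the K=7 reported above, one would also have to decide how many of the top clusters count as stopwords; the abstract does not state that rule, so the sketch keeps only the single highest-frequency cluster.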

    A review on corpus annotation for Arabic sentiment analysis

    Mining publicly available data for meaning and value is an important research direction within social media analysis. Automatically analyzing collected textual data requires manual effort before a machine learning algorithm can classify text effectively. This effort consists of annotating the text, that is, adding labels to each data entry. Arabic is one of the languages that are growing rapidly in sentiment analysis research, despite limited resources and scarce annotated corpora. In this paper, we review the annotation process carried out in papers on Arabic sentiment analysis. A total of 27 papers, dated between 2010 and 2016, were reviewed.

    Classification of Encouragement (Targhib) And Warning (Tarhib) Using Sentiment Analysis on Classical Arabic

    The Holy Qur’an is the main religious text of Islam. The Qur’an has its own methods of Targhib (encouragement) and Tarhib (warning), which are important features of the Qur’an. Most of the Quranic verses urge and encourage people to do right and good deeds, and also warn them against committing evil and bad deeds. The method of classifying a text into two opposing opinions has been applied previously in solving the problem of sentiment analysis. Here, it is applied to distinguishing between Targhib (encouragement) and Tarhib (warning) verses in the Qur’an. Each verse of the Qur’an can be treated as either an encouragement, a warning or neutral. The language of the Holy Qur’an is one of the most challenging natural languages in sentiment analysis. The aim of this work is to classify the verses of encouragement and warning using sentiment analysis and NLP techniques. Several approaches are used in sentiment analysis classification, such as the machine learning approach, the lexicon-based approach and the hybrid approach. In carrying out this aim, the machine learning approach was applied, and the impact of different techniques such as POS tagging, N-grams and correlation-based feature selection was evaluated and investigated. An accuracy of 95.6% was achieved using Naïve Bayes (NB) and 91.5% using Support Vector Machines (SVM). This study is significant in extracting information and knowledge from the Holy Qur’an, both for researchers in the field of Islamic studies and for non-specialized researchers.
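To make the machine learning approach mentioned in this abstract more concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over word n-gram features. It is an illustration under assumptions, not the study's pipeline: the `ngrams` helper, the `NaiveBayes` class, the add-one smoothing, and the whitespace tokenisation are hypothetical choices, the training strings are invented English placeholders and not Quranic verses, and POS tagging, feature selection and the SVM are left out.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Word n-grams of length 1..n_max from a whitespace-tokenised text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(ngrams(text))
        self.vocabulary = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the log likelihood of every known n-gram.
            score = math.log(doc_count / total_docs)
            for feature in ngrams(text):
                if feature in self.vocabulary:  # unseen n-grams are skipped
                    score += math.log((counts[feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Toy placeholder data, invented for illustration only.
    texts = ["do good and be rewarded", "good deeds bring reward",
             "evil deeds bring punishment", "beware of evil and punishment"]
    labels = ["encouragement", "encouragement", "warning", "warning"]
    model = NaiveBayes().fit(texts, labels)
    print(model.predict("good reward"))      # encouragement
    print(model.predict("evil punishment"))  # warning
```

A third "neutral" class, as the abstract allows for, would fit the same code by adding labelled examples for it.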

    Predicting STC Customers' Satisfaction Using Twitter

    The telecom field has changed with the emergence of new technologies. This is the case with the telecom market in Saudi Arabia, which expanded in 2003 by attracting new investors. As a result, the Saudi telecom market became a viable market [1]. Given the prevalence of mobile voice service among the population in Saudi Arabia, this research aims to mine Arabic tweets to measure customer satisfaction with a telecom company in Saudi Arabia. This research is a use case for the Saudi Telecom Company (STC). The contribution of this study takes the form of recommendations to the company, based on real-time monitoring of its customers' satisfaction on Twitter and on questionnaire analysis. It is the first work to evaluate customers' satisfaction with a telecommunications (telecom) company in Saudi Arabia by using both social media mining and a quantitative method. A corpus of Arabic tweets was built using a Python script that searches for real-time tweets mentioning the telecom company by hashtag, in order to monitor the latest sentiments of telecom customers continuously. A subset of 20,000 tweets, randomly selected from the dataset, was used to train the machine classifier. In addition, we ran experiments using a deep learning network. The results show that the satisfaction for each service ranges between 31.50% and 49.25%. One of the proposed recommendations is using 5G to solve the "internet speed" problem, which showed the lowest customer satisfaction, at 31.50%. This article's main contributions are defining traceable, measurable criteria for customer satisfaction with telecom companies in Saudi Arabia and providing recommendations to telecom companies based on real-time monitoring of customers' satisfaction through Twitter.

    Comparing Supervised Machine Learning Strategies and Linguistic Features to Search for Very Negative Opinions

    In this paper, we examine the performance of several classifiers in the process of searching for very negative opinions. More precisely, we conduct an empirical study that analyzes the influence of three types of linguistic features (n-grams, word embeddings, and polarity lexicons) and their combinations when they are used to feed different supervised machine learning classifiers: Naive Bayes (NB), Decision Tree (DT), and Support Vector Machine (SVM). The experiments we have carried out show that SVM clearly outperforms NB and DT on all datasets, whether the features are used individually or in combination. This research was funded by project TelePares (MINECO, ref:FFI2014-51978-C2-1-R), the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016-2019, ED431G/08) and the European Regional Development Fund (ERDF).

    Adam Deep Learning with SOM for Human Sentiment Classification

    Nowadays, with the improvement in communication through social network services, a massive amount of data is being generated from users' perceptions, emotions, posts, comments, reactions, etc., and extracting significant information, such as sentiment, from those massive data has become a complex and convoluted task. On the other hand, traditional Natural Language Processing (NLP) approaches are less feasible to apply. Therefore, this research work proposes an approach that integrates unsupervised machine learning (Self-Organizing Map), dimensionality reduction (Principal Component Analysis) and computational classification (Adam Deep Learning) to overcome the problem. Moreover, for further clarification, a comparative study between various well-known approaches and the proposed approach was conducted. The proposed approach was also applied to social network data sets of different sizes to verify its efficiency and feasibility, mainly in the case of Big Data. Overall, the experiments and their analysis suggest that the proposed approach is very promising.

    Arabic Sentiment Analysis of Users’ Opinions of Governmental Mobile Applications

    Different types of pandemics that have appeared from time to time have changed many aspects of daily life. Some governments encourage their citizens to use certain applications to help control the spread of disease and to deliver other services during lockdown. The Saudi government has launched several mobile apps to control the pandemic and has made these apps available through Google Play and the app store. A huge number of reviews are written daily by users to express their opinions, and these reviews include significant information for improving the applications. Manually processing and extracting information from users’ reviews is an extremely difficult and time-consuming task. Therefore, intelligent methods are needed to analyse users’ reviews and extract issues that can help in improving these apps. This research aims to support the efforts made by the Saudi government for its citizens and residents by analysing the opinions of people in Saudi Arabia, found as reviews on Google Play and the app store, using sentiment analysis and machine learning methods. To the best of our knowledge, this is the first study to explore users’ opinions about governmental apps in Saudi Arabia. The findings of this analysis will help government officers make the right decisions to improve the quality of the provided services and help application developers improve these applications by fixing potential issues that cannot be identified during application testing phases. A new dataset used for this research includes 8000 user reviews gathered from social media, Google Play and the app store. Different methods are applied to the dataset, and the results show that the k-nearest neighbours (KNN) method achieves the highest accuracy among the implemented methods.