
    Sentiment analysis of comments in social media

    Social media platforms are witnessing significant growth in both size and purpose. One application on these platforms is sentiment analysis, by which insights into the emotions and feelings of a person can be inferred from their posted text. Research on sentiment analysis is attracting substantial interest, as it is a promising field that can improve user experience and support many personalized services. Twitter is one of the most popular social media platforms, with users from different regions and a variety of cultures and languages. It can thus provide a large and diverse body of data that can be used to improve decision making. In this paper, the sentiment orientation of textual features and emoji-based components is studied for tweets and comments posted in Arabic on Twitter during the 2018 World Cup. The study also measures the significance of analyzing texts with and without emojis. The data is obtained from thousands of extracted tweets, and sentiment analysis results are reported for texts and emojis separately. Results show that emojis support the sentiment orientation of the texts, and that neither texts nor emojis alone provide reliable information, as they complement each other to give the intended meaning.
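The abstract does not say how texts and emojis were separated, so the following is only a minimal sketch of one way to do it. It assumes that characters in the Unicode category "So" (Symbol, other) are a usable stand-in for emojis; that is an approximation, since the category also contains non-emoji symbols and misses some multi-character emoji sequences. The function name `split_text_and_emojis` is hypothetical.

```python
import unicodedata


def split_text_and_emojis(post):
    """Split a post into its plain text and a list of emoji-like characters.

    Assumption: Unicode category "So" approximates "emoji". This is not
    the method used in the paper, only an illustration.
    """
    emojis = [ch for ch in post if unicodedata.category(ch) == "So"]
    kept = "".join(ch for ch in post if unicodedata.category(ch) != "So")
    # Collapse the whitespace left behind where emojis were removed.
    text = " ".join(kept.split())
    return text, emojis


if __name__ == "__main__":
    print(split_text_and_emojis("what a goal ⚽ 😀"))
```

Each part can then be scored on its own and the two scores compared, which is the kind of with-and-without-emoji comparison the study describes.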

    Developing resources for sentiment analysis of informal Arabic text in social media

    Natural Language Processing (NLP) applications such as text categorization, machine translation, and sentiment analysis need annotated corpora and lexicons to check quality and performance. This paper describes the development of resources for sentiment analysis specifically for Arabic text in social media. A distinctive feature of the corpora and lexicons developed is that they are derived from informal Arabic that does not conform to grammatical or spelling standards. We refer to Arabic social media content of this sort as Dialectal Arabic (DA) - informal Arabic originating from and potentially mixing a range of different individual dialects. The paper describes the process adopted for developing corpora and sentiment lexicons for sentiment analysis within different social media and their resulting characteristics. In addition to providing useful NLP data sets for Dialectal Arabic, the work also contributes to understanding the approach to developing corpora and lexicons.

    Corpora for sentiment analysis of Arabic text in social media

    Different Natural Language Processing (NLP) applications such as text categorization and machine translation need annotated corpora to check quality and performance. Similarly, sentiment analysis requires annotated corpora to test the performance of classifiers. Manual annotation performed by native speakers is used as a benchmark to measure how accurate a classifier is. In this paper we summarise currently available Arabic corpora and describe work in progress to build, annotate, and use Arabic corpora consisting of Facebook (FB) posts. The distinctive nature of these corpora is that they are based on posts written in Dialectal Arabic (DA), which does not follow specific grammatical or spelling standards. The corpora are annotated with five labels (positive, negative, dual, neutral, and spam). In addition to building the corpus, the paper illustrates how manual tagging can be used to extract opinionated words and phrases to be used in a lexicon-based classifier.
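As an illustration of what a lexicon-based classifier can look like, here is a minimal sketch. The word lists are hypothetical English placeholders, not entries from the paper's lexicon, and the rule that a post containing both positive and negative words is "dual" is an assumption about how that label might be applied. The "spam" label is not handled.

```python
# Hypothetical toy lexicon; a real one would hold the opinionated
# Dialectal Arabic words and phrases extracted by manual tagging.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}


def classify_with_lexicon(text):
    """Label a post by counting lexicon hits among its tokens."""
    tokens = text.lower().split()
    positive_hits = sum(token in POSITIVE_WORDS for token in tokens)
    negative_hits = sum(token in NEGATIVE_WORDS for token in tokens)
    if positive_hits and negative_hits:
        return "dual"  # assumed meaning: mixed sentiment in one post
    if positive_hits:
        return "positive"
    if negative_hits:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in ("I love this great place",
                 "good food but awful service",
                 "the bus arrives at noon"):
        print(post, "->", classify_with_lexicon(post))
```

Whitespace tokenization is the simplest possible choice here; informal Arabic text would likely need normalization of spelling variants before lookup.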

    Arabic Dialect Texts Classification

    This study investigates how to classify Arabic dialects in text by extracting features that show the differences between dialects. Research on classifying Arabic dialect texts has lagged behind that for English and some other languages because few Arabic dialect text corpora are available. At the same time, Arabic dialects are increasingly used in social media, so this text is now considered an appropriate medium of communication and a suitable source for a corpus. We collected tweets from Twitter, comments from Facebook and online newspapers from five groups of Arabic dialects: Gulf, Iraqi, Egyptian, Levantine, and North African. The research sought to: 1) create a dataset of Arabic dialect texts to use in training and testing the classification system, 2) find appropriate features to classify Arabic dialects: lexical (word and multi-word-unit) and grammatical variation across dialects, 3) build a more sophisticated filter to extract features from dialect text files written in Arabic characters. In this thesis, the first part describes the research motivation and the reason for choosing Arabic dialects as a research topic. The second part presents background information about the Arabic language and its dialects, and the literature review covers previous research on this subject. The research methodology part describes the initial experiment to classify Arabic dialects. The results of this experiment showed the need to create an Arabic dialect text corpus by exploring Twitter and online newspapers. The corpus was used to train the ensemble classifier; to improve classification accuracy, it was extended by collecting tweets based on spatial coordinates and comments from Facebook posts. The corpus was annotated with dialect labels and used in automatic dialect classification experiments. The last part of this thesis presents the results of classification, conclusions and future work.

    Tourism Companies Assessment via Social Media Using Sentiment Analysis

    In recent years, social media use has grown markedly as a medium for users to express their emotions and feelings through thousands of posts and comments related to tourism companies. As a consequence, it has become difficult for tourists to read all the comments and determine whether the opinions are positive or negative in order to assess the success of a tourism company. In this paper, a modest model is proposed to assess e-tourism companies using Iraqi dialect reviews collected from Facebook. The reviews are analyzed using text mining techniques for sentiment classification. The generated sentiment words are classified into positive, negative and neutral comments using Rough Set Theory, Naïve Bayes and K-Nearest Neighbor methods. The experimental results showed that, out of 71 tested Iraqi tourism companies, 28% received a very good assessment, 26% a good assessment, 31% a medium assessment, 4% an acceptable assessment and 11% a bad assessment. These results helped the companies to improve their work and programs and to respond sufficiently and quickly to customer demands.
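Of the three methods named above, Naïve Bayes is the simplest to sketch. The following is a minimal multinomial Naïve Bayes classifier with add-one (Laplace) smoothing, written with the standard library only. It is an illustration under stated assumptions, not the paper's implementation: the training reviews are invented English placeholders, and the function names `train_naive_bayes` and `predict_naive_bayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def predict_naive_bayes(model, text):
    """Return the label with the highest log-posterior for the given text."""
    label_counts, word_counts, vocabulary = model
    total_documents = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior of the label.
        score = math.log(label_counts[label] / total_documents)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, label) pairs above zero.
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented placeholder reviews, for illustration only.
TRAINING_REVIEWS = [
    ("great trip friendly staff", "positive"),
    ("excellent service great hotel", "positive"),
    ("terrible trip rude staff", "negative"),
    ("bad service late bus", "negative"),
    ("trip starts on monday", "neutral"),
    ("office opens at nine", "neutral"),
]

if __name__ == "__main__":
    model = train_naive_bayes(TRAINING_REVIEWS)
    print(predict_naive_bayes(model, "great friendly hotel"))
    print(predict_naive_bayes(model, "rude staff bad bus"))
```

Per-comment labels of this kind could then be aggregated per company to produce an overall assessment; how the paper maps comment labels to its five assessment grades is not stated in the abstract.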