    The Royal Birth of 2013: Analysing and Visualising Public Sentiment in the UK Using Twitter

    Analysis of information retrieved from microblogging services such as Twitter can provide valuable insight into public sentiment in a geographic region. This insight can be enriched by visualising information in its geographic context. Two underlying approaches for sentiment analysis are dictionary-based and machine learning. The former is popular for public sentiment analysis, and the latter has found limited use for aggregating public sentiment from Twitter data. The research presented in this paper aims to extend the machine learning approach for aggregating public sentiment. To this end, a framework for analysing and visualising public sentiment from a Twitter corpus is developed. A dictionary-based approach and a machine learning approach are implemented within the framework and compared using one UK case study, namely the royal birth of 2013. The case study validates the feasibility of the framework for analysis and rapid visualisation. One observation is that there is good correlation between the results produced by the popular dictionary-based approach and the machine learning approach when large volumes of tweets are analysed. However, for rapid analysis to be possible, faster methods need to be developed using big data techniques and parallel methods.
    Comment: http://www.blessonv.com/research/publicsentiment/ 9 pages. Submitted to IEEE BigData 2013: Workshop on Big Humanities, October 201
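
To illustrate what a dictionary-based approach can look like, here is a minimal sketch. It assumes a tiny hypothetical lexicon (`LEXICON`) and simple regex tokenisation; the paper's actual dictionary, framework and aggregation scheme are not described in this abstract, so none of these names come from the paper.

```python
import re

# Hypothetical toy lexicon for illustration only; not the paper's dictionary.
LEXICON = {
    "happy": 1, "joy": 1, "congratulations": 1,
    "sad": -1, "boring": -1, "hate": -1,
}

def score_tweet(text, lexicon=LEXICON):
    """Sum the polarity of every lexicon word found in the tweet."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)

def aggregate(tweets):
    """Mean sentiment score over a collection of tweets (0.0 if empty)."""
    if not tweets:
        return 0.0
    return sum(score_tweet(t) for t in tweets) / len(tweets)
```

Averaging per-tweet scores is only one possible way to aggregate sentiment for a region; a real system would also need to handle negation, emoticons and spelling variants.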

    Comparison of Machine Learning Approaches on Arabic Twitter Sentiment Analysis

    With the dramatic expansion of information on the internet, users around the world express their opinions daily on social networks such as Facebook and Twitter. Large corporations now invest in analyzing these opinions to assess their products or services through people's feedback. The process of determining whether users' opinions toward a particular product or service are positive or negative is called sentiment analysis. Arabic is one of the common languages that have been addressed in sentiment analysis. In the literature, several approaches have been proposed for Arabic sentiment analysis, and most of them use machine learning techniques. Machine learning techniques vary and perform differently. Therefore, this study tries to identify a simple but workable approach for Arabic sentiment analysis on Twitter by investigating machine learning techniques for this task. Three techniques are used: Naïve Bayes, Decision Tree (DT) and Support Vector Machine (SVM). In addition, two simple pre-processing sub-tasks are used, Term Frequency-Inverse Document Frequency (TF-IDF) and Arabic stemming, to select the heaviest-weighted term as the feature for tweet classification. TF-IDF aims to identify the most frequent words, whereas stemming aims to retrieve the stem of a word by removing its inflectional derivations. The dataset used is the Modern Arabic Corpus, which consists of Arabic tweets. Classification performance is evaluated using the information retrieval metrics precision, recall and F-measure. The experimental results show that DT outperformed the other techniques, obtaining an F-measure of 78%.
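
The TF-IDF weighting step can be sketched as follows. This is a minimal illustration of the textbook definition (term frequency times the log of inverse document frequency), not the study's implementation; the function names and the English toy tokens are assumptions, and Arabic stemming is not shown. Under this definition a term that occurs in every document gets weight zero, so the heaviest term is one that is frequent in a tweet but rare across the corpus.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

def heaviest_term(doc_weights):
    """Pick the term with the largest TF-IDF weight in one document."""
    return max(doc_weights, key=doc_weights.get)
```

For example, with `docs = [["good", "service", "good"], ["bad", "service"]]`, the shared term `"service"` is weighted zero in both documents, and `heaviest_term` selects `"good"` and `"bad"` respectively.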

    Machine Learning-Based Models for Assessing Impacts Before, During and After Hurricane Events

    Social media provides an abundant amount of real-time information that can be used before, during, and after extreme weather events. Government officials, emergency managers, and other decision makers can use social media data for decision-making, preparation, and assistance. Machine learning-based models can be used to analyze data collected from social media. In this study, social media data, together with cloud cover temperature as physical sensor data, were analyzed using machine learning techniques. Data was collected from Twitter regarding Hurricane Florence from September 11, 2018 through September 20, 2018 and Hurricane Michael from October 1, 2018 through October 18, 2018. Natural language processing models were developed to demonstrate sentiment among the data. Forecasting models for future events were developed for better emergency management during extreme weather events. Relationships between social media data and physical sensor data were explored to analyze extreme weather events as these events become more prevalent in our lives. In this study, social media sentiment analysis was performed that can be used by emergency managers, government officials, and decision makers. Different machine learning algorithms and natural language processing techniques were used to examine sentiment classification. The approach is multi-modal, which will help stakeholders develop a more comprehensive understanding of the social impacts of a storm and how to help prepare for future storms. Of all the classification algorithms used in this study to analyze sentiment, the naive Bayes classifier displayed the highest accuracy for this data. The results demonstrate that machine learning and natural language processing techniques, using Twitter data, are a practical method for sentiment analysis. The data can be used for correlation analysis between social sentiment and physical data and can be used by decision makers for better emergency management decisions.

    Joint Distribution in Weighted Majority Vote (WMV) for Improving the Performance of Supervised Sentiment Analysis on Twitter Datasets

    Sentiment analysis is a computational text mining technique based on natural language processing (NLP) for extracting opinions expressed on online platforms, including Twitter, one of the most popular microblogging platforms in Indonesia. Two approaches are commonly used in sentiment analysis: the machine learning (ML) based approach and the sentiment lexicon (SL) based approach. This research focuses on developing machine learning-based sentiment analysis techniques, also called supervised techniques, on Twitter datasets. Most sentiment analysis on Indonesian-language Twitter datasets relies on a single machine learning algorithm. This study combines the performance of various algorithms/experts while reducing the misclassification rate by updating the weights dynamically using a weighted majority vote (WMV) based on the joint distribution from a Bayesian Network. In the first stage, data was collected from Twitter using 3 hashtags related to Covid-19 as experimental data. The performance of the weighted majority vote was then extensively compared with 4 baseline methods: Naïve Bayes, Gaussian Naïve Bayes, Multinomial Naïve Bayes and Majority Vote of the three single classifiers. The performance metrics used are precision, recall, F-measure, accuracy and Matthews correlation coefficient (MCC). The experiments show that WMV improves sentiment analysis performance on all three dataset topics across the various performance metrics.
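
The general idea of a weighted majority vote over several classifiers ("experts") can be sketched as below. Note the substitution: this sketch uses the classical multiplicative update, where an expert's weight is multiplied by a factor `beta` each time it is wrong. It does not reproduce the paper's joint-distribution update from a Bayesian Network, which the abstract does not specify. The function names and `beta` are assumptions for illustration.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among the experts' predictions."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

def update_weights(predictions, weights, truth, beta=0.5):
    """Multiply the weight of every expert that was wrong by beta (0 < beta < 1).

    Classical weighted-majority update, used here as a stand-in; not the
    paper's joint-distribution-based update.
    """
    return [w * beta if p != truth else w for p, w in zip(predictions, weights)]
```

With weights `[1.0, 0.3, 0.3]`, a single trusted expert predicting `"pos"` outvotes two weaker experts predicting `"neg"`, which is the behaviour a plain majority vote cannot provide.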

    Sentiment Analysis of Microblogs Using Multilayer Feed-Forward Artificial Neural Networks

    Sentiment analysis aims to extract public opinion on a particular topic, and microblogs, especially Twitter as the most influential platform, represent a significant source of information. The application to microblogs has to cope with difficulties such as informal language with abbreviations, internet jargon, emoticons and hashtags that do not appear in conventional text documents. A sentiment analysis technique for microblogs based on a feed-forward artificial neural network (ANN) with a sigmoid activation function is proposed in this paper and compared to machine learning approaches, i.e. Multinomial Naive Bayes, Support Vector Machines and Maximum Entropy. Experiments were performed on the Stanford Twitter Sentiment corpus, a balanced dataset which contains noisy training labels weakly annotated using emoticons as sentiment indicators, and the SemEval-2014 Task 9 corpus, an unbalanced dataset which contains manually annotated training examples. The obtained results show that the ANN produces superior or at least comparable results to state-of-the-art machine learning techniques.
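
As a rough illustration of a multilayer feed-forward network with sigmoid activation, the forward pass can be sketched in plain Python. The layer sizes, weights and function names here are assumptions for illustration; the paper's architecture, input features and training procedure are not given in this abstract, and training is not shown.

```python
import math

def sigmoid(x):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: sigmoid(W . x + b), one weight row per neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed the input vector through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations
```

Each layer is a `(weights, biases)` pair; the final activation can be read as a score between 0 and 1, for example the estimated probability that a tweet is positive.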

    Enhancing the Sentiment Classification Accuracy of Twitter Data using Machine Learning Algorithms

    Sentiment analysis, or opinion mining, is the study of public opinions, sentiments, attitudes, and emotions expressed in social media. It has been one of the most dynamic research areas in natural language processing and text mining in recent years. It is a domain that involves finding user sentiment, emotion and opinion within natural language text. The growing significance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. Common applications of sentiment analysis include automatically determining whether a review posted online (of a movie, a book, or a consumer product) is positive or negative toward the item being reviewed. This research work shows the various pathways to perform a computational treatment of sentiments and opinions. The main aim of this work is to classify the sentiment of Twitter data using machine learning algorithms. Sentiment classification is divided into two types: emotional classification and polarity classification. This work has been carried out on polarity classification, which classifies text as positive, negative, or neutral. The polarity classification is done using a subjectivity lexicon. After the polarity classification, two machine learning algorithms are employed to enhance the accuracy of sentiment classification. In the pre-processing phase, the tweets are preprocessed using various techniques. Sentiment classification is the essential phase, where preprocessed tweets are taken as input and classified using the subjectivity lexicon. The third phase of the proposed work is to compare and evaluate the performance of two machine learning algorithms which are Support Vector Machine and Decision tre