106 research outputs found

    A Fine Tuned Universal Language Model Fine-Tuning (ULMFiT) Approach for Airline Twitter Sentiment Analysis

    A growing body of research has shown that real-time Twitter data can be used to predict the market movement of securities and other financial instruments. Universal Language Model Fine-Tuning (ULMFiT) is a recent approach based on training a language model and transferring its knowledge to a final classifier. We propose to fine-tune the ULMFiT model by optimizing its parameters and training it deterministically to increase reproducibility. In this paper, we perform multi-class classification using fine-tuned ULMFiT, Naive Bayes, SVM, Logistic Regression, Random Forest, Decision Tree, and K-Nearest Neighbors on the Twitter US Airline data set from Kaggle. First, a model is built for six major U.S. airlines that performs sentiment analysis on customer reviews so that the airlines can get fast and concise feedback. Second, recommendations are made on the most important aspects of service the airlines could improve, given customer complaints. Significant accuracy has been achieved, which shows that our models are reliable for future prediction. The accuracy of the different models is also compared, and the results show that Random Forest is the best approach.

    AN APPROACH TO SENTIMENT ANALYSIS – THE CASE OF AIRLINE QUALITY RATING

    Sentiment mining has commonly been associated with analyzing a text string to determine whether a corpus expresses a negative or positive opinion. Recently, sentiment mining has been extended to address problems such as distinguishing objective from subjective propositions and determining the sources and topics of different opinions expressed in textual data sets such as web blogs, tweets, message board reviews, and news. Companies can leverage opinion polarity and sentiment topic recognition to gain a deeper understanding of the drivers and the overall scope of sentiments. These insights can advance competitive intelligence, improve customer service, build a better brand image, and enhance competitiveness. This research paper proposes a sentiment mining approach that detects sentiment polarity and sentiment topic from text. The approach includes a sentiment topic recognition model based on Correlated Topics Models (CTM) with the Variational Expectation-Maximization (VEM) algorithm. We validate the effectiveness and efficiency of this model using airline data from Twitter. We also examine the reputation of three major airlines by computing their Airline Quality Rating (AQR) based on the output of our approach.

    Machine Learning Based Twitter Sentiment Analysis and User Influence

    The use of social media platforms such as Twitter has grown exponentially over the years, and they have become a valuable source of information for various fields, including marketing, politics, and finance. Sentiment analysis is particularly relevant in social media analysis. It involves the use of natural language processing (NLP) techniques to automatically determine the sentiment expressed in a given text, such as positive, negative, or neutral. In this research paper, we focus on Twitter sentiment analysis and on identifying the most influential users for a given topic. We propose a methodology based on machine learning techniques to perform sentiment analysis and identify the most influential users on Twitter based on popularity. Specifically, we utilize a combination of NLP techniques, sentiment lexicons, and machine learning algorithms to classify tweets as positive, negative, or neutral. We then employ popularity calculations for each user to identify the top 10 most influential users on a given topic. The proposed methodology was tested on a large dataset of US airline tweets, all related to a single topic (airlines), and the results show that the approach can effectively classify tweets according to sentiment and identify the most influential users. We evaluated the performance of several machine learning algorithms, including Multinomial Naive Bayes, Support Vector Machines (SVM), Decision Trees, Gradient Boosting, Logistic Regression, AdaBoost, KNN, and Random Forest, and found that logistic regression achieved the highest accuracy. The proposed methodology has several implications for various fields, such as marketing, where sentiment analysis can help companies understand consumer behavior and tailor their marketing strategies accordingly. Moreover, identifying the most influential users can provide insights into opinion leaders on a given topic and help companies and policymakers target their messages more effectively.

    Sentiment Analysis of Tweets Before the 2024 Elections in Indonesia Using Bert Language Models

    A general election is one of the crucial moments for a democratic country such as Indonesia. Good election preparation can increase people's participation in the general election. In this study, we conduct a sentiment analysis of Indonesian public opinion on the upcoming 2024 election using Twitter data and the IndoBERT model. This study aims to help the government and related institutions understand public perception, so that they can obtain valuable insights to better prepare for elections, including evaluating election policies, developing campaign strategies, increasing voter engagement, addressing issues and conflicts, and increasing transparency and public trust. The main contribution of this study is threefold: (i) the application of the state-of-the-art transformer-based model IndoBERT to sentiment analysis in the political domain; (ii) the empirical evaluation of the IndoBERT model against machine learning and lexicon-based models; and (iii) the creation of a new dataset for sentiment analysis in the political domain. Our Twitter data shows that the Indonesian public mostly reacts neutrally (83.7%) towards the upcoming 2024 election. The experimental results demonstrate that IndoBERT large-p1 is the best-performing model, achieving an accuracy of 83.5%. It improves on our baseline systems by 48.5% and 46.49% for TextBlob, 2.5% and 14.49% for Multinomial Naïve Bayes, and 3.5% and 13.49% for Support Vector Machine in terms of accuracy and F-1 score, respectively.

    Sentiment Analysis Of Student Opinion Related To Online Learning Using Naïve Bayes Classifier Algorithm And SVM With Adaboost On Twitter Social Media

    Twitter is one of the social media platforms used to express opinions on current issues or problems, such as those in the social, economic, educational, and other fields. One of the issues being discussed is online learning. The government has issued a policy under which all students study at home online, using a network to interact with each other as they would in the classroom. The government's reason for issuing this policy is to break the chain of the spread of the Covid-19 virus, which has not yet subsided. This online learning policy has both supporters and opponents, whose opinions are widely expressed on social media, including Twitter. Sentiment analysis is a method for analyzing opinions that aims to classify texts. The Naïve Bayes Classifier and Support Vector Machine are machine learning methods that can be used for sentiment analysis. The problem in classifying text is that the resulting accuracy is less than optimal, so feature selection or boosting is needed to improve it. In this study, boosting was carried out using AdaBoost. The purpose of this study is to compare the performance of the algorithms before and after using AdaBoost. In the sentiment analysis on online learning, the highest accuracy was obtained by the Naïve Bayes Classifier algorithm coupled with AdaBoost: 99.26%, with a precision of 99.39% and a recall of 99.20%.
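
Accuracy, precision, and recall, the three figures reported in this abstract, are usually derived from a classifier's confusion matrix. The snippet below is a minimal sketch of the standard binary definitions, assuming the study used them in their usual form. The function name `binary_metrics` and all counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives for the class treated as "positive".
    """
    total = tp + fp + fn + tn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

With these illustrative counts, accuracy is (40 + 45) / 100, precision is 40 / 50, and recall is 40 / 45. High accuracy alone can hide poor recall on a minority class, which is presumably why such studies report all three.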

    Analysis of the Impact of Vectorization Methods on Machine Learning-Based Sentiment Analysis of Tweets Regarding Readiness for Offline Learning

    Twitter users use social media to express emotions about something, whether criticism or praise. Analyzing the opinions or sentiments in the tweets that users send can identify their emotions about a particular topic. This study aims to determine the impact of vectorization methods on public sentiment analysis regarding readiness for offline learning in Indonesia during the Covid-19 pandemic. The authors labeled sentiment using two different approaches: manually, and automatically using the NLP library TextBlob. We compared three vectorization methods: count vectorization, TF-IDF, and a combination of both. The feature vectors were then classified using three classification methods (naïve Bayes, logistic regression, and k-nearest neighbor) for both manual and automatic labeling. To assess the performance of the sentiment analysis models, we used accuracy, precision, recall, and F1-score as performance metrics. The results showed that the logistic regression classifier, with the feature extraction technique that combines count vectorization and TF-IDF, provided the best performance for both manually and automatically labeled data.
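
The abstract does not say how count vectorization and TF-IDF were combined. As a minimal sketch under that caveat, the snippet below assumes the combination is a simple concatenation of the two feature vectors for each tweet. The smoothed IDF formula, the whitespace tokenizer, the helper names, and the example tweets are all illustrative assumptions, not the authors' pipeline.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Map each lowercase whitespace token to a column index (sorted order)."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def count_vectors(docs, vocab):
    """Raw term counts: one row per document, one column per vocabulary token."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows


def tfidf_vectors(docs, vocab):
    """Term count times a smoothed inverse document frequency (assumed variant)."""
    counts = count_vectors(docs, vocab)
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    return [[c * w for c, w in zip(row, idf)] for row in counts]


def combined_vectors(docs, vocab):
    """Concatenate count features and TF-IDF features for each document."""
    counts = count_vectors(docs, vocab)
    tfidf = tfidf_vectors(docs, vocab)
    return [c_row + t_row for c_row, t_row in zip(counts, tfidf)]


# Hypothetical example tweets.
tweets = ["school reopening is great", "not ready for school", "great great news"]
vocab = build_vocab(tweets)
features = combined_vectors(tweets, vocab)
```

Each row of `features` has twice as many columns as the vocabulary: the first half holds raw counts and the second half holds the TF-IDF weights. A classifier such as logistic regression would then be trained on these rows.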

    SENTIMENT ANALYSIS ON TWITTER BY USING MAXIMUM ENTROPY AND SUPPORT VECTOR MACHINE METHOD

    With the advancement and growth of social media, a great deal of data is available for research in social mining. Twitter is a microblogging platform that can be used for this purpose. Many companies use Twitter data to analyze their customers' satisfaction with product quality. At the same time, many users use social media to express their daily emotions. This can be developed into research that serves both to improve product quality and to analyze opinion on certain events. Such research is often called sentiment analysis or opinion mining. Previous research has identified particularly useful features for sentiment analysis, but performance is still lacking. Furthermore, that research used Support Vector Machine as the classification method, whereas most researchers have found other classification methods that are considered more efficient, such as Maximum Entropy. This research therefore uses two types of dataset: general opinion data and airline opinion data. For feature extraction, we employ four feature types: pragmatic, lexical-grams, pos-grams, and sentiment lexical. For classification, we use both Support Vector Machine and Maximum Entropy to find the best result. In the end, the best result is achieved by Maximum Entropy, with 85.8% accuracy on the general opinion data and 92.6% accuracy on the airline opinion data.

    Tweet-based Target Market Classification Using Ensemble Method

    Target market classification is aimed at focusing marketing activities on the right targets. Classification of target markets can be done through data mining and by utilizing data from social media, e.g. Twitter. The end results of data mining are learning models that can classify new data. Ensemble methods can improve the accuracy of the models and therefore provide better results. In this study, classification of target markets was conducted on a dataset of 3000 tweets from which features were extracted. Classification models were constructed using two ensemble methods (bagging and boosting), which manipulate the training data. To investigate the effectiveness of the ensemble methods, this study used the CART (classification and regression tree) algorithm for comparison. Three categories of consumer goods (computers, mobile phones and cameras) and three categories of sentiments (positive, negative and neutral) were classified towards three target-market categories. Machine learning was performed using Weka 3.6.9. The results on the test data showed that the bagging method improved the accuracy of CART by 1.9% (to 85.20%). For sentiment classification, on the other hand, the ensemble methods did not succeed in increasing the accuracy of CART. The results of this study may be taken into consideration by companies that approach their customers through social media, especially Twitter.
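
Bagging, one of the two ensemble methods named in this abstract, trains each base model on a bootstrap resample of the training data and combines the predictions by majority vote. The following is a minimal sketch under stated assumptions: a one-dimensional decision stump (a depth-1 tree) stands in for CART, the data are hypothetical, and all function names are illustrative. It is not the study's Weka configuration.

```python
import random
from collections import Counter


def predict_stump(stump, x):
    """Predict 0/1 from a (threshold, polarity) pair."""
    threshold, polarity = stump
    above = x > threshold
    return int(above) if polarity == 1 else int(not above)


def fit_stump(xs, ys):
    """Pick the threshold and polarity with the fewest training errors."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            candidate = (threshold, polarity)
            errors = sum(predict_stump(candidate, x) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, candidate)
    return best[1]


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagging(stumps, x):
    """Majority vote over all stumps in the ensemble."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical, linearly separable toy data: class 0 for x < 5, class 1 otherwise.
xs = list(range(10))
ys = [0] * 5 + [1] * 5
ensemble = fit_bagging(xs, ys)
```

Because each stump sees a slightly different resample, individual thresholds vary, and the vote averages out that variance. An odd `n_estimators` avoids tied votes in the binary case.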

    CLASSIFICATION METHODS ON SENTIMENT ANALYSIS OF TOURISTS ON AIRLINES IN TWITTER

    Sentiment analysis is a field of study for finding public opinion on a particular topic of discussion. Text mining is widely used by individuals and companies to improve performance and to address public complaints about services or brands in the business world. One such business is the airline industry, where public complaints about particular airlines are posted on Twitter. These complaints can greatly affect the airlines themselves, because social media is a far-reaching channel for advertising and trade. Several machine learning classification methods, such as Logistic Regression, KNeighbors Classifier, Support Vector Classifier (SVC), Decision Tree Classifier, Random Forest Classifier, and Gaussian, are used, and the performance of each method is compared to find the best results.