
    A Bayesian-Based Approach for Public Sentiment Modeling

    Public sentiment is a direct, public-centric indicator of the success of effective action planning. Despite its importance, systematic modeling of public sentiment remains largely unexplored in previous studies. This research aims to develop a Bayesian-based approach to quantitative public sentiment modeling that is capable of incorporating uncertainty and guiding the selection of public sentiment measures. The study comprises three steps: (1) quantifying prior sentiment information and new sentiment observations with a Dirichlet distribution and a multinomial distribution, respectively; (2) deriving the posterior distribution of sentiment probabilities by combining the Dirichlet and multinomial distributions via Bayesian inference; and (3) measuring public sentiment by aggregating sampled sets of sentiment probabilities with an application-based measure. A case study on Hurricane Harvey demonstrates the feasibility and applicability of the proposed approach, which also has the potential to be generalized to model various types of probability-based measures.
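    The three steps rest on Dirichlet-multinomial conjugacy: with a Dirichlet prior and multinomial counts, the posterior is again Dirichlet with the counts added to the prior parameters. The sketch below is a minimal illustration of that mechanic, not the authors' code; the three sentiment categories, the prior parameters, the observed counts, and the "probability of negative sentiment" measure are all assumptions made for the example.

```python
import numpy as np

# Hypothetical three-way sentiment scheme; the paper's actual categories
# and counts are not given in the abstract.
categories = ["negative", "neutral", "positive"]
alpha_prior = np.array([2.0, 2.0, 2.0])   # assumed Dirichlet prior (step 1)
observed = np.array([120, 45, 35])        # assumed multinomial counts of new tweets

# Step 2: Dirichlet-multinomial conjugacy gives posterior Dirichlet(alpha + counts).
alpha_post = alpha_prior + observed

# Step 3: sample sets of sentiment probabilities and aggregate them with an
# application-based measure (here, an illustrative negative-sentiment probability).
rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha_post, size=10_000)   # shape (10000, 3)
neg_prob = samples[:, 0]

print(f"posterior mean P(negative) = {neg_prob.mean():.3f}")
print(f"95% credible interval      = ({np.quantile(neg_prob, 0.025):.3f}, "
      f"{np.quantile(neg_prob, 0.975):.3f})")
```

    Because the posterior is available in closed form, the sampling step is cheap, and the credible interval is one way the approach can expose uncertainty alongside the point estimate.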

    The Development of a Temporal Information Dictionary for Social Media Analytics

    Dictionaries were used to analyze text even before the emergence of social media and their adoption there for sentiment analysis. While dictionaries have been used to understand the tonality of a text, it has so far not been possible to automatically detect whether that tonality refers to the present, past, or future. In this research, we develop a dictionary of time-indicating words (T-wordlist). To test how the dictionary performs, we apply the T-wordlist to several disaster-related social media datasets and subsequently validate the wordlist and the results through a manual content analysis. So far, in this research-in-progress, we have developed a first dictionary and will also provide some initial insight into the performance of our wordlist.
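    To make the dictionary idea concrete, the following sketch shows one plausible way a time-indicating wordlist could be applied to a tweet. The sample entries and the past/present/future labels are invented for illustration; the actual T-wordlist is not published in the abstract.

```python
import re
from collections import Counter

# Invented sample entries; the real T-wordlist is not given in the abstract.
T_WORDLIST = {
    "past":    {"yesterday", "was", "happened", "ago"},
    "present": {"now", "currently", "is", "today"},
    "future":  {"tomorrow", "will", "soon", "expected"},
}

def temporal_profile(text: str) -> Counter:
    """Count past/present/future indicator words in one tweet."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for label, words in T_WORDLIST.items():
        counts[label] = sum(tok in words for tok in tokens)
    return counts

print(temporal_profile("Floodwater is rising now, evacuations will start tomorrow"))
# Counter({'present': 2, 'future': 2, 'past': 0})
```

    A manual content analysis, as the authors propose, would then check whether tweets a coder judges to be forward-looking actually score highest on the "future" count.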

    Using Machine Learning in Disaster Tweets Classification

    People share real-time updates on social media platforms (e.g., Twitter) when a disaster occurs, and this information is very valuable for disaster relief and response teams because it can alert them immediately and help them prioritize tasks. Text mining and machine learning algorithms can scan the huge volume of unstructured data generated on platforms such as Twitter to spot such information through keywords and phrases that refer to disasters. One challenge such an algorithm faces is whether a tweet is describing a real disaster or using those keywords as a metaphor, which can lead to large-scale mislabeling of tweets. Hence, this research aims to use Natural Language Processing (NLP) and classification models to distinguish between real and fake disaster tweets. The dataset was acquired from the Kaggle website and contains tweets related to real disasters alongside tweets that refer to fake ones. Using RStudio, exploratory data analysis (EDA), feature selection, and data cleaning were performed prior to modeling, and two different training-to-testing splits were tested. In addition, four classifiers were built: SVM, KNN, Naïve Bayes, and XGBoost. The best accuracies were achieved with the 80/20 split and with using the whole dataset rather than a sample: SVM and XGBoost performed well, with accuracies of 80% and 78% respectively, while KNN suffered from overfitting (99% accuracy) and Naïve Bayes performed poorly (65%).
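    The abstract names the models and the 80/20 split but not the exact features. The sketch below shows one plausible TF-IDF plus linear-SVM pipeline in scikit-learn, in Python rather than the RStudio workflow the paper actually used; the file name and the "text"/"target" column names are assumptions based on the public Kaggle disaster-tweets dataset.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Assumed file and column names, following the Kaggle disaster-tweets dataset;
# the paper's exact RStudio preprocessing is not reproduced here.
df = pd.read_csv("train.csv")

# 80/20 split, matching the better-performing ratio reported in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["target"], test_size=0.2, random_state=42, stratify=df["target"]
)

# TF-IDF features feeding a linear SVM, one of the four classifiers compared.
model = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
model.fit(X_train, y_train)

print(f"test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```

    Swapping the final pipeline step for a KNN, Naïve Bayes, or XGBoost estimator would reproduce the kind of four-way comparison the study reports.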