2 research outputs found

    Sentiment Classification Using a Sense Enriched Lexicon-based Approach

    The prominent approach in sentiment polarity classification is the Lexicon-based approach, which relies on a dictionary to assign a score to subjective words. Most existing work uses the score of the most dominant sense in this process instead of the contextually appropriate sense, and the use of Word Sense Disambiguation (WSD) has been little investigated in sentiment classification tasks. This paper investigates the effect of integrating WSD into a Lexicon-based approach for sentiment polarity classification and compares it with existing Lexicon-based approaches and state-of-the-art supervised approaches. The lexicon used in this work is SentiWordNet v2.0. The proposed approach, called Sense Enriched Lexicon-based Approach (SELSA), uses a word sense disambiguation module to identify the correct sense of subjective words. Instead of using the score of the most frequent sense, it uses only the score of the contextually appropriate sense. For comparison with supervised approaches, the authors investigate Naïve Bayes (NB) and Support Vector Machines (SVM) classifiers, which tended to perform well in earlier research. The performance of these classifiers is evaluated using Word2vec, Hashing Vectorizer, and bi-gram features, and the best-performing classifier-feature combination is used for comparison. All evaluations are done on the Movie Review dataset. SELSA achieves an accuracy of 96.25%, which is significantly better than the accuracy obtained by the SentiWordNet-based approach without WSD on the same dataset. The performance of the proposed algorithm is also compared with the best-performing supervised classifier investigated in this work and with earlier reported works on the same dataset. The results reveal that the SVM classifier performs better than the SentiWordNet approach without WSD. However, after incorporating WSD, the performance of the proposed Lexicon-based approach improves significantly and surpasses the best-performing supervised classifier (SVM with bi-gram features).
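The difference between scoring with the most frequent sense and scoring with a disambiguated sense can be illustrated with a small sketch. This is not the SELSA implementation: the two-word lexicon, its glosses and scores, and the function names are all hypothetical, and the disambiguation step is a simplified Lesk-style gloss overlap chosen only because it is easy to show. The abstract does not say which WSD method the authors use.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon (NOT SentiWordNet data): word -> senses, most
# frequent sense first. Each sense is (gloss, polarity), where polarity
# stands in for a positive-minus-negative lexicon score.
LEXICON: Dict[str, List[Tuple[str, float]]] = {
    "cold": [
        ("having a low temperature weather ice", 0.0),
        ("lacking emotion or warmth unfriendly person acting", -0.75),
    ],
    "sharp": [
        ("having a thin cutting edge knife blade", 0.0),
        ("clever quick witty dialogue writing", 0.625),
    ],
}


def first_sense_score(tokens: List[str]) -> float:
    """Baseline: always use the score of the most frequent sense."""
    return sum(LEXICON[t][0][1] for t in tokens if t in LEXICON)


def pick_sense(word: str, context: Set[str]) -> int:
    """Simplified Lesk: pick the sense whose gloss overlaps the context most.

    Ties fall back to the earlier (more frequent) sense.
    """
    best_index, best_overlap = 0, -1
    for index, (gloss, _) in enumerate(LEXICON[word]):
        overlap = len(set(gloss.split()) & context)
        if overlap > best_overlap:
            best_index, best_overlap = index, overlap
    return best_index


def sense_aware_score(tokens: List[str]) -> float:
    """Use only the score of the contextually selected sense."""
    context = set(tokens)
    total = 0.0
    for t in tokens:
        if t in LEXICON:
            total += LEXICON[t][pick_sense(t, context)][1]
    return total


def classify(score: float) -> str:
    """Map an aggregate score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "the acting was cold but the dialogue was sharp and witty".split()
print(classify(first_sense_score(review)), classify(sense_aware_score(review)))
```

In this toy case the first-sense baseline sees only the literal senses of "cold" and "sharp" and scores the sentence as neutral, while the sense-aware variant picks up the evaluative senses from the surrounding words.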

    Exploring Public Sentiment: A Sentiment Analysis of GST Discourse on Twitter using Supervised Machine Learning Classifiers

    India's introduction of the Goods and Services Tax (GST) was a key economic move that resulted in heated disputes. Social media channels offered a widely used forum for people to express their views on the GST, providing insightful data for gauging mood and guiding future revisions. The sentiment of 5629 GST-related tweets, obtained using the Twitter Developer API, was assessed using the VADER lexicon. The tf-idf feature was used for text vectorization, with 80% of the data used for training and the remaining 20% for testing. In this study, six well-known classifiers (the Ridge Classifier, Logistic Regression, Linear SVC, Perceptron, Decision Tree, and K-Nearest Neighbor) were thoroughly compared to evaluate their performance in a range of circumstances. The performance measurements included accuracy, precision, recall, f-score, and training and testing times. The study presented novel pre-processing methods and examined the training/testing times before concluding that the Ridge Classifier performed better than the others in terms of accuracy, precision, and efficiency.
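The data preparation described above (an 80/20 split followed by tf-idf vectorization) can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' pipeline: the function names are hypothetical, tokenization is plain whitespace splitting, and the weighting is the textbook form tf × log(N/df) rather than the smoothed variants that libraries often use. The VADER labelling step and the classifiers themselves are not shown.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def train_test_split(items: Sequence, test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[list, list]:
    """Shuffle a copy of the items and hold out a fraction for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_idf(docs: List[str]) -> Dict[str, float]:
    """Compute idf = log(N / df) from the training documents only."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc.split()))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf_vector(doc: str, idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse tf-idf vector; terms unseen during fitting are dropped."""
    counts = Counter(doc.split())
    total = sum(counts.values())
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


# Hypothetical example texts, not taken from the study's data.
tweets = [f"gst tweet number {i}" for i in range(10)]
train, test = train_test_split(tweets)
idf = fit_idf(train)
vectors = [tfidf_vector(doc, idf) for doc in test]
print(len(train), len(test))
```

Fitting the idf weights on the training portion alone keeps information from the held-out tweets out of the features, which is the usual reason for splitting before vectorizing.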