Text classification of traditional and national songs using naïve bayes algorithm

Ismail, Amelia Ritahani; Simbolon, Triyanti; Wibawa, Aji Prasetya; Zaeni, Ilham Ari Elbaith

Publication date: 30 November 2022
Publisher: Association for Scientific Computing Electronics and Engineering (ASCEE)

Abstract

This study evaluates the multinomial Naïve Bayes algorithm for text classification, focusing on distinguishing folk songs from national songs. Naïve Bayes was chosen because it evaluates word frequencies not only within individual documents but across the entire dataset, which improves accuracy and stability. The dataset comprises 480 folk songs and 90 national songs and is evaluated under six scenarios: two, four, and 31 labels, each with and without the Synthetic Minority Over-sampling Technique (SMOTE). Pre-processing consists of case folding, punctuation removal, tokenization, and TF-IDF transformation. The texts are then classified with the multinomial Naïve Bayes algorithm and tested using k-fold cross-validation and SMOTE resampling. The best scenario applies SMOTE to two labels and reaches an accuracy of 93.75%. These findings indicate that the multinomial Naïve Bayes algorithm is effective at classifying data with a small number of label categories.
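The pipeline described in the abstract (case folding, punctuation removal, tokenization, TF-IDF weighting, then multinomial Naïve Bayes) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the four lyric snippets are invented toy data, all function names are hypothetical, and the smoothed IDF formula and the smoothing constant alpha = 1 are assumptions, since the abstract does not specify them. SMOTE resampling and k-fold cross-validation are left out to keep the sketch short.

```python
import math
import string
from collections import Counter, defaultdict


def preprocess(text):
    """Case folding, ASCII punctuation removal, whitespace tokenization."""
    lowered = text.lower()
    stripped = lowered.translate(str.maketrans("", "", string.punctuation))
    return stripped.split()


def fit_idf(token_lists):
    """Smoothed inverse document frequency (assumed variant) per training term."""
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(tokens, idf):
    """TF-IDF weights for one document; terms unseen in training are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    weight = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for term, w in vec.items():
            weight[label][term] += w
    class_docs = Counter(labels)
    log_prior = {c: math.log(n / len(labels)) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        total = sum(weight[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((weight[c][t] + alpha) / total) for t in vocab}
    return log_prior, log_like


def predict(vec, log_prior, log_like):
    """Return the class with the highest log posterior score."""
    scores = {}
    for c in log_prior:
        scores[c] = log_prior[c] + sum(w * log_like[c][t] for t, w in vec.items())
    return max(scores, key=scores.get)


# Invented toy lyrics (not from the paper's dataset).
train_texts = [
    "The river sings beside our village, the harvest dance begins!",
    "Grandmother hums by the river while the village sleeps.",
    "Raise the flag, our nation stands united and free!",
    "For the nation, for the flag, we march as one.",
]
train_labels = ["folk", "folk", "national", "national"]

train_tokens = [preprocess(t) for t in train_texts]
idf = fit_idf(train_tokens)
train_vectors = [tfidf(tokens, idf) for tokens in train_tokens]
log_prior, log_like = train_nb(train_vectors, train_labels, vocab=set(idf))


def classify(text):
    """Run the full pipeline on one new lyric."""
    return predict(tfidf(preprocess(text), idf), log_prior, log_like)


print(classify("Children dance by the river at harvest"))
print(classify("We salute the flag of our nation"))
```

Working in log space avoids numerical underflow when many term likelihoods are multiplied, and the additive smoothing keeps a term that never occurs in one class from zeroing out that class entirely. With a real dataset as imbalanced as the one described (480 versus 90 songs), a resampling step such as SMOTE would be applied to the training vectors before fitting.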

Available Versions

Association for Scientific Computing Electronics and Engineering (ASCEE): Open Journal Systems

oai:ojs.pubs2.ascee.org:articl...

Last time updated on 21/11/2023