Diabetes Prediction Using The Smote-Cart Framework Model for Imbalanced Data Case

Noorizan, Farah Najidah; Jumadi, Nur Anida; Muhamad Amir Irfan Roslan; Ng, Li Mun; Manveer Pal Singh; Yukihiro Ishida

research article

oai:publisher.uthm.edu.my:article/23279

Authors: Farah Najidah Noorizan
Nur Anida Jumadi
Muhamad Amir Irfan Roslan
Li Mun Ng
Manveer Pal Singh
Yukihiro Ishida
Publication date: 10 February 2026
Publisher: 'Penerbit UTHM'

Abstract

Diabetes mellitus (DM) is characterized by chronically high blood glucose levels, which can result in long-term damage, dysfunction, and organ failure. With advances in technology, many researchers now use machine learning to predict diabetes, collecting patients’ demographic and health information and organizing it into a dataset. However, in most real-world data, non-diabetic cases outnumber diabetic cases, which biases the model toward the majority class and results in poor prediction of diabetic cases. Therefore, the Synthetic Minority Oversampling Technique (SMOTE) is applied to the dataset samples before training the Classification and Regression Tree (CART) model, in order to improve diabetes prediction. The proposed framework comprises a preprocessing step (SMOTE and categorical conversion), CART training, hyperparameter tuning, and evaluation. With a combination of 8 leaf numbers per node, a maximum of 10 splits, and deviance as the split criterion, the model achieves an overall accuracy of 98.72%, a precision of 98.94%, a sensitivity of 98.44%, and an F1-score of 98.67%. In conclusion, the proposed SMOTE-CART framework can effectively address the class imbalance in a diabetes dataset and improve the accuracy of diabetes prediction.
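
To make the two less familiar pieces of the abstract concrete, the following is a minimal Python sketch of SMOTE-style oversampling and of the four reported metrics. It is not the authors' implementation, which the abstract does not show: the function names `smote` and `scores`, the neighbour count `k`, the toy data, and the confusion counts are all assumptions made up for illustration, and CART training and hyperparameter tuning are left out. The sketch also assumes purely numeric features and at least two minority samples.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` inside the minority class
        # (the chosen sample itself is excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def scores(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall) and F1 from confusion counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical toy data: 3 minority (diabetic) rows against 6 majority rows.
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0)]
majority_count = 6
new_rows = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_rows  # now the same size as the majority

# Hypothetical confusion counts, only to show how the metrics are computed.
example = scores(tp=8, fp=2, tn=9, fn=1)
```

Because every synthetic row lies on a line segment between two real minority rows, the oversampled class stays inside the region the original minority samples already occupy. In a full pipeline the balanced data would then be passed to a decision-tree learner, and oversampling would normally be applied to the training split only so that the test metrics are not inflated.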

Last time updated on 11/02/2026

This paper was published in Journals of Universiti Tun Hussein Onn Malaysia (UTHM).

Licence: https://creativecommons.org/licenses/by-nc-sa/4.0