Comparative Analysis of the C5.0 Algorithm and Other Machine Learning Models for Early Detection of Multi-Class Heart Disease

Abstract

Cardiovascular diseases are the leading cause of mortality worldwide, making accurate and early detection critical for effective medical intervention and improved patient prognosis. While machine learning (ML) offers promising tools for predictive diagnostics, many existing studies rely on single-algorithm approaches or weak validation methods, which limits the generalizability and real-world applicability of their findings. This study conducts a rigorous, head-to-head comparative evaluation of multiple machine learning algorithms for the multi-class classification of heart disease, with the goal of identifying the most effective and reliable model for this complex clinical task. We used a private dataset comprising 300 patient medical records, each described by 11 clinically relevant features. To ensure a robust and unbiased evaluation, stratified 5-fold cross-validation was employed. Five widely used classification algorithms were evaluated: Naïve Bayes (NB), Logistic Regression (LR), Random Forest (RF), a C5.0-analog Decision Tree (DT), and Support Vector Machine (SVM). Model performance was assessed using standard metrics: accuracy, precision, recall, and F1-score. The comparative analysis showed that the Naïve Bayes algorithm performed best among the five models, achieving the highest mean accuracy of 43.33% (±4.22%). It also led in the other key metrics, with a mean precision of 43.40%, recall of 43.64%, and F1-score of 41.26%. Other algorithms, such as Logistic Regression (40.67% accuracy) and Random Forest (39.33% accuracy), were competitive but were ultimately surpassed by the Naïve Bayes model in this specific multi-class classification context. This research underscores the importance of robust validation techniques and comprehensive comparative analyses for identifying optimal models for clinical applications. The Naïve Bayes algorithm emerges as a strong candidate for developing a reliable clinical decision support system for the early differentiation of various heart conditions, providing a foundation for future data-driven diagnostic tools.
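
The evaluation protocol described in the abstract (stratified 5-fold cross-validation, scored by accuracy, precision, recall, and F1, and reported as a mean with a spread across folds) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the dataset is private, so the labels below are hypothetical stand-ins (300 records in 5 balanced classes, a class count the abstract does not state), macro averaging is assumed for the multi-class metrics, the spread is computed as a sample standard deviation, and `majority_baseline` is a placeholder where a real classifier such as Naïve Bayes would be plugged in.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, stdev


def stratified_k_fold(labels, k=5, seed=0):
    """Return k lists of test indices whose class mix mirrors the full label list."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (classes weighted equally)."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": mean(precisions),
        "recall": mean(recalls),
        "f1": mean(f1s),
    }


def cross_validate(labels, fit_predict, k=5, seed=0):
    """Stratified k-fold CV; fit_predict(train_idx, test_idx) returns test predictions."""
    folds = stratified_k_fold(labels, k, seed)
    per_fold = []
    for i, test_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        y_true = [labels[idx] for idx in test_idx]
        per_fold.append(macro_scores(y_true, fit_predict(train_idx, test_idx)))
    # Summarise each metric as (mean, sample standard deviation) across the k folds.
    return {
        metric: (mean([s[metric] for s in per_fold]), stdev([s[metric] for s in per_fold]))
        for metric in per_fold[0]
    }


# Hypothetical stand-in labels: 300 records, 5 balanced classes (real data is private).
labels = [i % 5 for i in range(300)]


def majority_baseline(train_idx, test_idx):
    # Placeholder model: always predict the most common label seen in the training folds.
    most_common = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return [most_common] * len(test_idx)


results = cross_validate(labels, majority_baseline)
for metric, (avg, sd) in results.items():
    print(f"{metric}: {avg:.2%} (±{sd:.2%})")
```

Swapping `majority_baseline` for each of the five classifiers, while keeping the same folds, gives the kind of like-for-like comparison the study reports. In practice a library such as scikit-learn provides equivalent building blocks (for example `StratifiedKFold` and `precision_recall_fscore_support`), which would normally be preferred over hand-written versions.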

This paper was published in Jurnal Politeknik Negeri Batam (PoliBatam).

Licence: http://creativecommons.org/licenses/by-sa/4.0