An Efficient Ensemble Method Using K-Fold Cross Validation for the Early Detection of Benign and Malignant Breast Cancer

Abstract

Breast cancer is the most common cancer among women. It is considered a serious threat to women's health, and its prediction has been studied by several researchers. Because robust predictive models are lacking, clinicians find it difficult to design treatment approaches that help patients live longer. Reported rates of this malignancy have risen with industrialization and urbanization, as well as with the wider availability of early detection facilities. It remains considerably more prevalent in highly developed countries, but it is rapidly becoming more common in developing countries as well. This work reports on breast cancer survivability prediction models that we constructed using currently available techniques. The Machine Learning (ML) techniques Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT) classifier, Random Forest (RF), and Logistic Regression (LR) are used as base learners, and their performance is compared with that of the ensemble method eXtreme Gradient Boosting (XGBoost). For the performance comparison, we employed k-fold cross-validation to obtain an unbiased estimate of each prediction model's performance. The results indicate that XGBoost outperformed the other ML algorithms, with an accuracy of 97.81%.
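As a rough illustration of the k-fold cross-validation procedure mentioned above, the following minimal Python sketch splits a dataset into k folds, trains on k-1 of them, tests on the held-out fold, and averages the per-fold accuracies. Everything in it is an assumption made for illustration: the data are synthetic two-class points rather than the breast cancer dataset used in the work, a simple nearest-neighbor rule stands in for the classifiers named in the abstract, k = 5 is chosen arbitrarily because the abstract does not state k, and all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]


def nearest_neighbor_predict(train_X, train_y, x):
    """Stand-in classifier (assumption): label of the closest training point."""
    best = min(
        range(len(train_X)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(train_X[j], x)),
    )
    return train_y[best]


def cross_val_accuracy(X, y, k=5, seed=0):
    """Return the list of per-fold accuracies and their mean."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th; test on the i-th.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        train_X = [X[j] for j in train_idx]
        train_y = [y[j] for j in train_idx]
        correct = sum(
            nearest_neighbor_predict(train_X, train_y, X[j]) == y[j]
            for j in test_idx
        )
        scores.append(correct / len(test_idx))
    return scores, sum(scores) / len(scores)


def make_synthetic_data(n_per_class=50, seed=0):
    """Two Gaussian clusters standing in for benign (0) and malignant (1) samples."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    scores, mean_acc = cross_val_accuracy(X, y, k=5)
    print("per-fold accuracy:", [round(s, 3) for s in scores])
    print("mean accuracy:", round(mean_acc, 3))
```

Because every sample is used for testing exactly once, the mean over folds depends less on one particular train/test split than a single hold-out estimate does. Comparing several models in this way would mean calling the same routine once per classifier with an identical fold assignment (the same seed), so that all models are scored on the same splits.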
