Improving Breast Cancer Classification with Adaptive Synthetic Sampling, Feature Selection, and Hyperparameter Optimization

Abstract

Breast cancer is a major global health concern, and persistent issues with detection accuracy highlight the need for accurate and efficient diagnostic solutions. This study presents an enhanced machine learning framework that improves breast cancer classification by addressing three key limitations: class imbalance, irrelevant features, and suboptimal hyperparameters. Adaptive synthetic sampling (ADASYN) was used to balance the class distribution; feature selection techniques, including univariate selection and recursive feature elimination, improved feature relevance; and arctic puffin optimization (APO) was applied for hyperparameter tuning. Multiple classifiers were evaluated using the Wisconsin Diagnostic Breast Cancer dataset. The random forest (RF) with ADASYN approach, optimized using APO, achieved outstanding results (99.53% accuracy, 100% precision, 99.07% recall, and 99.53% F1-score), with only one misclassification out of 569 samples. This framework, while not modifying the ADASYN or RF algorithms themselves, significantly enhances diagnostic performance and serves as a robust foundation for clinical decision support systems.
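The four metrics quoted in the abstract all derive from the counts in a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. The function name `classification_metrics` and the counts passed to it are hypothetical: the abstract does not report a confusion matrix, so the values are chosen only to illustrate a case with a single false negative and no false positives.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (assumption, not taken from the paper):
# one missed positive case, no false alarms.
metrics = classification_metrics(tp=106, fp=0, fn=1, tn=107)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With no false positives, precision is exactly 100%, while a single false negative lowers recall and, through it, the F1-score, which is the harmonic mean of the two.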

ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY

Last time updated on 19/04/2026

Licence: https://creativecommons.org/licenses/by-nc-sa/4.0