Hybrid CNN–GRU–XGBoost framework for optimized coronary artery disease diagnosis and risk stratification
Abstract
Coronary artery disease (CAD) remains a leading driver of cardiovascular mortality, requiring diagnostic systems that deliver high discrimination, stability under class imbalance, and reproducible deployment. This work presents a hybrid pipeline that integrates convolutional encoders for localized feature interactions, gated recurrent units (GRUs) for conditional dependency modeling over ordered clinical attributes, and an extreme gradient-boosted tree classifier (XTree) for nonlinear decision refinement and feature-level attribution. The data pathway applies strict train-split isolation with source-aware quantile imputation, proximal denoising, robust trimmed standardization, stratified partitioning that preserves class and source distributions, and manifold-conformal minority augmentation to improve boundary coverage without leakage. Evaluation on the UCI Heart Disease cohort (Cleveland, Long Beach VA, Switzerland, Hungary; p=14 attributes) used an 80/20 holdout and standard metrics. The proposed CNN–GRU–XTree attained 96.03% accuracy, 94.70% precision, 97.66% sensitivity, 96.17% F1, and 94.35% specificity. Relative to the strongest non-proposed baseline (CNN–LSTM–XTree: 95.63% accuracy, 94.70% precision, 96.90% sensitivity, 95.80% F1, 94.47% specificity), gains reached +0.40 percentage points (pp) in accuracy, +0.76 pp in sensitivity, and +0.37 pp in F1, with parity in precision and a negligible specificity delta (−0.12 pp). Against CNN–GRU–RF (95.24%/94.70%/96.15%/95.42%/94.18%), improvements were +0.79 pp accuracy, +1.51 pp sensitivity, +0.75 pp F1, and +0.17 pp specificity. Case-based simulations (600 min each) probed model behavior under clinically distinct conditions. In severe, unequivocal CAD, sensitivity remained ≥97% with specificity >93%; in low-risk asymptomatic profiles, specificity remained ≥95% with precision ≥93%; in borderline phenotypes with overlapping risk markers, F1 exceeded 94% while maintaining balanced error profiles.
- info:eu-repo/semantics/article
- info:eu-repo/semantics/publishedVersion
- Clinical decision support
- Convolutional neural networks
- Coronary artery disease
- Deep learning
- Ensemble learning
- Extreme gradient boosting
- Feature extraction
- Gated recurrent units
- Medical diagnosis
- /dk/atira/pure/subjectarea/asjc/1300/1315; name=Structural Biology
- /dk/atira/pure/subjectarea/asjc/1300/1303; name=Biochemistry
- /dk/atira/pure/subjectarea/asjc/1600/1605; name=Organic Chemistry
- /dk/atira/pure/subjectarea/asjc/2600/2605; name=Computational Mathematics
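
The percentage-point comparisons quoted in the abstract can be rechecked with a few lines of arithmetic. The sketch below is illustrative only: the metric values are copied from the abstract, while the names `reported`, `pp_delta`, and `f1_from` are hypothetical helpers introduced here and are not part of the authors' code.

```python
# Reported holdout metrics (percent), copied from the abstract.
# Order: accuracy, precision, sensitivity, F1, specificity.
METRICS = ("accuracy", "precision", "sensitivity", "f1", "specificity")
reported = {
    "CNN-GRU-XTree": (96.03, 94.70, 97.66, 96.17, 94.35),
    "CNN-LSTM-XTree": (95.63, 94.70, 96.90, 95.80, 94.47),
    "CNN-GRU-RF": (95.24, 94.70, 96.15, 95.42, 94.18),
}


def pp_delta(model, baseline):
    """Per-metric difference (model minus baseline) in percentage points."""
    return {
        name: round(a - b, 2)
        for name, a, b in zip(METRICS, reported[model], reported[baseline])
    }


def f1_from(precision, sensitivity):
    """Harmonic mean of precision and sensitivity, both given in percent."""
    return 2 * precision * sensitivity / (precision + sensitivity)


if __name__ == "__main__":
    for baseline in ("CNN-LSTM-XTree", "CNN-GRU-RF"):
        print(baseline, pp_delta("CNN-GRU-XTree", baseline))
    # F1 recomputed from the rounded precision and sensitivity values.
    print(round(f1_from(94.70, 97.66), 2))
```

Because the inputs are already rounded to two decimals, the recomputed F1 can differ from the reported 96.17% by a few hundredths of a point; the deltas reproduce the gains stated in the abstract.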