Interpretable hybrid ensemble with attention-based fusion and EAOO-GA optimization for lung cancer detection.
Authors
Affiliations (4)
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj, 16273, Saudi Arabia.
- Nursing Department, College of Applied Medical Sciences, Prince Sattam bin Abdulaziz University, Wade Aldwaser, Saudi Arabia.
- Physics Department, Al-Azhar University, Asyut, 71524, Egypt.
- Faculty of Computers and Information, Minia University, Minia, Egypt.
Abstract
Lung cancer’s high mortality rate underscores the critical need for early and accurate diagnosis: 5-year survival rates can be as low as 5% for late-stage diagnoses, compared with 56% for early detection, and late diagnosis imposes significant economic burdens on healthcare systems and diminishes patient quality of life. While deep learning models are promising tools for analyzing Computed Tomography (CT) scans, they often suffer from limited generalizability, limited interpretability, and sensitivity to imbalanced data. This paper introduces the SE-FusionEAOO Ensemble, a new framework for lung cancer classification that combines multiple deep learning architectures in a two-stage process. First, we construct three feature fusion models by pairing diverse pre-trained networks (DenseNet201/EfficientNetB6, Inception v3/MobileNetV2, DenseNet121/ResNet50), each integrated with Squeeze-and-Excitation (SE) blocks for adaptive feature recalibration. Second, we combine the predictions of these three models using a weighted aggregation scheme. The key innovation of our framework is a new metaheuristic, the Enhanced Animated Oat Optimization algorithm with Genetic Operators (EAOO-GA), which optimizes the ensemble weights that determine each model’s contribution. To address class imbalance in the IQ-OTH/NCCD lung cancer dataset, we employ the Synthetic Minority Over-sampling Technique (SMOTE), which significantly improves the model’s sensitivity to minority classes. Extensive experimental results show that our framework achieves a state-of-the-art accuracy of 99.40%, with 99.2% precision, 99.5% recall, and 99.3% F1-score, outperforming individual models, conventional ensemble methods, and other metaheuristic optimizers.
Additionally, the model was externally validated on the LIDC-IDRI dataset, achieving 97.9% accuracy and 97.8% F1-score, confirming its strong generalization capability across independent clinical domains. The proposed framework provides a highly accurate, reliable, and interpretable tool for automated lung cancer detection.
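
The second stage described in the abstract, weighted aggregation of the three fusion models' predictions, can be illustrated with a minimal sketch. The abstract does not give the aggregation rule or the details of EAOO-GA, so the following is an assumption-laden illustration rather than the authors' method: it assumes each model outputs class probabilities, fuses them by a normalized weighted average (soft voting), and searches for weights with plain random search as a stand-in for EAOO-GA. All function names and the toy probabilities are hypothetical.

```python
import random

def weighted_soft_vote(prob_sets, weights):
    """Fuse per-model class probabilities using weights normalized to sum to 1."""
    total = sum(weights)
    norm = [w / total for w in weights]
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [sum(norm[m] * prob_sets[m][i][c] for m in range(n_models))
         for c in range(n_classes)]
        for i in range(n_samples)
    ]

def accuracy(fused, labels):
    """Fraction of samples whose highest fused probability matches the label."""
    preds = [max(range(len(p)), key=p.__getitem__) for p in fused]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)

def random_search_weights(prob_sets, labels, n_trials=200, seed=0):
    """Stand-in optimizer (random search, NOT EAOO-GA): keep the best weights seen."""
    rng = random.Random(seed)
    n_models = len(prob_sets)
    best_w = [1.0 / n_models] * n_models  # start from equal weighting
    best_acc = accuracy(weighted_soft_vote(prob_sets, best_w), labels)
    for _ in range(n_trials):
        w = [rng.random() + 1e-9 for _ in range(n_models)]
        acc = accuracy(weighted_soft_vote(prob_sets, w), labels)
        if acc > best_acc:
            s = sum(w)
            best_w, best_acc = [x / s for x in w], acc
    return best_w, best_acc

# Toy validation-set probabilities for three hypothetical fusion models
# (4 samples, 3 classes); the values are made up for illustration only.
model_a = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.4, 0.5, 0.1], [0.2, 0.2, 0.6]]
model_b = [[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.5, 0.3, 0.2], [0.1, 0.3, 0.6]]
model_c = [[0.3, 0.5, 0.2], [0.2, 0.6, 0.2], [0.6, 0.2, 0.2], [0.3, 0.3, 0.4]]
labels = [0, 1, 0, 2]

prob_sets = [model_a, model_b, model_c]
best_w, best_acc = random_search_weights(prob_sets, labels)
print("weights:", best_w, "accuracy:", best_acc)
```

In practice the weights would be tuned on a held-out validation split and then frozen before evaluation on the test set or an external dataset; a population-based metaheuristic would replace the random-search loop while keeping the same fitness function.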