A Hybrid Multi-CNN Feature Fusion and LASSO Optimization Approach for High-Performance Breast Cancer Classification.
Authors
Affiliations (4)
- Department of Electronics, Faculty of Technology, University of M'sila, University Pole, Road Bordj Bou Arreridj, M'sila, 28000, Algeria. [email protected].
- Department of Electronics, Faculty of Technology, University of M'sila, University Pole, Road Bordj Bou Arreridj, M'sila, 28000, Algeria.
- Electrical Electronics & Communications Engineering Department, Istanbul Technical University (ITU), Istanbul, Turkey.
- School of Cybersecurity, Astana IT University, 010000, Astana, Kazakhstan.
Abstract
Breast cancer remains a leading cause of mortality among women worldwide, highlighting the urgent need for accurate and early detection methods to improve survival rates. Although deep learning, particularly transfer learning with pretrained CNNs, has shown promise in medical image analysis, relying on a single model may limit the depth of learned representations. We present a hybrid approach that leverages the complementary strengths of three well-established CNNs, namely MobileNetV2, DenseNet121, and InceptionV3, for robust breast cancer diagnosis. Our methodology extracts pertinent features from the Breast Ultrasound Images (BUSI) dataset of 780 images with each model individually, then fuses these features into a single high-dimensional combined vector. This fusion strategy effectively integrates diverse learned knowledge, thereby mitigating the risk of overfitting. Subsequently, we apply LASSO-driven feature selection to identify a reduced set of the most informative features within the fused representation, which enhances model interpretability and can improve generalizability. The dataset was partitioned using five-fold cross-validation to support reliable model training and evaluation, and all metrics are reported as mean ± standard deviation. The proposed method achieved promising results, with an accuracy of 99.23 ± 0.49%, sensitivity of 98.57 ± 1.90%, specificity of 99.54 ± 0.56%, precision of 99.07 ± 1.14%, F1-score of 98.80 ± 0.77%, and AUC of 0.9985 ± 0.0019, outperforming state-of-the-art techniques on the evaluated dataset. Future work will focus on validation using larger, multicenter datasets.
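Illustrative sketch (not from the paper): the following is a minimal reconstruction of the described pipeline, assuming Keras ImageNet-pretrained backbones and scikit-learn for the LASSO-driven selection and the downstream classifier; the input size, LASSO penalty (alpha), and the logistic-regression classifier are assumptions rather than the authors' exact configuration.

```python
# Minimal sketch of the multi-CNN fusion + LASSO selection pipeline.
# Assumptions (not from the paper): 224x224 RGB inputs, binary labels in {0, 1},
# LASSO alpha = 0.01, and a logistic-regression classifier on the selected features.
import numpy as np
import tensorflow as tf
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

apps = tf.keras.applications
BACKBONES = [
    (apps.MobileNetV2, apps.mobilenet_v2.preprocess_input),
    (apps.DenseNet121, apps.densenet.preprocess_input),
    (apps.InceptionV3, apps.inception_v3.preprocess_input),
]

def extract_fused_features(images):
    """Run each frozen ImageNet-pretrained CNN and concatenate the pooled features."""
    feats = []
    for build, preprocess in BACKBONES:
        model = build(include_top=False, weights="imagenet", pooling="avg",
                      input_shape=images.shape[1:])
        feats.append(model.predict(preprocess(images.copy()), verbose=0))
    return np.concatenate(feats, axis=1)  # high-dimensional fused feature vector

def cross_validate(images, labels, alpha=0.01, n_splits=5, seed=42):
    """Five-fold CV: LASSO keeps informative fused features, then a classifier is fit."""
    fused = extract_fused_features(images)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    accs = []
    for tr, te in skf.split(fused, labels):
        lasso = Lasso(alpha=alpha, max_iter=10_000).fit(fused[tr], labels[tr])
        keep = np.flatnonzero(lasso.coef_)  # indices of features with non-zero weights
        clf = LogisticRegression(max_iter=1_000).fit(fused[tr][:, keep], labels[tr])
        accs.append(accuracy_score(labels[te], clf.predict(fused[te][:, keep])))
    return float(np.mean(accs)), float(np.std(accs))
```

In this sketch the three backbones stay frozen, so the fused features can be computed once before cross-validation without leaking fold-specific information, while the LASSO selection and the classifier are refit inside each fold.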