Advanced Multi-architecture Deep Learning Framework for BIRADS-Based Mammographic Image Retrieval: Comprehensive Performance Analysis with Super-Ensemble Optimization.
Authors
Affiliations (4)
- Department of Computer Science, Universiti Sains Malaysia, Penang, Malaysia. [email protected].
- Department of Biomedicine, School of Dental Sciences, Universiti Sains Malaysia, Kelantan, Malaysia.
- Baruch College, The City University of New York, New York, USA.
- Faculty of Mechatronics Engineering, International Islamic University Malaysia, Kuala Lumpur, Malaysia.
Abstract
Content-based mammographic image retrieval requires exact BIRADS categorical matching across five classes, a task far more complex than conventional binary classification. Existing studies are limited by small sample sizes, improper patient-level separation, and inadequate statistical validation, restricting clinical translation. We developed a comprehensive evaluation framework that systematically compares CNN architectures (DenseNet121, ResNet50, VGG16) under advanced training strategies: fine-tuning, metric learning, and super-ensemble optimization. Rigorous patient-stratified splits (1003 patients, two images each), 602 test queries, and bootstrap confidence intervals (1000 resamples) ensured reliable assessment. Advanced fine-tuning with test-time augmentation (TTA) yielded a precision@10 of 34.71% for DenseNet121_AdvancedFT_TTA, a 25.74% relative improvement over the ResNet50 baseline (27.6%). Selective super-ensemble and metric-learning approaches were further benchmarked under patient-exclusive splits, confirming robust performance across architectures. Statistical analysis (bootstrap CIs, n = 1000; t-tests, p < 0.001; Cohen's d > 0.8) confirmed that the gains were significant and reproducible. These results establish DenseNet121_AdvancedFT_TTA as the new state of the art for five-class BIRADS retrieval while reducing computational cost.