Application of Tuning-Ensemble N-Best in Auto-Sklearn for Mammographic Radiomic Analysis for Breast Cancer Prediction.
Authors
Affiliations (4)
- School of Biology, Faculty of Applied Sciences, Universiti Teknologi MARA, Cawangan Negeri Sembilan, Kampus Kuala Pilah, 72000, Malaysia.
- Department of Physics, Faculty of Science, Universiti Putra Malaysia, 43400, UPM Serdang, Selangor, Malaysia.
- Institute for Mathematical Research (INSPEM), Universiti Putra Malaysia, 43400, UPM Serdang, Selangor, Malaysia.
- Dubai Health, Radiology Department, Rashid Hospital, Dubai 00971, United Arab Emirates.
Abstract
Breast cancer is a major cause of mortality among women globally. While mammography remains the gold standard for detection, its interpretation is often limited by radiologist variability and the difficulty of differentiating benign from malignant lesions. This study explores the use of Auto-Sklearn, an automated machine learning (AutoML) framework, for breast tumor classification based on mammographic radiomic features. A total of 244 mammographic images were enhanced using Contrast Limited Adaptive Histogram Equalization (CLAHE) and segmented with the Active Contour Method (ACM). Thirty-seven radiomic features, including first-order statistics, Gray-Level Co-occurrence Matrix (GLCM) texture features, and shape features, were extracted and standardized. Auto-Sklearn was employed to automate model selection, hyperparameter tuning, and ensemble construction. The dataset was divided into an 80% training set and a 20% testing set. The initial Auto-Sklearn model achieved an accuracy of 88.71% on the training set and 55.10% on the testing set. After the resampling strategy was applied, accuracy on the training and testing sets increased to 95.26% and 76.16%, respectively. The area under the receiver operating characteristic curve (ROC-AUC) for the standard and resampling strategies of Auto-Sklearn was 0.660 and 0.840, respectively, outperforming conventional models and demonstrating the framework's efficiency in automating radiomic classification tasks. The findings underscore Auto-Sklearn's ability to automate and enhance tumor classification performance using handcrafted radiomic features. Limitations include the dataset size and the absence of clinical metadata. This study highlights the application of Auto-Sklearn as a scalable, automated, and clinically relevant tool for breast cancer classification using mammographic radiomics.
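The abstract names GLCM texture features but does not state the pixel offsets, the number of gray levels, or which descriptors were used. The following is a minimal pure-Python sketch of how a symmetric, normalized co-occurrence matrix and three common descriptors (contrast, energy, homogeneity) can be computed. The function names `glcm` and `glcm_features`, the horizontal offset, the four gray levels, and the toy image are illustrative assumptions, not the authors' implementation.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    The offset (dx, dy) and `levels` are assumptions for illustration.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Toy 4x4 region with four gray levels (hypothetical data, not from the study).
toy_region = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
toy_features = glcm_features(glcm(toy_region, dx=1, dy=0, levels=4))
```

In a radiomic workflow, descriptors like these would typically be computed inside the segmented lesion region for several offsets and then standardized together with the first-order and shape features before classification.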