Mutee AF, Al-Hussainy AF, Ibrahim NA, Roopashree R, Chanania K, Karavadi B, Sharma V, Sinha A, Khamidov O, Sameer HN, Salih RM, Adil M, Farhood B
This study aimed to develop and evaluate a comprehensive AI-driven pipeline for automated segmentation and multi-class classification of ovarian tumors in ultrasound images using transformer-based models and radiomic analysis. This multi-center, retrospective study included ultrasound data from 1,907 patients at 4 clinical centers; 1,364 patients were used for model training and validation and 543 for external testing. Five segmentation models were used to delineate tumor regions: SegFormer, Swin-UNet, DPT, nnU-Net, and HRFormer. Segmentation accuracy was measured with the Dice Similarity Coefficient (DSC), Hausdorff Distance (HD), and Relative Absolute Volume Difference (RAVD). Radiomic features (n = 215) were extracted with the SERA platform in accordance with IBSI guidelines. Mutual Information (MI), LASSO, and Recursive Feature Elimination (RFE) were used for feature selection. Five classifiers were compared: TabTransformer, TabNet, MLP, XGBoost, and KAN. Three clinically significant classification tasks were addressed: tumor type, histological subtype, and FIGO stage. The models were assessed by five-fold cross-validation and by external testing, and SHAP analysis was used for interpretability. SegFormer achieved the highest segmentation accuracy (mean DSC > 91%), exceeding values reported for U-Net-based models in earlier research. For classification, TabTransformer with MI-selected features had the highest test accuracy in all tasks: tumor type (92.7%), histological subtype (92.1%), and FIGO stage (92.0%). External test accuracy exceeded 90% in all tasks. SHAP analysis identified the radiomic features that contributed most to the classification decisions, which supports clinical transparency. Performance was consistent across centers and imaging conditions.
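As an illustration of the three segmentation metrics named above, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes commonly used definitions: DSC as twice the overlap divided by the summed mask sizes, RAVD as the absolute size difference relative to the reference, and HD as the symmetric maximum of nearest-neighbor distances between boundary point sets. The function names and toy inputs are hypothetical, and for 2D ultrasound masks the "volume" in RAVD reduces to a pixel count.

```python
from math import hypot


def dice(pred, ref):
    """Dice Similarity Coefficient for two binary masks given as flat 0/1 sequences."""
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def ravd(pred, ref):
    """Relative Absolute Volume Difference: |pred size - ref size| / ref size."""
    ref_size = sum(1 for r in ref if r)
    if ref_size == 0:
        raise ValueError("reference mask is empty; RAVD is undefined")
    pred_size = sum(1 for p in pred if p)
    return abs(pred_size - ref_size) / ref_size


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty lists of (x, y) points."""
    def directed(src, dst):
        # Largest distance from any source point to its nearest destination point.
        return max(min(hypot(ax - bx, ay - by) for bx, by in dst) for ax, ay in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))


# Toy example (hypothetical data): a predicted mask shifted by one pixel.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
print(dice(predicted, reference))   # overlap 2, sizes 3 + 3
print(ravd(predicted, reference))   # equal sizes give 0.0
print(hausdorff([(0, 0), (1, 0)], [(0, 0), (4, 0)]))
```

In this toy case the two masks have equal size, so RAVD is zero even though the overlap is imperfect, which is one reason overlap, distance, and size metrics are typically reported together.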
The proposed pipeline is accurate, generalizable, and interpretable for automated ovarian cancer diagnosis from ultrasound images. It provides a robust framework for clinical implementation and for future extension to multimodal data to improve decision support.