
Improving radiomics-based isocitrate dehydrogenase 1 prediction in glioma patients using semi-supervised machine learning models.

December 9, 2025 (PubMed)

Authors

Ahmadzadeh AM, Jafarnezhad A, Elyassirad D, Vatanparast M, Gheiji B, Faghani S

Affiliations (5)

  • Department of Radiology, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
  • Shiraz University of Medical Sciences, Shiraz, Iran.
  • Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
  • Radiology Informatics Lab, Department of Radiology, Mayo Clinic, 200 First St. SW, Rochester, 55905, USA.
  • Department of Radiology, University of Pennsylvania, Philadelphia, PA, USA.

Abstract

Determining isocitrate dehydrogenase (IDH) mutation status in glioma is important for prognosis. We aimed to compare supervised and semi-supervised machine learning (ML) models for predicting glioma IDH1 mutation status from magnetic resonance imaging (MRI)-derived radiomics features. Images and segmentation masks from several public collections, including ACRIN-FMISO, CPTAC-GBM, IvyGAP, TCGA-GBM, TCGA-LGG, UCSF-PDGM, UPENN-GBM, and REMBRANDT, were retrieved from The Cancer Imaging Archive (TCIA) portal. These data were divided into training cohort 1, an unlabeled cohort, a holdout internal validation (HOIV) cohort, and an external validation (EV) cohort. After image preprocessing, radiomics features were extracted from T1-weighted, T1 contrast-enhanced (T1CE), T2-weighted, and fluid-attenuated inversion recovery (FLAIR) sequences. The least absolute shrinkage and selection operator (Lasso) algorithm was used for feature selection. Supervised and semi-supervised models were then constructed using 10 ML algorithms and various sequence combinations. Supervised models were developed on training cohort 1. For semi-supervised models, we first predicted the labels of the unlabeled cohort using training cohort 1 (pseudolabeling), then concatenated training cohort 1 with these pseudolabeled data to create training cohort 2, and finally developed models on training cohort 2. Both supervised and semi-supervised models were validated on the HOIV and EV cohorts. Training cohort 1, the unlabeled cohort, the HOIV cohort, and the EV cohort included 436, 151, 110, and 535 patients, respectively. A semi-supervised model using 24 features from T1CE images yielded the highest AUC on EV (0.951), which was significantly higher than that of the best supervised model (AUC = 0.917, p = 0.005). The best supervised model was constructed using 30 features from FLAIR and T1CE sequences. Furthermore, across all sequence combinations, the semi-supervised models consistently achieved higher AUCs in the EV cohort. Semi-supervised approaches may improve the performance of radiomics-based ML models in predicting glioma IDH1 status. By using pseudolabels, these models can increase the size of the training data, potentially enhancing predictive performance. Additionally, these models may improve prediction efficiency by requiring fewer image sequences.
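
The pseudolabeling workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier is a toy stand-in for the 10 ML algorithms they evaluated, the two-feature synthetic cohorts replace the Lasso-selected radiomics features, and all names (`fit_centroids`, `predict`, `pseudolabel_and_retrain`, `make_cohort`) and cohort sizes are hypothetical.

```python
import random
from statistics import mean


def fit_centroids(X, y):
    """Toy classifier (assumption): store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, X):
    """Assign each sample to the class whose centroid is nearest."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in X]


def pseudolabel_and_retrain(X_train1, y_train1, X_unlabeled):
    # Step 1: fit an initial model on the labeled data (training cohort 1).
    model1 = fit_centroids(X_train1, y_train1)
    # Step 2: predict labels for the unlabeled cohort (pseudolabeling).
    pseudo = predict(model1, X_unlabeled)
    # Step 3: concatenate labeled and pseudolabeled data (training cohort 2).
    X_train2 = X_train1 + X_unlabeled
    y_train2 = y_train1 + pseudo
    # Step 4: develop the final model on training cohort 2.
    model2 = fit_centroids(X_train2, y_train2)
    return model2, pseudo


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible


def make_cohort(n):
    """Synthetic cohort (assumption): class 1 features are shifted away from class 0."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 0 = wild-type, 1 = mutant (illustrative only)
        center = 4.0 * label
        X.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
        y.append(label)
    return X, y


X_train1, y_train1 = make_cohort(40)
X_unlabeled, _ = make_cohort(20)       # true labels are discarded on purpose
X_holdout, y_holdout = make_cohort(30)

model2, pseudo = pseudolabel_and_retrain(X_train1, y_train1, X_unlabeled)
holdout_pred = predict(model2, X_holdout)
accuracy = mean(int(p == t) for p, t in zip(holdout_pred, y_holdout))
print(f"holdout accuracy: {accuracy:.3f}")
```

In this sketch every pseudolabel is kept regardless of how confident the initial model is; the abstract does not state whether the study filtered pseudolabels, so no thresholding is shown.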

Topics

Journal Article
