Medical image pretraining-based transfer learning for generalizable and robust diagnosis of bone tumors on radiographs: a multi-center study.
Authors
Affiliations (9)
- Department of Interventional Radiology, The First Affiliated Hospital of Soochow University, Suzhou, China.
- Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Clinical Neuroscience Center, Affiliated Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Department of Radiology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China. [email protected].
- Clinical Neuroscience Center, Affiliated Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China. [email protected].
- Department of Radiology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China. [email protected].
- Faculty of Medical Imaging Technology, College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China. [email protected].
- Department of Interventional Radiology, The First Affiliated Hospital of Soochow University, Suzhou, China. [email protected].
Abstract
To develop a generalizable and robust deep learning model for bone tumor classification on radiographs by leveraging domain-specific medical image pretraining.

This retrospective multi-center study included 2338 patients with histopathologically confirmed bone tumors from four centers. Four hundred seventy-one patients from one center were used for model development, and 1867 patients from the other three centers were reserved for external testing. Deep learning models (ResNet50 and InceptionV3) were developed using transfer learning with weights pretrained on either RadImageNet (medical images) or ImageNet (natural images). A radiomics model based on ElasticNet was also built. Model performance was evaluated using the area under the curve (AUC), and the paired DeLong test was used to assess the statistical significance of AUC differences. Robustness was assessed through tumor bounding box perturbation experiments. Gradient-weighted class activation mapping (Grad-CAM) was performed to localize the key areas highlighted by the model and enhance interpretability.

ResNet50 pretrained on RadImageNet demonstrated improved performance on the external test sets (AUC = 0.738, 95% CI: 0.714-0.762), outperforming the ImageNet-pretrained models (ResNet50: AUC = 0.669, 95% CI: 0.639-0.699, p < 0.001; InceptionV3: AUC = 0.677, 95% CI: 0.647-0.708, p < 0.001) and the radiomics model (AUC = 0.518, 95% CI: 0.487-0.548, p < 0.001). RadImageNet-pretrained models showed higher stability under tumor bounding box perturbation (p < 0.001) and appropriately focused on diagnostically relevant regions in correctly classified cases.

The deep learning model pretrained on domain-specific medical images demonstrated improved performance and robustness compared to the radiomics and natural image-pretrained models for bone tumor classification on radiographs.
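The paired DeLong test used above compares the AUCs of two models scored on the same test set while accounting for their correlation. A minimal NumPy sketch of DeLong's method is shown below; this is an illustration of the statistical technique, not the authors' code:

```python
import numpy as np
from math import erf, sqrt

def delong_test(y_true, scores_a, scores_b):
    """Paired DeLong test for two correlated ROC AUCs (minimal sketch)."""
    y_true = np.asarray(y_true)
    pos = [np.asarray(s)[y_true == 1] for s in (scores_a, scores_b)]
    neg = [np.asarray(s)[y_true == 0] for s in (scores_a, scores_b)]
    m, n = len(pos[0]), len(neg[0])
    aucs, v10, v01 = [], [], []
    for p, q in zip(pos, neg):
        # psi[i, j] = 1 if positive outscores negative, 0.5 on ties, else 0
        psi = (p[:, None] > q[None, :]).astype(float) \
              + 0.5 * (p[:, None] == q[None, :])
        v10.append(psi.mean(axis=1))  # placement values of positives
        v01.append(psi.mean(axis=0))  # placement values of negatives
        aucs.append(psi.mean())       # Mann-Whitney estimate of the AUC
    s10 = np.cov(np.vstack(v10))      # 2x2 covariance across the two models
    s01 = np.cov(np.vstack(v01))
    var = s10 / m + s01 / n           # covariance matrix of the AUC estimates
    se = sqrt(var[0, 0] + var[1, 1] - 2 * var[0, 1])
    z = (aucs[0] - aucs[1]) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided
    return aucs, p_value
```

The function returns both AUC estimates and a two-sided p-value for the difference.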
Relevance statement
- Domain-specific medical image pretraining enhanced deep learning model performance and robustness over radiomics and natural image approaches in bone tumor classification on radiographs.
Key Points
- Domain-specific medical image pretraining (RadImageNet) significantly outperforms natural image pretraining (ImageNet) for bone tumor classification on radiographs.
- Deep learning models pretrained on medical images demonstrate superior performance compared to radiomics approaches for bone tumor classification.
- AI assistance effectiveness varies among radiologists, with performance improvements depending on individual experience and receptiveness to AI support.
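The robustness experiments above perturb the tumor bounding box before cropping the model input. A small sketch of such a perturbation is given below; the jitter parameterization (a uniform shift of each edge by up to a fraction of the box size) is an assumption for illustration, not the study's actual protocol:

```python
import numpy as np

def perturb_bbox(bbox, img_shape, max_shift=0.1, rng=None):
    """Jitter a (x0, y0, x1, y1) tumor bounding box inside an image.

    max_shift is the maximum displacement of each edge as a fraction of
    the box width/height (assumed parameterization for illustration).
    """
    if rng is None:
        rng = np.random.default_rng()
    x0, y0, x1, y1 = bbox
    w, h = x1 - x0, y1 - y0
    dx = rng.uniform(-max_shift, max_shift, size=2) * w
    dy = rng.uniform(-max_shift, max_shift, size=2) * h
    # Clip so the perturbed box stays inside the image and non-degenerate
    nx0 = int(np.clip(x0 + dx[0], 0, img_shape[1] - 1))
    ny0 = int(np.clip(y0 + dy[0], 0, img_shape[0] - 1))
    nx1 = int(np.clip(x1 + dx[1], nx0 + 1, img_shape[1]))
    ny1 = int(np.clip(y1 + dy[1], ny0 + 1, img_shape[0]))
    return nx0, ny0, nx1, ny1
```

Evaluating a trained model on crops from many such jittered boxes, and measuring how much its predictions change, gives a simple stability score of the kind reported in the perturbation experiments.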