Prediction of Neoadjuvant Chemotherapy Efficacy for Locally Advanced Nasopharyngeal Carcinoma Using MRI-Based Deep Learning Features Combined with Vision Transformer.

March 2, 2026 (PubMed)

Authors

Yang Y, Mu X, Liu L, Jin G

Affiliations (3)

  • Department of Radiology, The Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China (Y.Y., L.L., G.J.).
  • Department of Nuclear Medicine, The First Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, China (X.M.).
  • Department of Radiology, The Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China (Y.Y., L.L., G.J.). Electronic address: [email protected].

Abstract

Predicting neoadjuvant chemotherapy (NACT) efficacy is vital for the management of locally advanced nasopharyngeal carcinoma (LA-NPC). Existing models have limited generalizability, and the combination of MRI-based deep learning features (DLF) with a Vision Transformer (ViT) for this purpose remains unexplored. This study therefore evaluates the value of multi-sequence MRI-based DLF combined with ViT for predicting NACT efficacy in LA-NPC. We retrospectively enrolled 266 LA-NPC patients receiving standard NACT, categorized them by RECIST 1.1 into complete response (CR) and non-CR groups, and split them into training and testing sets (3:1). Traditional radiomics and 2D/2.5D/3D deep learning models were built and compared. The optimal architecture was selected as the feature extractor, and the extracted features were reduced via principal component analysis (PCA) and input into ViT. Among traditional radiomics models, the XGBoost model on T2-FS sequences performed best, with a validation AUC of 0.760. For deep learning models, performance improved with model complexity: 2D models were least effective (AUC: 0.502-0.653), followed by 2.5D (best AUC: 0.713), while 3D models were optimal (best AUC: 0.755). Finally, we integrated the deep learning features extracted from the two optimal single models (2.5D-ResNet50 and 3D-DenseNet121) and input them into the ViT architecture for global context modeling. This fused model achieved superior performance, with a validation AUC of 0.926, accuracy of 0.903, and F1-score of 0.927, significantly outperforming all previous models (all P < 0.05). The integrated model combining multi-sequence MRI DLF with ViT significantly enhances predictive performance for NACT efficacy in LA-NPC.
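
The abstract compares models by AUC, accuracy, and F1-score. As a minimal sketch of how these three metrics are computed for a binary CR versus non-CR classifier, the Python below implements them from scratch. The function names, the 0.5 decision threshold, the choice of CR as the positive class (label 1), and the toy labels and scores are all illustrative assumptions; none of them come from the paper, and the study's own evaluation code may differ.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1 after thresholding scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Toy data for illustration only: 1 = CR, 0 = non-CR (assumed encoding).
    toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
    auc = roc_auc(toy_labels, toy_scores)
    acc, f1 = accuracy_and_f1(toy_labels, toy_scores)
    print(f"AUC={auc:.3f} accuracy={acc:.3f} F1={f1:.3f}")
```

AUC is threshold-free, since it depends only on how the scores rank patients, whereas accuracy and F1 depend on the chosen cut-off. A model can therefore score well on one metric and less well on the others, which is one reason to report all three together.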

Topics

Journal Article
