Prediction of Neoadjuvant Chemotherapy Efficacy for Locally Advanced Nasopharyngeal Carcinoma Using MRI-Based Deep Learning Features Combined with Vision Transformer.

March 2, 2026

DOI: 10.1016/j.acra.2026.02.003 PMID: 41775615

Authors

Yang Y, Mu X, Liu L, Jin G

Affiliations (3)

Department of Radiology, The Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China (Y.Y., L.L., G.J.).
Department of Nuclear Medicine, The First Affiliated Hospital of Guilin Medical University, Guilin, Guangxi, China (X.M.).
Department of Radiology, The Guangxi Medical University Cancer Hospital, Nanning, Guangxi, China (Y.Y., L.L., G.J.). Electronic address: [email protected].

Abstract

Predicting neoadjuvant chemotherapy (NACT) efficacy is vital for the management of locally advanced nasopharyngeal carcinoma (LA-NPC), but existing models have limited generalizability. The combination of MRI-based deep learning features (DLF) with a Vision Transformer (ViT) for this purpose remains unexplored. This study therefore evaluates the value of multi-sequence MRI-based DLF combined with ViT for predicting NACT efficacy in LA-NPC. We retrospectively enrolled 266 LA-NPC patients receiving standard NACT, categorized them by RECIST 1.1 into complete response (CR) and non-CR groups, and split them into training and testing sets (3:1). Traditional radiomics models and 2D, 2.5D, and 3D deep learning models were built and compared. The optimal architecture was then selected as the feature extractor, and its features were reduced via PCA and input into ViT. Among traditional radiomics models, the XGBoost model on T2-FS sequences performed best, with a validation AUC of 0.760. For deep learning models, performance improved with model complexity: 2D models were least effective (AUC: 0.502-0.653), followed by 2.5D (best AUC: 0.713), while 3D models were optimal (best AUC: 0.755). Finally, we integrated the deep learning features extracted from the two optimal single models (2.5D-ResNet50 and 3D-DenseNet121) and input them into the ViT architecture for global context modeling. This fused model achieved a validation AUC of 0.926, accuracy of 0.903, and F1-score of 0.927, significantly outperforming all previous models (all P < 0.05). The integrated model combining multi-sequence MRI DLF with ViT significantly enhances predictive performance for NACT efficacy in LA-NPC.
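The abstract reports AUC, accuracy, and F1-score for a binary CR versus non-CR prediction. As a reading aid, the following is a minimal sketch of how such metrics are commonly computed from predicted probabilities. It is not the authors' code: the function names, the toy labels and scores, the 0.5 decision threshold, and the choice of CR as the positive class (label 1) are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1 after thresholding the scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    accuracy = (tp + tn) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy data: 1 = CR (assumed positive class), 0 = non-CR.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("accuracy, F1:", accuracy_and_f1(toy_labels, toy_scores))
```

AUC does not depend on a decision threshold, whereas accuracy and F1 do, so the three reported values need not rank models in the same order.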

Topics

Journal Article
