Development and evaluation of a multimodal feature-based predictive model for radiotherapy-induced oral mucositis in nasopharyngeal carcinoma.
Authors
Affiliations (3)
- Department of Radiotherapy, Affiliated Hospital of Guangdong Medical University, Zhanjiang, Guangdong, China.
- School of Medical Imaging, Laboratory and Rehabilitation, Xiangnan University, Chenzhou, Hunan, China.
- Department of Head and Neck Oncology, Affiliated Hospital of Guangdong Medical University, Zhanjiang, Guangdong, China.
Abstract
Accurate prediction of radiation-induced oral mucositis (RIOM) is crucial for personalizing treatment in head and neck cancer. However, building robust predictive models from high-dimensional multimodal data (CT imaging, dose distribution, and clinical features) remains challenging, particularly in small cohorts. This study evaluated and compared the multi-class predictive performance of traditional machine learning algorithms and deep learning architectures in a small-cohort setting. Multimodal data from 108 patients were collected. The evaluation framework comprised nine traditional machine learning algorithms and two deep learning models (a 1D-CNN operating on dimensionality-reduced features and a multimodal 3D-CNN). Models were assessed with stratified 5-fold cross-validation, and performance was reported as mean ± standard deviation (SD) for multiple metrics, including the Area Under the Curve (AUC), accuracy, and Matthews Correlation Coefficient (MCC). Inter-rater reliability for RIOM grading was excellent (Cohen's kappa = 0.82, 95% CI: 0.73-0.91). Among the traditional machine learning approaches, the Extra Trees (ET) algorithm achieved the highest discriminative capacity (AUC: 0.956 ± 0.046), while Logistic Regression (LR) achieved the highest overall accuracy (0.832 ± 0.155) and the greatest stability. Among the deep learning models, the lightweight 1D-CNN using fused low-dimensional features performed competitively and robustly (AUC: 0.900 ± 0.072; accuracy: 0.732 ± 0.140). In stark contrast, the high-dimensional multimodal 3D-CNN suffered from severe overfitting and mode collapse, yielding markedly inferior results (AUC: 0.568 ± 0.090; MCC: -0.025 ± 0.031). For small-cohort radiomic and dosimetric analyses, ensemble learning models (e.g., ET) and appropriately regularized linear models (e.g., LR) remain highly effective. While deep learning holds promise, high-dimensional architectures such as 3D-CNNs are highly susceptible to mode collapse unless trained on much larger datasets. Combining feature dimensionality reduction with a lightweight network (1D-CNN) is a far more effective strategy for extracting reliable predictive patterns from limited clinical data.
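The evaluation protocol described in the abstract (stratified folds, with each metric summarized as mean ± SD) and the reason an MCC near zero signals a collapsed classifier can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration rather than the study's implementation: the toy labels, the round-robin fold assignment without shuffling, the use of the sample SD over per-fold scores, and the function names `multiclass_mcc`, `stratified_folds`, and `summarize`. The MCC is computed with the standard multi-class generalization and is set to 0 when the denominator vanishes, which is a common convention.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean, stdev


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews Correlation Coefficient."""
    s = len(y_true)                                  # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))  # correctly classified
    t_k = Counter(y_true)                            # true count per class
    p_k = Counter(y_pred)                            # predicted count per class
    classes = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in classes)
    denom_sq = (s * s - sum(v * v for v in p_k.values())) * (
        s * s - sum(v * v for v in t_k.values())
    )
    # Assumed convention: report 0 when the coefficient is undefined,
    # e.g. when every prediction falls into a single class.
    return numerator / sqrt(denom_sq) if denom_sq > 0 else 0.0


def stratified_folds(labels, n_folds=5):
    """Assign sample indices to folds round-robin within each class,
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


def summarize(scores):
    """Format per-fold scores as mean ± sample SD."""
    return f"{mean(scores):.3f} ± {stdev(scores):.3f}"


# Toy labels (hypothetical, not study data): three imbalanced grades.
labels = [0] * 15 + [1] * 10 + [2] * 5

fold_accuracy, fold_mcc = [], []
for test_idx in stratified_folds(labels, n_folds=5):
    y_true = [labels[i] for i in test_idx]
    y_pred = [0] * len(y_true)  # collapsed model: always the majority grade
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    fold_accuracy.append(correct / len(y_true))
    fold_mcc.append(multiclass_mcc(y_true, y_pred))

print("accuracy:", summarize(fold_accuracy))
print("MCC:     ", summarize(fold_mcc))
```

In this toy setup the collapsed predictor still reaches 50% accuracy in every fold because the majority grade makes up half of each test fold, while its MCC is 0. This is why a chance-level MCC, rather than accuracy alone, is a useful indicator of the single-class behavior reported for the 3D-CNN.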