A stacking ensemble framework integrating radiomics and deep learning for prognostic prediction in head and neck cancer.
Authors
Affiliations (11)
- Department of Electrical and Electronic Engineering, Faculty of Engineering, Universiti Putra Malaysia, Serdang, Malaysia.
- Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China.
- Department of Nursing, Faculty of Medicine and Health Sciences, Universiti Putra Malaysia, Serdang, Malaysia.
- Department of Nursing, Chengde Central Hospital, Chengde City, Hebei Province, China.
- Department of Medical Engineering, Tianjin Armed Police Corps Hospital, Tianjin Municipality, China.
- Department of Radiology, the Affiliated Hospital of Chengde Medical University, Chengde City, Hebei Province, China.
- Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing, China.
- Department of Electrical and Electronic Engineering, Faculty of Engineering, Universiti Putra Malaysia, Serdang, Malaysia. [email protected].
- Department of Biomedical Engineering, Chengde Medical University, Chengde City, Hebei Province, China. [email protected].
- Hebei Key Laboratory of Nerve Injury and Repair, Chengde Medical University, Chengde City, Hebei, China. [email protected].
- Hebei International Research Center of Medical Engineering, Chengde Medical University, Hebei, China. [email protected].
Abstract
Radiomics models frequently face challenges related to reproducibility and robustness. To address these issues, we propose a multimodal, multi-model fusion framework utilizing stacking ensemble learning for prognostic prediction in head and neck cancer (HNC). This approach seeks to improve the accuracy and reliability of survival predictions. A total of 806 cases from nine centers were collected; 143 cases from two centers were assigned as the external validation cohort, while the remaining 663 were stratified and randomly split into training (n = 530) and internal validation (n = 133) sets. Radiomics features were extracted according to IBSI standards, and deep learning features were obtained using a 3D DenseNet-121 model. The selected features were then input into Cox, SVM, random survival forest (RSF), DeepCox, and DeepSurv models, and a stacking fusion strategy was employed to develop the prognostic model. Model performance was evaluated using Kaplan-Meier survival curves and time-dependent ROC curves. On the external validation set, the model using combined PET and CT radiomics features achieved superior performance compared to single-modality models, with the RSF model obtaining the highest concordance index (C-index) of 0.7302. When using deep features extracted by 3D DenseNet-121, the PET + CT-based models demonstrated significantly improved prognostic accuracy, with DeepSurv and DeepCox achieving C-indices of 0.9217 and 0.9208, respectively. Among stacking models, the PET + CT model using only radiomics features reached a C-index of 0.7324, while the deep feature-based stacking model achieved 0.9319. The best performance was obtained by the multi-feature fusion model, which integrated both radiomics and deep learning features from PET and CT, yielding a C-index of 0.9345. Kaplan-Meier survival analysis further confirmed the fusion model's ability to distinguish between high-risk and low-risk groups.
The stacking-based ensemble model demonstrates superior performance compared to individual machine learning models, markedly improving the robustness of prognostic predictions.
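To make the stacking idea concrete, the sketch below shows, under simplifying assumptions, how risk scores from two base survival models can be fused by a meta-learner and scored with the concordance index the abstract reports. This is a minimal illustration, not the authors' pipeline: the "base models" are stand-ins (noisy copies of a synthetic true risk rather than trained RSF/Cox/DeepSurv models), and the meta-learner is a simple convex-weight search rather than the paper's actual fusion strategy.

```python
import numpy as np

def concordance_index(times, events, risk):
    """Fraction of comparable pairs whose ordering the risk score gets right.

    A pair (i, j) is comparable when subject i had an event and a shorter
    observed time than subject j; ties in risk count as half-concordant.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Synthetic survival data: higher true risk -> shorter expected time.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=(n, 3))
true_risk = x @ np.array([1.0, 0.5, -0.8])
times = rng.exponential(np.exp(-true_risk))
events = rng.random(n) < 0.8  # ~80% of subjects experience the event

# Hypothetical base-model outputs (stand-ins for e.g. RSF and Cox risk scores).
base1 = true_risk + rng.normal(scale=1.0, size=n)
base2 = true_risk + rng.normal(scale=1.0, size=n)

# Stacking: a meta-learner combines base risk scores. Here the meta-learner
# is a grid search over convex weights, fit on one half and tested on the other.
train = np.arange(n) < n // 2
test = ~train
best_w, best_c = 0.5, -np.inf
for w in np.linspace(0.0, 1.0, 21):
    c = concordance_index(times[train], events[train],
                          w * base1[train] + (1 - w) * base2[train])
    if c > best_c:
        best_w, best_c = w, c

stacked = best_w * base1[test] + (1 - best_w) * base2[test]
c_test = concordance_index(times[test], events[test], stacked)
print(f"held-out C-index of stacked model: {c_test:.3f}")
```

In practice the meta-learner would be fit on out-of-fold base-model predictions to avoid leakage, and the base learners would be the trained Cox, SVM, RSF, DeepCox, and DeepSurv models described above; the fusion principle, combining complementary risk rankings so the ensemble orders patients better than any single model, is the same.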