Habitat heterogeneity analysis of planning CT improves the prediction of radiation proctitis in cervical cancer: A multimodal machine learning study.
Authors
Affiliations (8)
- Department of Radiology, Jilin Cancer Hospital, Changchun, Jilin, China.
- Department of Radiation Oncology, Jilin Cancer Hospital, Changchun, Jilin, China.
- Department of Integrated Cardio-Oncology, Jilin Cancer Hospital, Changchun, Jilin, China.
- Department of Regenerative Medicine, School of Pharmaceutical Science, Jilin University, Changchun, Jilin, China.
- Department of Radiation Biology, School of Public Health, Jilin University, Changchun, Jilin, China.
- Jilin Provincial Institute of Cancer Prevention and Treatment, Jilin Cancer Hospital, Changchun, Jilin, China.
- Changchun University of Chinese Medicine, Changchun, Jilin, China.
- Department of Radiation Oncology, Jilin Cancer Hospital, Changchun, Jilin, China. Electronic address: [email protected].
Abstract
Radiation proctitis (RP) is a common complication following radiotherapy for cervical cancer, for which accurate pre-treatment prediction remains challenging. This study aimed to develop and validate a multimodal machine learning model that incorporates habitat heterogeneity analysis from planning CT images to predict RP risk at the individual level.

In this retrospective study, 100 patients with cervical cancer treated with radiotherapy were enrolled (53 RP-positive, 47 RP-negative). A comprehensive dataset of 2783 multimodal features was collected, encompassing sociodemographics, clinical parameters, dosimetry, conventional radiomics, and novel habitat heterogeneity features derived from planning CT images. Key predictors were identified using the Least Absolute Shrinkage and Selection Operator (LASSO) regression. Four machine learning models (Logistic Regression, Random Forest, XGBoost, LightGBM) were optimised and compared via 10-fold cross-validation. The best-performing model was validated on an independent test set (n = 30). Model interpretability was analysed using Shapley Additive exPlanations (SHAP), and a clinical nomogram was constructed for individualised risk assessment.

Logistic regression demonstrated the best predictive performance, achieving an area under the curve (AUC) of 0.929 on the training set (n = 70) and 0.869 on the independent test set. SHAP analysis identified Rectum-V50 as the most important predictor of RP, followed by habitat heterogeneity features such as Habitat2_Flatness. The model showed good calibration and provided significant clinical net benefit across a wide range of threshold probabilities (0.01-0.99) in decision curve analysis. The final nomogram, based on the 15 selected features, effectively stratified patients into low-, intermediate-, and high-risk groups. In the test set, all 15 patients (100%) stratified as high-risk developed RP. SHAP analysis confirmed intuitive, non-linear relationships between feature values and predicted risk.

A multimodal machine learning model integrating features of habitat heterogeneity accurately predicts RP risk in patients with cervical cancer undergoing radiotherapy. The developed nomogram provides a clinically practical tool for individualised risk stratification, potentially facilitating early intervention.
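
The decision curve analysis mentioned in the abstract is built on the net benefit statistic. The following is a minimal sketch, assuming the standard definition, net benefit = TP/N - (FP/N) x p_t / (1 - p_t), where p_t is the threshold probability. The function names and the toy data are hypothetical illustrations and are not taken from the study; the authors' actual implementation is not described in the abstract.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of intervening on every patient with predicted risk >= threshold.

    Assumes the standard decision-curve definition:
        TP/N - (FP/N) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equally long")

    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)

    # The odds of the threshold weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - weight * fp / n


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: intervene on everyone regardless of predicted risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Hypothetical toy data (1 = developed RP, 0 = did not); not from the study.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    for pt in (0.25, 0.50, 0.75):
        model_nb = net_benefit(y_true, y_prob, pt)
        all_nb = net_benefit_treat_all(y_true, pt)
        # The "treat none" strategy always has a net benefit of 0.
        print(f"threshold={pt:.2f}  model={model_nb:.3f}  treat_all={all_nb:.3f}")
```

A model is usually considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat none); plotting these three quantities over a grid of thresholds gives the decision curve.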