A Fusion Model of ResNet and Vision Transformer for Efficacy Prediction of HIFU Treatment of Uterine Fibroids.

September 10, 2025

DOI: 10.1016/j.acra.2025.08.054 PMID: 40935773

Authors

Zhou Y, Xu H, Jiang W, Zhang J, Chen S, Yang S, Xiang H, Hu W, Qiao X

Affiliations (6)

Department of Radiology, the Second Affiliated Hospital of Chongqing Medical University, Yuzhong 400010, Chongqing, China (Y.Z., H.X., W.H., X.Q.); Bioengineering College, Chongqing University, Chongqing 400044, China (Y.Z.).
Department of Radiology, the Second Affiliated Hospital of Chongqing Medical University, Yuzhong 400010, Chongqing, China (Y.Z., H.X., W.H., X.Q.).
Department of Chinese Traditional Medicine and Rehabilitation, Chongqing Emergency Medical Center, Yuzhong 400014, Chongqing, China (W.J., H.X.).
Department of Ultrasound, Women and Children's Hospital of Chongqing Medical University, Chongqing 401147, China (J.Z.).
Department of Medical Equipment, People's Hospital of Chongqing Liangjiang New Area, Chongqing 401147, China (S.C.).
Department of Radiology, Women and Children's Hospital of Chongqing Medical University, Chongqing 401147, China (S.Y.).

Abstract

High-intensity focused ultrasound (HIFU) is a non-invasive technique for treating uterine fibroids, and accurate prediction of its therapeutic efficacy depends on precise quantification of intratumoral heterogeneity. Existing methods remain limited in characterizing this heterogeneity, which restricts the accuracy of efficacy prediction. This study therefore proposes a deep learning model with a parallel architecture of ResNet and Vision Transformer (ViT), termed Res-ViT, to test whether jointly characterizing local texture and global spatial features can improve the accuracy of HIFU efficacy prediction. The study enrolled patients with uterine fibroids who underwent HIFU treatment at Center A (training set: N = 272; internal validation set: N = 92) and Center B (external test set: N = 125). Preoperative T2-weighted magnetic resonance images were used to develop the Res-ViT model for predicting an immediate post-treatment non-perfused volume ratio (NPVR) ≥ 80%. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and compared against standalone radiomics, ResNet-18, and ViT models. The Res-ViT model outperformed all standalone models on both the internal validation set (AUC = 0.895, 95% CI: 0.857-0.987) and the external test set (AUC = 0.853, 95% CI: 0.776-0.921). SHAP analysis identified the ResNet branch as the predominant decision-making component (feature contribution: 55.4%). Gradient-weighted Class Activation Mapping (Grad-CAM) visualization showed that the key regions attended by Res-ViT had higher spatial overlap with the postoperative non-ablated fibroid tissue. These results indicate that fusing local and global features is an effective way to quantify uterine fibroid heterogeneity and significantly enhances the accuracy of HIFU efficacy prediction.
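The evaluation described in the abstract (a binary NPVR ≥ 80% endpoint, scored by AUC with a 95% confidence interval) can be illustrated with a minimal sketch. The abstract does not state how the confidence intervals were computed, so the percentile bootstrap used here is an assumption, as is the definition of NPVR as non-perfused volume divided by fibroid volume. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import random


def npvr_label(non_perfused_volume, fibroid_volume, threshold=0.80):
    """Binary endpoint: 1 if NPVR >= threshold, else 0.

    Assumes NPVR = non-perfused volume / total fibroid volume.
    """
    return int(non_perfused_volume / fibroid_volume >= threshold)


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains one class only; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (made up for illustration): 1 = NPVR >= 80%, scores = model outputs.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.7, 0.2, 0.6]

toy_auc = auc(toy_labels, toy_scores)  # 14 of 16 positive/negative pairs are ordered correctly: 0.875
toy_ci = bootstrap_ci(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}, 95% CI: {toy_ci[0]:.3f}-{toy_ci[1]:.3f}")
```

The pairwise formulation is quadratic in the number of cases, which is fine for cohorts of the size reported here; a rank-based implementation would be preferable for much larger datasets.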

Topics

Journal Article
