A Dual-Task Deep-Learning Model with Fused Ultrasound Images for Simultaneous Typing and Grading of Cystocele.
Authors
Affiliations (2)
- Department of Ultrasound Imaging, Xiangya Hospital, Central South University, Changsha, 410008, Hunan Province, China.
- Department of Gynecology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan Province, China. [email protected].
Abstract
We developed a dual-task deep-learning model, termed FD-Net, which uses fused two-dimensional (2D) and three-dimensional (3D) ultrasound images to simultaneously automate cystocele typing and grading, and we evaluated its diagnostic performance. We retrospectively included 625 patients (467 with cystocele, 158 normal). The model took as input fused, preprocessed 2D (resting and Valsalva) and 3D (levator hiatus) images. Built on a ResNet50 backbone, FD-Net performed both the typing task (normal, type I/II/III) and the grading task (normal, mild, significant). Its performance was compared against single-modal models using only 2D images (ST-Net for typing, SG-Net for grading). Evaluation metrics included accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC), with model comparisons made using McNemar tests. On the test set (n = 188), FD-Net achieved higher accuracy than the single-modal 2D models in both typing (79.68% vs. ST-Net's 70.05%, P = 0.023) and grading (81.38% vs. SG-Net's 71.28%, P = 0.006). The F1-score improved notably for normal cases (from 64.94% to 85.44%) and mild cystocele (from 57.94% to 69.47%). For other key categories, FD-Net also attained high F1-scores: 87.50% for significant prolapse and 77.08% for type III. All AUC values exceeded 0.92. The dual-task model with image fusion accomplishes simultaneous cystocele typing and grading, shows higher diagnostic performance than single-modal models, and holds potential for clinical application.