External Validation of a Deep Learning-Based Artificial Intelligence System for Ultrasound Diagnosis of Thyroid Nodules: A Two-Center Retrospective Study.

June 17, 2026


DOI: 10.1002/jcu.70275 PMID: 42304967

Authors

Tang Y, Xu YD, Zhao CK, Fan PL, Jin YJ, Ji ZB, Han H, Xu HX, Xu BH

Affiliations (3)

Department of Ultrasound, Zhongshan Hospital, Fudan University, Shanghai, China.
Institute of Ultrasound in Medicine and Engineering, Fudan University, Shanghai, China.
Shanghai Institute of Medical Imaging, Shanghai, China.

Abstract

This study investigated the performance of an artificial intelligence (AI) diagnostic system for thyroid nodule sonography based on a deep learning convolutional neural network (CNN). We retrospectively included 485 thyroid nodules with definite pathology from two tertiary hospitals. The AI diagnostic system, built on a deep learning CNN and equipped with an image mode and a video mode, was constructed for automatic detection and diagnosis of nodules. One gray-scale ultrasound (US) image of each nodule from the two hospitals was selected for diagnosis in image mode (AI modelimg). A US video of each nodule from the second hospital was analyzed in video mode (AI modelvid). The performance of AI modelimg, AI modelvid, and three radiologists with 3-15 years of US experience was evaluated. Sonographic features likely to influence the accuracy of AI modelimg were identified by binary logistic regression analysis. Although the experienced radiologist achieved the highest sensitivity, accuracy, and area under the receiver operating characteristic curve (AUC) compared with AI modelimg and the two junior radiologists, there was no significant difference between the AUCs of AI modelimg and the experienced radiologist (0.770 [0.718-0.816] vs. 0.799 [0.750-0.843] in the first hospital dataset, p = 0.253; 0.731 [0.660-0.794] vs. 0.780 [0.712-0.838] in the second hospital dataset, p = 0.105). When US videos were used for diagnosis instead of images, AI modelvid achieved significantly higher specificity (0.575 [0.489-0.661] vs. 0.693 [0.613-0.773], p = 0.003), accuracy (0.667 [0.598-0.736] vs. 0.744 [0.681-0.808], p = 0.002), and AUC (0.731 [0.660-0.794] vs. 0.780 [0.713-0.839], p = 0.016), with values given as AI modelimg vs. AI modelvid. Among benign nodules, AI modelimg was more likely to make a correct diagnosis in those with a circumscribed margin (OR = 3.46, p = 0.003), hyperechoic or isoechoic echogenicity (OR = 8.83, p < 0.001), and no echogenic foci or large comet-tail artifacts (OR = 2.28, p = 0.041). Among malignant nodules, AI modelimg achieved higher accuracy in those with hypoechoic or very hypoechoic echogenicity (OR = 3.33, p = 0.034) and an irregular margin (OR = 4.51, p = 0.003). In TR1 and TR2 (ACR TI-RADS risk level) nodules, the accuracy of AI modelimg was 100% (6 of 6) and 90% (27 of 30), respectively. The AI diagnostic system is feasible and reliable for automatic detection and diagnosis of thyroid nodules and performs better when applied to US videos. The sonographic features of thyroid nodules are a crucial factor influencing the accuracy of the AI model.
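
The TR1 and TR2 accuracies above rest on small counts (6 of 6 and 27 of 30), so a confidence interval helps when reading them. The following is a minimal sketch, not the authors' analysis: the abstract does not state which interval method was used, so the Wilson score interval is an assumption here, and the function name `wilson_interval` is hypothetical. Only the two count pairs come from the abstract.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%).

    Assumption: the abstract does not say how its intervals were computed;
    Wilson is chosen here only because it behaves well for small samples.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Counts reported in the abstract for AI modelimg (correct diagnoses, nodules).
reported = {"TR1": (6, 6), "TR2": (27, 30)}

for level, (correct, total) in reported.items():
    low, high = wilson_interval(correct, total)
    print(
        f"{level}: accuracy {correct / total:.0%} ({correct} of {total}), "
        f"95% Wilson interval {low:.2f}-{high:.2f}"
    )
```

With so few nodules in each risk level, the intervals are wide; under this assumed method, a perfect score on six nodules remains compatible with a true accuracy well below 100%.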


Topics

Journal Article
