A Deep Learning-Based Artificial Intelligence Model Assisting Thyroid Nodule Diagnosis and Management: Pilot Results for Evaluating Thyroid Malignancy in Pediatric Cohorts.
Authors
Affiliations (6)
- Department of Radiology, Lucile Packard Children's Hospital, Stanford University, Palo Alto, California, USA.
- Department of Radiology, Ajou University School of Medicine, Suwon, Korea.
- Department of Radiology, Stanford University School of Medicine, Stanford, CA, USA.
- Department of Neurosurgery, Stanford University School of Medicine, Stanford, CA, USA.
- Department of Radiology, Phoenix Children's Hospital, Phoenix, AZ, USA.
- Department of Otolaryngology-Head & Neck Surgery, Children's Thyroid Clinic, Lucile Packard Children's Hospital, Stanford University, Palo Alto, California, USA.
Abstract
<b><i>Purpose:</i></b> Artificial intelligence (AI) models have shown promise in predicting malignant thyroid nodules in adults; however, research on deep learning (DL) for pediatric cases is limited. We evaluated the applicability of a DL-based model for assessing thyroid nodules in children. <b><i>Methods:</i></b> We retrospectively identified two pediatric cohorts (<i>n</i> = 128; mean age 15.5 ± 2.4 years; 103 girls) who underwent thyroid nodule ultrasonography (US) with histological confirmation at two institutions. The AI-Thyroid DL model, originally trained on adult data, was tested on pediatric nodules in three scenarios: axial US images, longitudinal US images, and both. We conducted subgroup analyses by pediatric cohort and by age group (≥14 years vs. <14 years) and compared the model's performance with radiologist interpretations using the Thyroid Imaging Reporting and Data System (TIRADS). <b><i>Results:</i></b> Of 156 nodules analyzed, 47 (30.1%) were malignant. AI-Thyroid demonstrated area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity values of 0.913-0.929, 78.7-89.4%, and 79.8-91.7%, respectively. The AUROC values did not differ significantly across the image planes (all <i>p</i> > 0.05) or between the two pediatric cohorts (<i>p</i> = 0.804). Sensitivity and specificity did not differ significantly between age groups (all <i>p</i> > 0.05), whereas the AUROC values were higher for patients aged <14 years than for those aged ≥14 years (all <i>p</i> < 0.01). AI-Thyroid yielded the highest AUROC values, followed by ACR-TIRADS and K-TIRADS (<i>p</i> = 0.016 and <i>p</i> < 0.001, respectively). <b><i>Conclusion:</i></b> AI-Thyroid demonstrated high performance in diagnosing pediatric thyroid cancer. Future research should focus on optimizing AI-Thyroid for pediatric use and exploring its role alongside tissue sampling in clinical practice.
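The three metrics reported in the Results (AUROC, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration, and the abstract does not state which cutoff AI-Thyroid uses. Only the 47-of-156 malignancy count is taken from the abstract.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a malignant nodule receives a higher
    score than a benign one, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called malignant."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (1 = malignant on histology, 0 = benign); NOT study data.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

area = auroc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)  # assumed cutoff

# Malignancy rate from the abstract: 47 malignant nodules out of 156.
malignancy_rate_pct = round(47 / 156 * 100, 1)

print(area, sens, spec, malignancy_rate_pct)
```

The pairwise formulation is quadratic in the number of nodules, which is acceptable for a cohort of this size. For larger datasets a rank-based implementation from an established statistics library would be the more usual choice.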