Human-AI collaboration for ultrasound diagnosis of thyroid nodules: a clinical trial.
Authors
Affiliations (6)
- Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet University Hospital of Copenhagen, 2100, Copenhagen, Denmark.
- Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet University Hospital of Copenhagen, 2100, Copenhagen, Denmark.
- Institute of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3, 2200, Copenhagen, Denmark.
- Department of Otorhinolaryngology, Head and Neck Surgery, Zealand University Hospital, 4600, Køge, Denmark.
- Copenhagen Academy of Medical Education and Simulation (CAMES), Rigshospitalet University Hospital of Copenhagen, 2100, Copenhagen, Denmark.
- Department of Obstetrics, Juliane Marie Centre, Rigshospitalet University Hospital of Copenhagen, 2100, Copenhagen, Denmark.
Abstract
This clinical trial examined how the artificial intelligence (AI)-based diagnostic system S-Detect for Thyroid influences the diagnostic work-up of thyroid ultrasound (US) performed by different US users in clinical practice, and how different US users influence the diagnostic accuracy of S-Detect. We conducted a clinical trial with 20 participants, including medical students, US novice physicians, and US experienced physicians. Five patients with thyroid nodules (one malignant and four benign) volunteered to undergo a thyroid US scan performed by all 20 participants using the same US systems with S-Detect installed. Participants performed a focused thyroid US on each patient case and classified the nodule according to the European Thyroid Imaging Reporting and Data System (EU-TIRADS). They then performed an S-Detect analysis of the same nodule and were asked to re-evaluate their EU-TIRADS reporting. From the participants' EU-TIRADS assessments, we derived a biopsy recommendation outcome indicating whether fine needle aspiration biopsy (FNAB) was recommended. The mean diagnostic accuracy for S-Detect was 71.3% (range 40-100%) among all participants, with no significant difference between the groups (p = 0.31). The accuracy of our biopsy recommendation outcome was 69.8% before and 69.2% after AI for all participants (p = 0.75). In this trial, we did not find that S-Detect improved the thyroid diagnostic work-up in clinical practice among novice and intermediate ultrasound operators. However, the operator had a substantial impact on the AI-generated ultrasound diagnosis, with diagnostic accuracy varying from 40 to 100%, even though the same patients and ultrasound machines were used throughout the trial.
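As an illustration of how a per-operator diagnostic accuracy and its mean and range could be computed, the following is a minimal Python sketch. It is not the trial's analysis code: the operator names and their outputs are invented for illustration, and only the case mix (one malignant and four benign nodules) is taken from the abstract. Accuracy is assumed here to be the fraction of cases in which the AI output obtained by an operator matches the reference diagnosis.

```python
from statistics import mean

# Reference diagnoses for five cases: one malignant and four benign,
# mirroring the case mix described in the abstract.
REFERENCE = ["malignant", "benign", "benign", "benign", "benign"]


def accuracy(predictions, reference=REFERENCE):
    """Return the fraction of cases where the prediction matches the reference."""
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have the same length")
    correct = sum(p == r for p, r in zip(predictions, reference))
    return correct / len(reference)


# Hypothetical AI outputs obtained by three hypothetical operators scanning
# the same five patients. These values are invented, not trial data.
operator_outputs = {
    "operator_a": ["malignant", "benign", "benign", "benign", "benign"],
    "operator_b": ["malignant", "malignant", "benign", "malignant", "benign"],
    "operator_c": ["benign", "malignant", "malignant", "benign", "benign"],
}

# Accuracy per operator, then the mean and range across operators.
per_operator = {name: accuracy(preds) for name, preds in operator_outputs.items()}
mean_accuracy = mean(per_operator.values())
lowest, highest = min(per_operator.values()), max(per_operator.values())

for name, value in per_operator.items():
    print(f"{name}: {value:.0%}")
print(f"mean {mean_accuracy:.1%}, range {lowest:.0%}-{highest:.0%}")
```

With only five cases per operator, each misclassified nodule shifts that operator's accuracy by 20 percentage points, which is one reason a wide range between operators can arise even when patients and machines are held constant.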