Assessing the Accuracy of Artificial Intelligence in Detecting Intracranial Aneurysms in a Clinical Setting Relative to Neuroradiologists.
Authors
Affiliations (5)
- Departments of Radiology (B.J.A., M.H.D.H., N.F., N.C., C.Z., P.W., M.R.L., M.M.-B.), University of Washington, Seattle, Washington.
- Department of Radiology (M.H.D.H., M.M.-B.), University of Alabama at Birmingham, Birmingham, Alabama.
- Departments of Radiology (B.J.A., M.H.D.H., N.F., N.C., C.Z., P.W., M.R.L., M.M.-B.), University of Washington, Seattle, Washington.
- Department of Biostatistics (E.L.), University of Washington, Seattle, Washington.
- Department of Neurosurgery (M.R.L.), University of Washington, Seattle, Washington.
Abstract
Intracranial aneurysms (IAs), detected in 2%-5% of the population, represent a major health care issue because ruptured aneurysms with resultant hemorrhage are associated with severe morbidity or mortality. With the increasing role of artificial intelligence (AI) in diagnostic radiology, we assessed the accuracy of a commercial AI tool (Aidoc) in detecting intracranial aneurysms on head or head/neck CTA relative to fellowship-trained neuroradiologists. We retrospectively extracted CTA head or head/neck studies from University of Washington Medical Center's clinical database between November 1, 2018 and November 2, 2021; these were analyzed with Aidoc for the presence of aneurysms. Concordant and discrepant findings between AI and the neuroradiology reports were further adjudicated by 3 neuroradiologists for consensus. IA features including size, morphology, and site of origin were extracted for each positive case. Correlation between AI and neuroradiologist performance was assessed, and a vascular neurosurgeon independently reviewed neuroradiology false-negatives to determine IA management based on image and patient-specific features. Comparative analyses were also performed per Aidoc's intended use criteria, ie, "unruptured," "saccular" IAs greater than 5 mm in size. A total of 2534 CTA scans were reviewed for IAs; 252 were positive with 315 IAs (1.25 aneurysms per positive CTA). AI achieved sensitivity, specificity, and accuracy of 70.5% (95% CI, 65.1%-75.5%), 98.6% (95% CI, 98.0%-99.0%), and 95.1% (95% CI, 94.3%-95.9%), respectively, whereas neuroradiologists achieved 94.0% (95% CI, 90.7%-96.3%), 98.3% (95% CI, 97.7%-98.8%), and 97.8% (95% CI, 97.1%-98.3%), respectively.
In the cohort, 35 IAs were within the intended use criteria for Aidoc; for these, AI achieved sensitivity, specificity, and accuracy of 85.7% (95% CI, 69.7%-95.2%), 98.6% (95% CI, 98.0%-99.0%), and 98.4% (95% CI, 97.8%-98.8%), respectively, while neuroradiologists achieved 97.1% (95% CI, 85.1%-99.9%), 98.3% (95% CI, 97.7%-98.8%), and 98.3% (95% CI, 97.7%-98.8%), respectively. Our multisite study showed that neuroradiologists performed better than AI for IA detection in terms of sensitivity and accuracy, while achieving comparable specificity.
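For readers who want to see how a sensitivity estimate and its 95% CI relate to raw counts, the following is a minimal sketch and not the authors' analysis. It rests on two assumptions: the abstract does not name its interval method, so the exact (Clopper-Pearson) binomial interval is used here; and the detection counts (30 of 35 for AI, 34 of 35 for neuroradiologists) are back-calculated from the reported 85.7% and 97.1% in the intended-use subset rather than stated in the abstract. The function names `binom_cdf` and `clopper_pearson` are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo=0.0, hi=1.0, iters=60):
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: the p at which P(X >= k | p) equals alpha / 2.
        lower = _bisect_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p))
        )
    if k == n:
        upper = 1.0
    else:
        # Upper bound: the p at which P(X <= k | p) equals alpha / 2.
        upper = _bisect_decreasing(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Counts are assumptions back-calculated from the reported percentages.
    for label, detected, total in [("AI", 30, 35), ("Neuroradiologists", 34, 35)]:
        low, high = clopper_pearson(detected, total)
        print(f"{label}: sensitivity {detected / total:.1%} "
              f"(95% CI, {low:.1%}-{high:.1%})")
```

Running the script prints one line per reader group, which can be compared with the intervals reported above. With only 35 aneurysms in the intended-use subset, the intervals are wide and overlap, so the sensitivity difference in that subset is estimated less precisely than in the full cohort of 315 IAs.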