Impact of a commercial artificial intelligence decision-support system on biparametric MRI interpretation for detecting clinically significant prostate cancer: a retrospective single-center prostatectomy-validated multi-reader study across experience levels.
Authors
Affiliations (4)
- Department of Radiology, Kawasaki Medical School, 577 Matsushima, Kurashiki City, Okayama, Japan. Electronic address: [email protected].
- Department of Radiology, Kawasaki Medical School, 577 Matsushima, Kurashiki City, Okayama, Japan.
- Department of Radiology, Kawasaki Medical School, 577 Matsushima, Kurashiki City, Okayama, Japan; Department of Radiology, Radiolonet Tokai, Nagoya 460-8501, Japan.
- Philips Japan, 2-13-37 Konan, Minato-ku, Tokyo 108-8507, Japan.
Abstract
To determine whether QP-Prostate, a commercially available artificial intelligence (AI) decision-support system, improves Prostate Imaging Reporting and Data System (PI-RADS)-based detection of clinically significant prostate cancer (csPCa) on prostate MRI by readers with different levels of experience, without loss of specificity.
This single-center retrospective study included 52 men with pathologically confirmed csPCa who underwent preoperative 3.0-T multiparametric magnetic resonance imaging (mpMRI) and radical prostatectomy between 2021 and 2023. To match the AI inputs, four radiologists (1-25 years of experience) performed PI-RADS interpretation using only the biparametric components (T2WI and DWI). Three scenarios were compared: reader-alone, reader + AI, and AI-alone. Per-case (sensitivity only) and per-lesion analyses were conducted, with PI-RADS ≥ 3 defined as positive. The primary endpoint was the within-reader difference in per-lesion area under the receiver operating characteristic curve (AUC), evaluated with DeLong's test and Holm adjustment; secondary metrics were assessed using DeLong and McNemar tests.
In the per-case (index-lesion) analysis, AI assistance significantly increased sensitivity only for the least-experienced reader, from 0.73 to 0.87 (p = 0.046); the higher point estimates for the other readers were not statistically significant. In the per-lesion analysis, AI assistance significantly improved sensitivity for 3 of 4 readers and increased AUC for all readers (p = 0.002-0.027; within-reader ΔAUC, +0.013 to +0.037; all 95% CIs excluded zero), with no statistically significant change in specificity (all p ≥ 0.061). In the per-lesion analysis, AI-alone showed high specificity (0.94) but low sensitivity (0.34), and its per-lesion AUC (0.64) was closest to that of the least-experienced reader.
In this small prostatectomy-enriched cohort, AI assistance was associated with improved lesion-level discrimination and increased sensitivity, with the greatest apparent benefit observed among less-experienced readers. These findings should be interpreted as proof-of-concept/feasibility data rather than as evidence of improved reader agreement or immediate clinical generalizability.
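As an illustration of the paired analysis described above, the following is a minimal Python sketch, not the authors' code. It dichotomizes PI-RADS scores at ≥ 3, computes sensitivity and specificity, compares two paired readings with a McNemar test, and applies the Holm adjustment across readers. The abstract does not say which McNemar variant was used, so the exact (binomial) form here is an assumption. All function names and the toy values are hypothetical, and DeLong's test for AUC differences is not implemented.

```python
from math import comb


def dichotomize(scores, threshold=3):
    """Return True for each PI-RADS score at or above the positivity threshold."""
    return [s >= threshold for s in scores]


def sensitivity_specificity(calls, truth):
    """Sensitivity and specificity of binary calls against a binary reference."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def mcnemar_exact(calls_a, calls_b):
    """Two-sided exact McNemar p-value from the discordant pairs only.

    Assumption: exact binomial variant; the source does not specify one.
    """
    b = sum(1 for x, y in zip(calls_a, calls_b) if x and not y)
    c = sum(1 for x, y in zip(calls_a, calls_b) if not x and y)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Toy data (hypothetical, NOT from the study): eight csPCa lesions,
# scored by one reader without and with AI assistance.
alone_scores = [2, 2, 2, 2, 2, 3, 4, 5]
with_ai_scores = [3, 3, 4, 3, 3, 3, 4, 5]
p_sens = mcnemar_exact(dichotomize(alone_scores), dichotomize(with_ai_scores))
```

To assess sensitivity, the McNemar comparison would be restricted to reference-positive lesions, and to assess specificity, to reference-negative ones. The Holm step would then be applied to the per-reader p-values.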