Diagnostic performance of commercial AI systems versus participating radiologists for pulmonary nodule detection in routine clinical practice.

June 18, 2026

Authors

Watanabe H, Morisaka H, Aoyagi K, Tozuka R, Wumu T, Ii T, Sakamoto K, Yato T, Ichikawa S

Affiliations (5)

  • Department of Diagnostic Radiology, University of Yamanashi, 1110, Shimokato, Chuo, Yamanashi, 409-3898, Japan.
  • Department of Diagnostic Radiology, University of Yamanashi, 1110, Shimokato, Chuo, Yamanashi, 409-3898, Japan.
  • Department of Therapeutic Radiology, University of Yamanashi, Chuo, Yamanashi, Japan.
  • Department of Diagnostic Radiology, Yamanashi Central Hospital, Kofu, Yamanashi, Japan.
  • Division of Radiology, University of Yamanashi Hospital, Chuo, Yamanashi, Japan.

Abstract

To evaluate the diagnostic performance of two commercial artificial intelligence (AI) systems versus that of radiologists on routine clinical chest computed tomography (CT) and to identify the imaging characteristics that limit AI performance.

We retrospectively analyzed 5-mm-slice chest CT scans from 102 patients (353 nodules or masses). The detection performance of two board-certified radiologists and two commercial AI systems was compared against an expert-established reference standard. Sensitivity, false positives (FPs) per scan, and positive predictive value (PPV) were evaluated. Logistic regression was used to identify factors influencing detection.

The radiologists demonstrated higher sensitivity (90.4-94.3%) than the AI systems (80.2-89.2%) and significantly fewer FPs per scan (1.61-3.56 vs. 5.28-7.97; p < 0.001), resulting in superior PPVs. Multivariate analysis revealed divergent limitations: radiologists were challenged by intrinsic nodule features (e.g., ground-glass nodules [GGNs]), whereas AI performance was degraded by case-level complexity (multiple nodules), specific locations (central), and atypical morphologies (large masses).

In this study, the participating radiologists outperformed the commercial AI systems in routine clinical settings, and the two groups exhibited distinct weakness profiles. Current AI systems are best suited as complementary tools rather than autonomous readers, provided FPs are managed effectively.
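The abstract reports three detection metrics without stating their formulas. The sketch below uses the conventional per-lesion definitions, which is an assumption about how the study computed them. The class name, function names, and the true-positive and false-positive counts are hypothetical and chosen only for illustration; only the totals of 353 nodules and 102 scans come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Per-reader tallies against a reference standard (hypothetical structure)."""
    true_positives: int   # reference nodules that the reader marked
    false_positives: int  # reader marks with no matching reference nodule
    total_nodules: int    # nodules in the reference standard
    total_scans: int      # CT examinations read


def sensitivity(c: DetectionCounts) -> float:
    # Fraction of reference nodules that were detected.
    return c.true_positives / c.total_nodules


def fps_per_scan(c: DetectionCounts) -> float:
    # Average number of false marks per examination.
    return c.false_positives / c.total_scans


def ppv(c: DetectionCounts) -> float:
    # Fraction of all reader marks that correspond to a real nodule.
    marks = c.true_positives + c.false_positives
    return c.true_positives / marks if marks else float("nan")


# Illustrative counts only: 300 detections and 204 false marks are NOT study results.
example = DetectionCounts(true_positives=300, false_positives=204,
                          total_nodules=353, total_scans=102)

print(f"sensitivity  = {sensitivity(example):.3f}")
print(f"FPs per scan = {fps_per_scan(example):.2f}")
print(f"PPV          = {ppv(example):.3f}")
```

Because PPV depends on both the number of detections and the number of false marks, a reader with more FPs per scan will have a lower PPV at similar sensitivity, which is consistent with the direction of the comparison reported above.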

Topics

Journal Article
