Artificial Intelligence for Ischemic Stroke Detection in Non-contrast CT: A Systematic Review and Meta-analysis.
Authors
Affiliations (3)
- Department of Radiology and Nuclear Medicine, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing 100053, China (W.S., J.L.); Beijing Key Laboratory of Magnetic Resonance Imaging and Brain Informatics, Beijing 100053, China (W.S., J.L.).
- Center for Medical Ultrasound, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou Municipal Hospital, Suzhou, China (J.P.).
- Department of Radiology and Nuclear Medicine, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing 100053, China (W.S., J.L.); Beijing Key Laboratory of Magnetic Resonance Imaging and Brain Informatics, Beijing 100053, China (W.S., J.L.). Electronic address: [email protected].
Abstract
We aim to conduct a systematic review and meta-analysis to objectively assess the diagnostic accuracy of artificial intelligence (AI) models for detecting ischemic stroke (IS) in non-contrast CT (NCCT), and to compare the diagnostic performance of AI and clinicians. Systematic searches were conducted through February 2025 in PubMed, Web of Science, Cochrane, IEEE Xplore, and Embase for studies using AI based on NCCT images from human subjects for IS detection or classification. The risk of bias was evaluated using the prediction model study risk of bias assessment tool (PROBAST). For the meta-analysis, pooled sensitivities, pooled specificities, and hierarchical summary receiver operating characteristic (HSROC) curves were used. A total of 38 studies were included, with 74 trials extracted from 32 of them. For AI performance, the pooled sensitivity and specificity were 91.2% (95% CI: 87.6%-93.8%) and 96.0% (95% CI: 93.6%-97.6%) for internal validation and 59.8% (95% CI: 39.9%-76.9%) and 97.3% (95% CI: 93.2%-98.9%) for external validation. For clinicians' performance, the pooled sensitivity and specificity were 44.1% (95% CI: 33.8%-55.0%) and 85.5% (95% CI: 68.4%-94.1%) for internal validation and 46.1% (95% CI: 31.5%-61.3%) and 83.6% (95% CI: 62.8%-93.9%) for external validation. For clinicians with AI assistance, the pooled sensitivity and specificity increased to 83.7% (95% CI: 53.0%-95.9%) and 86.7% (95% CI: 77.1%-92.6%). Subgroup analysis indicated that higher model sensitivity was associated with data augmentation (93.9%, 95% CI: 90.2%-96.2%) and transfer learning (94.7%, 95% CI: 92.0%-96.6%). Of the 38 studies, 22 (58%) were judged to have a high risk of bias. Sensitivity analysis and subgroup analysis identified multiple sources of heterogeneity in the data, including risk of bias and AI model types.
Our study shows that AI achieves acceptable performance in detecting IS in NCCT under internal validation, although significant heterogeneity was observed in the meta-analysis. However, the generalizability and practical applicability of AI in real-world clinical settings remain limited due to insufficient external validation.
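To illustrate how a pooled sensitivity with a 95% CI can be obtained from per-study counts, the following is a minimal sketch. It is not the method used in this review, which relied on hierarchical (HSROC) models that pool sensitivity and specificity jointly. As a simplification, the sketch pools a single proportion with a univariate DerSimonian-Laird random-effects model on the logit scale. The function name `pool_proportions`, the 0.5 continuity correction applied to every study, and the example counts are all assumptions made for illustration and do not come from the included studies.

```python
import math


def inv_logit(x):
    """Map a logit value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(events, totals, z=1.96):
    """Pool proportions (e.g. per-study sensitivities) on the logit scale.

    events[i] is the number of true positives in study i and totals[i]
    the number of diseased cases. Returns (pooled, ci_low, ci_high, tau2).
    Simplified univariate DerSimonian-Laird model, not a bivariate/HSROC fit.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction (assumed here) keeps the logit finite
        # when a study has zero misses or zero hits.
        hit, miss = e + 0.5, (n - e) + 0.5
        logits.append(math.log(hit / miss))
        variances.append(1.0 / hit + 1.0 / miss)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, logits))

    # Between-study variance tau^2 (DerSimonian-Laird, truncated at 0).
    k = len(logits)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects estimate and its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    y_re = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return (inv_logit(y_re),
            inv_logit(y_re - z * se),
            inv_logit(y_re + z * se),
            tau2)


if __name__ == "__main__":
    # Hypothetical true-positive counts and diseased totals for four studies.
    tp = [45, 88, 30, 120]
    diseased = [50, 100, 52, 130]
    pooled, low, high, tau2 = pool_proportions(tp, diseased)
    print(f"pooled sensitivity: {pooled:.3f} (95% CI {low:.3f}-{high:.3f}), tau^2={tau2:.3f}")
```

The same function can be applied to true-negative counts over non-diseased totals to pool specificity. Because this treats the two measures independently, it ignores the correlation between sensitivity and specificity across studies that hierarchical models account for.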