
Artificial intelligence in the diagnosis of multiple sclerosis using brain imaging modalities: A systematic review and meta-analysis of algorithms.

Darrudi R, Hosseini A, Emami H, Roshanpoor A, Nahayati MA

PubMed · Sep 19, 2025
Multiple sclerosis (MS) diagnosis remains challenging due to its heterogeneous clinical manifestations and the absence of a definitive diagnostic test. Conventional magnetic resonance imaging, while central to diagnosis, faces limitations in specificity and inter-rater variability. Artificial intelligence (AI) offers promising solutions for enhancing medical imaging analysis in MS, yet its efficacy requires systematic validation. This systematic review and meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We searched Embase, PubMed, Web of Science, Scopus, Google Scholar, and gray literature (inception to January 5, 2025) for case-control studies applying AI to magnetic resonance imaging-based MS diagnosis. A random-effects model pooled sensitivity, specificity, and accuracy. Heterogeneity was assessed via the Q-statistic and I². Meta-regression evaluated the impact of image pixel count. Meta-analysis revealed pooled sensitivity, specificity, and accuracy of 93%, 95%, and 94%, respectively, showcasing the efficacy of AI models in MS diagnosis. Meta-regression showed no significant correlation between the number of pixels and diagnostic performance parameters. Sensitivity analysis confirmed the robustness of the results, while publication bias assessment indicated no evidence of bias. AI-based algorithms show promise in augmenting traditional diagnostic approaches for MS, offering accurate and timely diagnosis. Further research is warranted to standardize AI methodologies and optimize their integration into clinical practice. This study contributes to the growing evidence supporting AI's role in enhancing diagnostics and patient care in MS.
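
For readers less familiar with the pooling machinery mentioned above (random-effects model, Q-statistic, I²), the sketch below walks through the standard DerSimonian-Laird steps on invented study-level sensitivities. It illustrates the general method only; it is not the authors' analysis code, and all study values are hypothetical.

import numpy as np

# Hypothetical per-study sensitivities and diseased-case counts (illustration only).
sens = np.array([0.95, 0.90, 0.93, 0.88, 0.96])
n_pos = np.array([120, 80, 150, 60, 200])

# Work on the logit scale, which behaves better for proportions near 0 or 1.
y = np.log(sens / (1 - sens))
var = 1.0 / (n_pos * sens * (1 - sens))      # approximate variance of each logit

w = 1.0 / var                                 # inverse-variance (fixed-effect) weights
y_fixed = np.sum(w * y) / np.sum(w)

Q = np.sum(w * (y - y_fixed) ** 2)            # Cochran's Q statistic
df = len(y) - 1
I2 = max(0.0, (Q - df) / Q) * 100             # I^2 as a percentage

# DerSimonian-Laird between-study variance, then the random-effects pooled estimate.
tau2 = max(0.0, (Q - df) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
w_re = 1.0 / (var + tau2)
pooled_logit = np.sum(w_re * y) / np.sum(w_re)
pooled_sens = 1.0 / (1.0 + np.exp(-pooled_logit))

print(f"Q = {Q:.2f}, I^2 = {I2:.1f}%, pooled sensitivity = {pooled_sens:.3f}")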

Head-to-Head Comparison of Two AI Computer-Aided Triage Solutions for Detecting Intracranial Hemorrhage on Non-Contrast Head CT.

Garcia GM, Young P, Dawood L, Elshikh M

PubMed · Sep 16, 2025
This study aims to provide a comprehensive comparison of the performance and reproducibility of two commercially available artificial intelligence (AI) computer-aided triage and notification software solutions, Vendor A (Aidoc) and Vendor B (Viz.ai), for the detection of intracranial hemorrhage (ICH) on non-contrast enhanced head CT (NCHCT) scans performed within a single academic institution. The retrospective analysis was conducted on a large patient cohort from multiple healthcare settings within a single academic institution, utilizing standardized scanning protocols. Sensitivity, specificity, and false-positive and false-negative rates were evaluated for both vendors. Outputs assessed included AI-generated case-level classification. Among 4,081 scans, 595 were positive for ICH. Vendor A demonstrated a sensitivity of 94.4%, specificity of 97.4%, PPV of 85.9%, and NPV of 99.1%. Vendor B showed a sensitivity of 59.5%, specificity of 99.0%, PPV of 90.0%, and NPV of 92.6%. Vendor A had 20 false negatives, which primarily involved subdural and intraparenchymal hemorrhages, and 97 false positives, which appeared to be related to motion artifact. Vendor B had 145 false negatives, largely composed of subdural and subarachnoid hemorrhages, and 36 false positives, which appeared to be related to motion artifact and calcified or dense lesions. Eighteen cases were false negatives and 11 were false positives on both AI solutions (concordant errors). The findings of this study provide valuable information for clinicians and healthcare institutions considering the implementation of AI software for computer-aided triage and notification in the detection of intracranial hemorrhage. The discussion encompasses the implications of the results, the importance of evaluating AI findings in context (especially in the absence of explainability tools), potential areas for improvement, and the relevance of standardized scanning protocols in ensuring the reliability of AI-based diagnostic tools in clinical practice. ICH = Intracranial Hemorrhage; NCHCT = Non-contrast Enhanced Head CT; AI = Artificial Intelligence; SDH = Subdural Hemorrhage; SAH = Subarachnoid Hemorrhage; IPH = Intraparenchymal Hemorrhage; IVH = Intraventricular Hemorrhage; PPV = Positive Predictive Value; NPV = Negative Predictive Value; CADt = Computer-Aided Triage; PACS = Picture Archiving and Communication System; FN = False Negative; FP = False Positive; CI = Confidence Interval.
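
All four summary metrics quoted above derive from the same case-level 2x2 confusion counts. The snippet below is a generic, hedged illustration of those definitions; the counts are placeholders and are not the study's tables, which the abstract only summarizes.

def triage_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard case-level metrics for a binary triage tool."""
    return {
        "sensitivity": tp / (tp + fn),   # fraction of ICH-positive scans flagged
        "specificity": tn / (tn + fp),   # fraction of ICH-free scans left unflagged
        "ppv": tp / (tp + fp),           # how often a positive flag is correct
        "npv": tn / (tn + fn),           # how often a negative result is correct
    }

# Placeholder counts in the ballpark of a ~4,000-scan cohort (not the study's data).
print(triage_metrics(tp=560, fp=95, fn=35, tn=3390))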

Mammographic features in screening mammograms with high AI scores but a true-negative screening result.

Koch HW, Bergan MB, Gjesvik J, Larsen M, Bartsch H, Haldorsen IHS, Hofvind S

PubMed · Sep 16, 2025
Background: The use of artificial intelligence (AI) in screen-reading of mammograms has shown promising results for cancer detection. However, less attention has been paid to the false positives generated by AI. Purpose: To investigate mammographic features in screening mammograms with high AI scores but a true-negative screening result. Material and Methods: In this retrospective study, 54,662 screening examinations from BreastScreen Norway 2010-2022 were analyzed with a commercially available AI system (Transpara v. 2.0.0). An AI score of 1-10 indicated the suspiciousness of malignancy. We selected examinations with an AI score of 10, with a true-negative screening result, followed by two consecutive true-negative screening examinations. Of the 2,124 examinations matching these criteria, 382 random examinations underwent blinded consensus review by three experienced breast radiologists. The examinations were classified according to mammographic features, radiologist interpretation score (1-5), and mammographic breast density (BI-RADS 5th ed. a-d). Results: The reviews classified 91.1% (348/382) of the examinations as negative (interpretation score 1). All examinations (26/26) categorized as BI-RADS d were given an interpretation score of 1. Classification of mammographic features: asymmetry = 30.6% (117/382); calcifications = 30.1% (115/382); asymmetry with calcifications = 29.3% (112/382); mass = 8.9% (34/382); distortion = 0.8% (3/382); spiculated mass = 0.3% (1/382). For examinations with calcifications, 79.1% (91/115) were classified with benign morphology. Conclusion: The majority of false-positive screening examinations generated by AI were classified as non-suspicious in a retrospective blinded consensus review and would likely not have been recalled for further assessment in a real screening setting using AI as a decision support.

Accuracy of AI-Based Algorithms in Pulmonary Embolism Detection on Computed Tomographic Pulmonary Angiography: An Updated Systematic Review and Meta-analysis.

Nabipoorashrafi SA, Seyedi A, Bahri RA, Yadegar A, Shomal-Zadeh M, Mohammadi F, Afshari SA, Firoozeh N, Noroozzadeh N, Khosravi F, Asadian S, Chalian H

PubMed · Sep 15, 2025
Several artificial intelligence (AI) algorithms have been designed for the detection of pulmonary embolism (PE) on computed tomographic pulmonary angiography (CTPA). Given the rapid development of this field and the lack of an updated meta-analysis, we aimed to systematically review the available literature on the accuracy of AI-based algorithms for diagnosing PE on CTPA. We searched EMBASE, PubMed, Web of Science, and Cochrane for studies assessing the accuracy of AI-based algorithms. Studies that reported sensitivity and specificity were included. The R software was used for univariate meta-analysis and for drawing summary receiver operating characteristic (sROC) curves based on bivariate analysis. To explore the sources of heterogeneity, subgroup analysis was performed (PROSPERO: CRD42024543107). A total of 1722 articles were found, and after removing duplicate records, 1185 were screened. Twenty studies with 26 AI models/populations met inclusion criteria, encompassing 11,950 participants. Univariate meta-analysis showed a pooled sensitivity of 91.5% (95% CI 85.5-95.2) and specificity of 84.3% (95% CI 74.9-90.6) for PE detection. In the bivariate sROC analysis, the pooled area under the curve (AUC) was 0.923, indicating very high accuracy of AI algorithms in the detection of PE. Subgroup meta-analysis identified geographical area as a potential source of heterogeneity: in the subgroup of Asian studies, I² for sensitivity and specificity was 60% and 6.9%, respectively. These findings highlight the promising role of AI in accurately diagnosing PE while also emphasizing the need for further research to address regional variations and improve generalizability.
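
As a rough companion to the bivariate sROC mentioned above, the sketch below traces a summary ROC curve with the older Moses-Littenberg linear model, which is simpler than the bivariate approach used in the review but conveys the same idea of summarizing study-level operating points into one curve. All study-level values are invented for illustration.

import numpy as np

# Hypothetical per-study operating points (sensitivity, 1 - specificity).
tpr = np.array([0.93, 0.88, 0.95, 0.90, 0.85])
fpr = np.array([0.12, 0.20, 0.10, 0.15, 0.25])

logit = lambda p: np.log(p / (1 - p))
D = logit(tpr) - logit(fpr)     # log diagnostic odds ratio per study
S = logit(tpr) + logit(fpr)     # proxy for the positivity threshold

b, a = np.polyfit(S, D, 1)      # fit D = a + b*S across studies

# Trace the summary curve: solve the fitted line for TPR at each FPR.
fpr_grid = np.linspace(0.01, 0.99, 99)
tpr_curve = 1.0 / (1.0 + np.exp(-(a + (1 + b) * logit(fpr_grid)) / (1 - b)))

# Trapezoidal-rule AUC of the traced curve.
auc = np.sum((tpr_curve[1:] + tpr_curve[:-1]) / 2 * np.diff(fpr_grid))
print(f"Summary ROC AUC on toy data: {auc:.3f}")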

Application of Deep Learning for Predicting Hematoma Expansion in Intracerebral Hemorrhage Using Computed Tomography Scans: A Systematic Review and Meta-Analysis of Diagnostic Accuracy.

Ahmadzadeh AM, Ashoobi MA, Broomand Lomer N, Elyassirad D, Gheiji B, Vatanparast M, Bathla G, Tu L

PubMed · Sep 11, 2025
We aimed to systematically review studies that utilized deep learning (DL)-based networks to predict hematoma expansion (HE) in patients with intracerebral hemorrhage (ICH) using computed tomography (CT) images. We carried out a comprehensive literature search across four major databases to identify relevant studies. To evaluate the quality of the included studies, we used both the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) and the METhodological RadiomICs Score (METRICS) checklists. We then calculated pooled diagnostic estimates and assessed heterogeneity using the I² statistic. To assess the sources of heterogeneity, the effects of individual studies, and publication bias, we performed subgroup analysis, sensitivity analysis, and Deeks' asymmetry test. Twenty-two studies were included in the qualitative synthesis, of which 11 and 6 were utilized for the exclusive-DL and combined-DL meta-analyses, respectively. We found a pooled sensitivity of 0.81 and 0.84, specificity of 0.79 and 0.91, positive diagnostic likelihood ratio (DLR) of 3.96 and 9.40, negative DLR of 0.23 and 0.18, diagnostic odds ratio of 16.97 and 53.51, and area under the curve of 0.87 and 0.89 for the exclusive DL-based and combined DL-based models, respectively. Subgroup analysis revealed significant inter-group differences according to the segmentation technique and study quality. DL-based networks showed strong potential in accurately identifying HE in ICH patients. These models may guide earlier targeted interventions such as intensive blood pressure control or administration of hemostatic drugs, potentially leading to improved patient outcomes.
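
For context, the likelihood-ratio and odds-ratio figures above follow from pooled sensitivity and specificity through standard identities (general diagnostic-accuracy algebra, not a method specific to this review); the small differences from the reported values arise because each quantity is pooled separately rather than derived from a single 2x2 table.

# Standard identities linking sensitivity/specificity to likelihood ratios and the DOR.
sens, spec = 0.81, 0.79            # pooled exclusive-DL estimates from the abstract
plr = sens / (1 - spec)            # positive likelihood ratio  -> ~3.9  (reported 3.96)
nlr = (1 - sens) / spec            # negative likelihood ratio  -> ~0.24 (reported 0.23)
dor = plr / nlr                    # diagnostic odds ratio      -> ~16   (reported 16.97)
print(f"PLR={plr:.2f}, NLR={nlr:.2f}, DOR={dor:.1f}")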

AI-driven and Traditional Radiomic Model for Predicting Muscle Invasion in Bladder Cancer via Multi-parametric Imaging: A Systematic Review and Meta-analysis.

Wang Z, Shi H, Wang Q, Huang Y, Feng M, Yu L, Dong B, Li J, Deng X, Fu S, Zhang G, Wang H

PubMed · Sep 5, 2025
This study systematically evaluates the diagnostic performance of artificial intelligence (AI)-driven and conventional radiomics models in detecting muscle-invasive bladder cancer (MIBC) through meta-analytical approaches. Furthermore, it investigates their potential synergistic value with the Vesical Imaging-Reporting and Data System (VI-RADS) and assesses clinical translation prospects. This study adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We conducted a comprehensive systematic search of PubMed, Web of Science, Embase, and Cochrane Library databases up to May 13, 2025, and manually screened the references of included studies. The quality and risk of bias of the selected studies were assessed using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) and Radiomics Quality Score (RQS) tools. We pooled the area under the curve (AUC), sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and their 95% confidence intervals (95% CI). Additionally, meta-regression and subgroup analyses were performed to identify potential sources of heterogeneity. This meta-analysis incorporated 43 studies comprising 9624 patients. The majority of included studies demonstrated low risk of bias, with a mean RQS of 18.89. Pooled analysis yielded an AUC of 0.92 (95% CI: 0.89-0.94). The aggregate sensitivity and specificity were both 0.86 (95% CI: 0.84-0.87), with heterogeneity indices of I² = 43.58 and I² = 72.76, respectively. The PLR was 5.97 (95% CI: 5.28-6.75, I² = 64.04), while the NLR was 0.17 (95% CI: 0.15-0.19, I² = 37.68). The DOR reached 35.57 (95% CI: 29.76-42.51, I² = 99.92). Notably, significant heterogeneity was present across all pooled analyses (P < 0.1). Meta-regression and subgroup analyses identified several significant sources of heterogeneity, including study center type (single-center vs. multi-center), sample size (<100 vs. ≥100 patients), dataset classification (training, validation, testing, or ungrouped), imaging modality (computed tomography [CT] vs. magnetic resonance imaging [MRI]), modeling algorithm (deep learning vs. machine learning vs. other), validation methodology (cross-validation vs. cohort validation), segmentation method (manual vs. [semi]automated), regional differences (China vs. other countries), and risk of bias (high vs. low vs. unclear). AI-driven and traditional radiomic models have exhibited robust diagnostic performance for MIBC. Nevertheless, substantial heterogeneity across studies necessitates validation through multinational, multicenter prospective cohort studies to establish external validity.

Impact of pre-test probability on AI-LVO detection: a systematic review of LVO prevalence across clinical contexts.

Olivé-Gadea M, Mayol J, Requena M, Rodrigo-Gisbert M, Rizzo F, Garcia-Tornel A, Simonetti R, Diana F, Muchada M, Pagola J, Rodriguez-Luna D, Rodriguez-Villatoro N, Rubiera M, Molina CA, Tomasello A, Hernandez D, de Dios Lascuevas M, Ribo M

PubMed · Aug 31, 2025
Rapid identification of large vessel occlusion (LVO) in acute ischemic stroke (AIS) is essential for reperfusion therapy. Screening tools, including artificial intelligence (AI)-based algorithms, have been developed to accelerate detection but rely heavily on pre-test LVO prevalence. This study aimed to review LVO prevalence across clinical contexts and analyze its impact on AI-algorithm performance. We systematically reviewed studies reporting consecutive suspected AIS cohorts. Cohorts were grouped into four clinical scenarios based on patient selection criteria: (a) high suspicion of LVO by stroke specialists (direct-to-angiosuite candidates), (b) high suspicion of LVO according to pre-hospital scales, and (c) and (d) any suspected AIS without a severity cut-off, in a hospital or pre-hospital setting, respectively. We analyzed LVO prevalence in each scenario and assessed the false discovery rate (FDR), the proportion of AI-positive studies that are false positives (expressed below as 1 false positive per N positives), when applying eight commercially available LVO-detecting algorithms. We included 87 cohorts from 80 studies. Median LVO prevalence was: (a) 84% (77-87%), (b) 35% (26-42%), (c) 19% (14-25%), and (d) 14% (8-22%). In the high-prevalence scenario (a), FDR ranged between 0.007 (1 false positive in 142 positives) and 0.023 (1 in 43), whereas in the low-prevalence scenarios (c and d) it ranged between 0.168 (1 in 6) and 0.543 (over 1 in 2). To ensure meaningful clinical impact, AI algorithms must be evaluated within the specific populations and care pathways where they are applied.
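
The prevalence dependence described above is just Bayes' rule applied to a fixed test. The sketch below reproduces the qualitative effect using an assumed sensitivity of 90% and specificity of 95% (illustrative values, not figures reported for any specific algorithm in the review); the prevalences are the medians quoted above.

def false_discovery_rate(prevalence: float, sensitivity: float, specificity: float) -> float:
    tp = sensitivity * prevalence                # expected true positives per patient screened
    fp = (1 - specificity) * (1 - prevalence)    # expected false positives per patient screened
    return fp / (tp + fp)                        # share of positive AI calls that are wrong

scenarios = {                                    # median prevalences from the review
    "(a) direct-to-angiosuite candidates": 0.84,
    "(b) pre-hospital scale positive": 0.35,
    "(c) any suspected AIS, in-hospital": 0.19,
    "(d) any suspected AIS, pre-hospital": 0.14,
}
for label, prev in scenarios.items():
    fdr = false_discovery_rate(prev, sensitivity=0.90, specificity=0.95)
    print(f"{label}: FDR = {fdr:.3f} (about 1 false positive per {1 / fdr:.0f} positives)")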

ESR Essentials: artificial intelligence in breast imaging-practice recommendations by the European Society of Breast Imaging.

Schiaffino S, Bernardi D, Healy N, Marino MA, Romeo V, Sechopoulos I, Mann RM, Pinker K

PubMed · Aug 26, 2025
Artificial intelligence (AI) can enhance the diagnostic performance of breast cancer imaging and improve workflow optimization, potentially mitigating excessive radiologist workload and suboptimal diagnostic accuracy. AI can also boost imaging capabilities through individual risk prediction, molecular subtyping, and neoadjuvant therapy response prediction. Evidence demonstrates AI's potential across multiple modalities. The most robust data come from mammographic screening, where AI models improve diagnostic accuracy and optimize workflow, but rigorous post-market surveillance is required before any implementation strategy in this field. Commercial tools for digital breast tomosynthesis and ultrasound, potentially able to reduce interpretation time and improve accuracy, are also available, but post-implementation evaluation studies are likewise lacking. Apart from basic tools for breast MRI, which have limited proven clinical benefit, AI applications for other modalities are not yet commercially available. Applications in contrast-enhanced mammography are still in the research stage, especially for radiomics-based molecular subtype classification. Applications of large language models (LLMs) are in their infancy, with no current clinical use. Consequently, and despite their promise, all commercially available AI tools for breast imaging should currently still be regarded as techniques that, at best, aid radiologists in image evaluation. Their use is therefore optional, and their findings may always be overruled. KEY POINTS: AI systems improve the diagnostic accuracy and efficiency of mammography screening, but long-term outcome data are lacking. Commercial tools for digital breast tomosynthesis and ultrasound are available, but post-implementation evaluation studies are lacking. AI tools for breast imaging should still be regarded as a non-obligatory aid to radiologists for image interpretation.

Quantitative Evaluation of AI-based Organ Segmentation Across Multiple Anatomical Sites Using Eight Commercial Software Platforms.

Yuan L, Chen Q, Al-Hallaq H, Yang J, Yang X, Geng H, Latifi K, Cai B, Wu QJ, Xiao Y, Benedict SH, Rong Y, Buchsbaum J, Qi XS

PubMed · Aug 23, 2025
To evaluate organ-at-risk (OAR) segmentation variability across eight commercial AI-based segmentation software platforms using independent multi-institutional datasets, and to provide recommendations for clinical practices utilizing AI segmentation. A total of 160 planning CT image sets from four anatomical sites (head and neck, thorax, abdomen, and pelvis) were retrospectively pooled from three institutions. Contours for 31 OARs generated by the software were compared to clinical contours using multiple accuracy metrics, including the Dice similarity coefficient (DSC), 95th percentile Hausdorff distance (HD95), and surface DSC, as well as relative added path length (RAPL) as an efficiency metric. A two-factor analysis of variance was used to quantify variability in contouring accuracy across software platforms (inter-software) and patients (inter-patient). Pairwise comparisons were performed to categorize the software into different performance groups, and inter-software variations (ISV) were calculated as the average performance differences between the groups. Significant inter-software and inter-patient contouring accuracy variations (p<0.05) were observed for most OARs. The largest ISV in DSC for each anatomical region was found for the cervical esophagus (0.41), trachea (0.10), spinal cord (0.13), and prostate (0.17). Among the organs evaluated, 7 had mean DSC >0.9 (e.g., heart, liver), 15 had DSC ranging from 0.7 to 0.89 (e.g., parotid, esophagus), and the remaining organs (e.g., optic nerves, seminal vesicle) had DSC <0.7. Sixteen of the 31 organs (52%) had RAPL less than 0.1. Our results reveal significant inter-software and inter-patient variability in the performance of AI segmentation software. These findings highlight the need for thorough software commissioning, testing, and quality assurance across disease sites, patient-specific anatomies, and image acquisition protocols.
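
For reference, a minimal sketch of two of the reported contour metrics on 3D binary masks is shown below. It is a generic implementation using common definitions (surface points taken as mask voxels with a background neighbor), not the evaluation pipeline used in the study.

import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def surface_points(mask: np.ndarray, spacing) -> np.ndarray:
    """Physical coordinates of surface voxels (mask voxels with a background neighbor)."""
    surf = mask & ~ndimage.binary_erosion(mask)
    return np.argwhere(surf) * np.asarray(spacing)

def hd95(a: np.ndarray, b: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    """95th-percentile symmetric Hausdorff distance between mask surfaces, in mm."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    d_ab = cKDTree(pb).query(pa)[0]   # each A-surface point to its nearest B-surface point
    d_ba = cKDTree(pa).query(pb)[0]
    return float(np.percentile(np.concatenate([d_ab, d_ba]), 95))

# Toy check: two slightly offset spheres on a 64^3 grid with 1 mm voxels.
zz, yy, xx = np.mgrid[:64, :64, :64]
a = (zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2
b = (zz - 34) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 15 ** 2
print(f"DSC = {dice(a, b):.3f}, HD95 = {hd95(a, b):.1f} mm")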

Economic Evaluations and Equity in the Use of Artificial Intelligence in Imaging Examinations for Medical Diagnosis in People With Dermatological, Neurological, and Pulmonary Diseases: Systematic Review.

Santana GO, Couto RM, Loureiro RM, Furriel BCRS, de Paula LGN, Rother ET, de Paiva JPQ, Correia LR

PubMed · Aug 13, 2025
Health care systems around the world face numerous challenges. Recent advances in artificial intelligence (AI) have offered promising solutions, particularly in diagnostic imaging. This systematic review focused on evaluating the economic feasibility of AI in real-world diagnostic imaging scenarios, specifically for dermatological, neurological, and pulmonary diseases. The central question was whether the use of AI in these diagnostic assessments improves economic outcomes and promotes equity in health care systems. This systematic review has two main components: economic evaluation and equity assessment. We used the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) tool to ensure adherence to best practices in systematic reviews. The protocol was registered with PROSPERO (International Prospective Register of Systematic Reviews), and we followed the PRISMA-E (Preferred Reporting Items for Systematic Reviews and Meta-Analyses - Equity Extension) guidelines for equity. Scientific articles reporting on economic evaluations or equity considerations related to the use of AI-based tools in diagnostic imaging in dermatology, neurology, or pulmonology were included. The search was conducted in the PubMed, Embase, Scopus, and Web of Science databases. Methodological quality was assessed using the following checklists: CHEC (Consensus on Health Economic Criteria) for economic evaluations, EPHPP (Effective Public Health Practice Project) for equity evaluation studies, and Welte for transferability. The systematic review identified 9 publications within the scope of the research question, with sample sizes ranging from 122 to over 1.3 million participants. The majority of studies addressed economic evaluation (88.9%), most often for pulmonary diseases (n=6; 66.6%), followed by neurological diseases (n=2; 22.3%) and dermatological diseases (n=1; 11.1%). These studies had an average quality score of 87.5% on the CHEC checklist. Only 2 studies were found to be transferable to Brazil and other countries with a similar health context. The economic evaluations showed that 87.5% of studies reported benefits of using AI in dermatology, neurology, and pulmonology, with significant cost-effectiveness outcomes; the most advantageous was a negative cost-effectiveness ratio of -US $27,580 per QALY (quality-adjusted life year) for melanoma diagnosis, indicating substantial cost savings in this scenario. The only study assessing equity, based on 129,819 radiographic images, identified AI-assisted underdiagnosis, particularly in certain subgroups defined by gender, ethnicity, and socioeconomic status. This review underscores the importance of transparency in the description of AI tools and the representativeness of population subgroups to mitigate health disparities. As AI is rapidly being integrated into health care, detailed assessments are essential to ensure that benefits reach all patients, regardless of sociodemographic factors.
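
A cost-effectiveness ratio such as the -US $27,580 per QALY figure is an incremental ratio of cost and effect differences between strategies. The toy calculation below illustrates the arithmetic only; all figures are invented and do not come from the reviewed studies.

def icer(cost_new: float, cost_std: float, qaly_new: float, qaly_std: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    return (cost_new - cost_std) / (qaly_new - qaly_std)

# If the AI-assisted pathway costs less and yields more QALYs, the ICER is negative,
# meaning the new strategy dominates the comparator (it saves money per QALY gained).
print(icer(cost_new=1_200, cost_std=1_500, qaly_new=8.12, qaly_std=8.05))  # about -4286 $/QALY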