Diagnostic Performance of Artificial Intelligence in Detecting and Distinguishing Pancreatic Ductal Adenocarcinoma via Computed Tomography: A Systematic Review and Meta-Analysis.

Harandi H, Gouravani M, Alikarami S, Shahrabi Farahani M, Ghavam M, Mohammadi S, Salehi MA, Reynolds S, Dehghani Firouzabadi F, Huda F

pubmed · Jul 18, 2025
We conducted a systematic review and meta-analysis of the diagnostic performance of studies that used artificial intelligence (AI) algorithms to detect pancreatic ductal adenocarcinoma (PDAC) and distinguish it from other types of pancreatic lesions. We systematically searched for studies on pancreatic lesions and AI from January 2014 to May 2024. Data were extracted and a meta-analysis was performed using contingency tables and a random-effects model to calculate pooled sensitivity and specificity. Quality assessment was done using modified TRIPOD and PROBAST tools. We included 26 studies in this systematic review, 22 of which were included in the meta-analysis. In internal validation, AI algorithms exhibited a pooled sensitivity of 93% (95% confidence interval [CI], 90 to 95) and specificity of 95% (95% CI, 92 to 97). Externally validated AI algorithms demonstrated a pooled sensitivity of 89% (95% CI, 85 to 92) and specificity of 91% (95% CI, 85 to 95). Subgroup analysis indicated that diagnostic performance differed by comparator group, image contrast, segmentation technique, and algorithm type, with contrast-enhanced imaging and specific AI models (e.g., random forest for sensitivity and CNN for specificity) demonstrating superior accuracy. Although potential biases remain to be addressed, the results of this systematic review and meta-analysis show that AI models have the potential to be incorporated into clinical settings for the detection of smaller tumors and early signs of PDAC.
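
A minimal sketch of the pooling step this abstract describes: a DerSimonian-Laird random-effects model applied to logit-transformed per-study sensitivities. The study counts below are hypothetical, not data from the review.

```python
import numpy as np

# (true positives, false negatives) per study, hypothetical values
studies = [(90, 8), (45, 5), (120, 6), (60, 7)]
tp = np.array([s[0] for s in studies], dtype=float)
fn = np.array([s[1] for s in studies], dtype=float)

# Logit-transformed sensitivity and its within-study (delta-method) variance
logit = np.log(tp / fn)            # logit(TP / (TP + FN)) = log(TP / FN)
var = 1.0 / tp + 1.0 / fn

# DerSimonian-Laird estimate of between-study variance tau^2
w = 1.0 / var
fixed = np.sum(w * logit) / np.sum(w)
q = np.sum(w * (logit - fixed) ** 2)
tau2 = max(0.0, (q - (len(studies) - 1)) /
           (np.sum(w) - np.sum(w ** 2) / np.sum(w)))

# Random-effects pooled estimate and 95% CI, back-transformed to a proportion
w_re = 1.0 / (var + tau2)
mu = np.sum(w_re * logit) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))
expit = lambda x: 1.0 / (1.0 + np.exp(-x))
print(f"pooled sensitivity {expit(mu):.3f} "
      f"(95% CI {expit(mu - 1.96 * se):.3f} to {expit(mu + 1.96 * se):.3f})")
```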

Deep learning-based automatic detection of pancreatic ductal adenocarcinoma ≤ 2 cm with high-resolution computed tomography: impact of the combination of tumor mass detection and indirect indicator evaluation.

Ozawa M, Sone M, Hijioka S, Hara H, Wakatsuki Y, Ishihara T, Hattori C, Hirano R, Ambo S, Esaki M, Kusumoto M, Matsui Y

pubmed · Jul 18, 2025
Detecting small pancreatic ductal adenocarcinomas (PDAC) is challenging owing to the difficulty of identifying them as distinct tumor masses. This study assesses the diagnostic performance of a three-dimensional convolutional neural network for the automatic detection of small PDAC using both automatic tumor mass detection and indirect indicator evaluation. High-resolution contrast-enhanced computed tomography (CT) scans from 181 patients diagnosed with PDAC (diameter ≤ 2 cm) between January 2018 and December 2023 were analyzed. The D/P ratio, the ratio of the cross-sectional area of the main pancreatic duct (MPD) to that of the pancreatic parenchyma, was used as an indirect indicator. A total of 204 patient datasets, including 104 normal controls, were analyzed for automatic tumor mass detection and D/P ratio evaluation. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were evaluated for tumor detection. The software's sensitivity for PDAC detection was compared with that of radiologists, and tumor localization accuracy was validated against endoscopic ultrasonography (EUS) findings. The sensitivity, specificity, PPV, and NPV were 77.0%, 76.0%, 75.5%, and 77.5%, respectively, for tumor mass detection; 87.0%, 94.2%, 93.5%, and 88.3%, respectively, for D/P ratio detection; and 96.0%, 70.2%, 75.6%, and 94.8%, respectively, for combined tumor mass and D/P ratio detection. No significant difference was observed between the software's sensitivity and that of the radiologist's report (software, 96.0%; radiologist, 96.0%; p = 1). The concordance rate between software findings and EUS was 96.0%. Combining indirect indicator evaluation with tumor mass detection may improve small PDAC detection accuracy.
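
A minimal sketch of the combined decision rule described above, assuming binary segmentation masks as input; the function names and the 0.1 cutoff are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dp_ratio(duct_mask: np.ndarray, parenchyma_mask: np.ndarray) -> float:
    """Cross-sectional area of the MPD divided by that of the pancreatic
    parenchyma, computed on one axial slice from binary segmentation masks."""
    duct_area = float(duct_mask.sum())
    parenchyma_area = float(parenchyma_mask.sum())
    return duct_area / parenchyma_area if parenchyma_area > 0 else 0.0

def combined_positive(mass_detected: bool, duct_mask, parenchyma_mask,
                      dp_threshold: float = 0.1) -> bool:
    # OR-combination: the indirect indicator rescues masses the detector
    # misses, which is consistent with the reported jump in sensitivity
    # (to 96.0%) at the cost of specificity (70.2%).
    return mass_detected or dp_ratio(duct_mask, parenchyma_mask) > dp_threshold
```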

A Deep Learning-Based Ensemble System for Automated Shoulder Fracture Detection in Clinical Radiographs

Hemanth Kumar M, Karthika M, Saianiruth M, Vasanthakumar Venugopal, Anandakumar D, Revathi Ezhumalai, Charulatha K, Kishore Kumar J, Dayana G, Kalyan Sivasailam, Bargava Subramanian

arxiv preprint · Jul 17, 2025
Background: Shoulder fractures are often underdiagnosed, especially in emergency and high-volume clinical settings; studies report that up to 10% of such fractures may be missed by radiologists. AI-driven tools offer a scalable way to assist early detection and reduce diagnostic delays. We address this gap with a dedicated AI system for shoulder radiographs. Methods: We developed a multi-model deep learning system using 10,000 annotated shoulder X-rays. Architectures include Faster R-CNN (ResNet50-FPN, ResNeXt), EfficientDet, and RF-DETR. To enhance detection, we applied bounding-box- and classification-level ensemble techniques such as Soft-NMS, Weighted Boxes Fusion (WBF), and Non-Maximum Weighted (NMW) fusion. Results: The NMW ensemble achieved 95.5% accuracy and an F1-score of 0.9610, outperforming the individual models across all key metrics. It demonstrated strong recall and localization precision, confirming its effectiveness for clinical fracture detection in shoulder X-rays. Conclusion: The results show that ensemble-based AI can reliably detect shoulder fractures in radiographs with high clinical relevance. The model's accuracy and deployment readiness position it well for integration into real-time diagnostic workflows. The current model is limited to binary fracture detection, reflecting its design for rapid screening and triage support rather than detailed orthopedic classification.
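
For readers unfamiliar with the fusion step, here is a minimal sketch of Gaussian Soft-NMS, one of the ensemble techniques named above: rather than discarding overlapping detections, it decays their confidence in proportion to overlap with each selected box. NumPy is assumed; this is not the authors' code.

```python
import numpy as np

def iou(box, boxes):
    # boxes are [x1, y1, x2, y2]
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    a = (box[2] - box[0]) * (box[3] - box[1])
    b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (a + b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.05):
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    keep_boxes, keep_scores = [], []
    while len(boxes) > 0:
        i = int(np.argmax(scores))
        keep_boxes.append(boxes[i]); keep_scores.append(scores[i])
        boxes = np.delete(boxes, i, axis=0); scores = np.delete(scores, i)
        if len(boxes) == 0:
            break
        # Gaussian decay of neighbors' scores by overlap with the kept box
        scores = scores * np.exp(-(iou(keep_boxes[-1], boxes) ** 2) / sigma)
        mask = scores > score_thresh
        boxes, scores = boxes[mask], scores[mask]
    return np.array(keep_boxes), np.array(keep_scores)

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]])
scores = np.array([0.9, 0.8, 0.75])
print(soft_nms(boxes, scores))   # the near-duplicate box is demoted, not dropped
```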

Collaborative Integration of AI and Human Expertise to Improve Detection of Chest Radiograph Abnormalities.

Awasthi A, Le N, Deng Z, Wu CC, Nguyen HV

pubmed · Jul 16, 2025
<i>"Just Accepted" papers have undergone full peer review and have been accepted for publication in <i>Radiology: Artificial Intelligence</i>. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.</i> Purpose To develop a collaborative AI system that integrates eye gaze data and radiology reports to improve diagnostic accuracy in chest radiograph interpretation by identifying and correcting perceptual errors. Materials and Methods This retrospective study utilized public datasets REFLACX and EGD-CXR to develop a collaborative AI solution, named Collaborative Radiology Expert (CoRaX). It employs a large multimodal model to analyze image embeddings, eye gaze data, and radiology reports, aiming to rectify perceptual errors in chest radiology. The proposed system was evaluated using two simulated error datasets featuring random and uncertain alterations of five abnormalities. Evaluation focused on the system's referral-making process, the quality of referrals, and its performance within collaborative diagnostic settings. Results In the random masking-based error dataset, 28.0% (93/332) of abnormalities were altered. The system successfully corrected 21.3% (71/332) of these errors, with 6.6% (22/332) remaining unresolved. The accuracy of the system in identifying the correct regions of interest for missed abnormalities was 63.0% [95% CI: 59.0%, 68.0%], and 85.7% (240/280) of interactions with radiologists were deemed satisfactory, meaning that the system provided diagnostic aid to radiologists. In the uncertainty-masking-based error dataset, 43.9% (146/332) of abnormalities were altered. The system corrected 34.6% (115/332) of these errors, with 9.3% (31/332) unresolved. The accuracy of predicted regions of missed abnormalities for this dataset was 58.0% [95% CI: 55.0%, 62.0%], and 78.4% (233/297) of interactions were satisfactory. Conclusion The CoRaX system can collaborate efficiently with radiologists and address perceptual errors across various abnormalities in chest radiographs. ©RSNA, 2025.

Validation of artificial intelligence software for automatic calcium scoring in cardiac and chest computed tomography.

Hamelink II, Nie ZZ, Severijn TEJT, van Tuinen MM, van Ooijen PMAP, Kwee TCT, Dorrius MDM, van der Harst PP, Vliegenthart RR

pubmed · Jul 16, 2025
Coronary artery calcium scoring (CACS), i.e., quantification of the Agatston score (AS) or volume score (VS), can be time-consuming. The aim of this study was to compare automated, artificial intelligence (AI)-based CACS with manual scoring, in cardiac CT and in chest CT for lung cancer screening. We selected 684 participants (59 ± 4.8 years; 48.8% men) who underwent cardiac and non-ECG-triggered chest CT, including 484 participants with AS > 0 on cardiac CT. AI-based results were compared to manual AS and VS by assessing sensitivity and accuracy, intraclass correlation coefficient (ICC), Bland-Altman analysis, and Cohen's kappa for classification into AS strata (0; 1-99; 100-299; ≥300). AI showed a high CAC detection rate: 98.1% in cardiac CT (accuracy 97.1%) and 92.4% in chest CT (accuracy 92.1%). AI showed excellent agreement with manual AS (ICC: 0.997 and 0.992) and manual VS (ICC: 0.997 and 0.991) in cardiac CT and chest CT, respectively. In Bland-Altman analysis, the mean difference was 2.3 (limits of agreement [LoA]: -42.7, 47.4) for AS on cardiac CT; 1.9 (LoA: -36.4, 40.2) for VS on cardiac CT; -0.3 (LoA: -74.8, 74.2) for AS on chest CT; and -0.6 (LoA: -65.7, 64.5) for VS on chest CT. Cohen's kappa was 0.952 (95% CI: 0.934-0.970) for cardiac CT and 0.901 (95% CI: 0.875-0.926) for chest CT, with concordance in 95.9% and 91.4% of cases, respectively. AI-based CACS shows a high detection rate and strong correlation with manual CACS, with excellent risk-classification agreement. AI may reduce evaluation time and enable opportunistic screening for CAC on low-dose chest CT.
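
A minimal sketch of the Agatston (AS) and volume (VS) scores being compared here, following their classical definitions (voxels ≥ 130 HU; per-lesion density weight of 1 to 4 by peak attenuation). Slice-wise lesion grouping and minimum-area rules are simplified away, and this is not the validated software's implementation.

```python
import numpy as np

DENSITY_WEIGHTS = [(400, 4), (300, 3), (200, 2), (130, 1)]

def density_weight(peak_hu: float) -> int:
    for thresh, w in DENSITY_WEIGHTS:
        if peak_hu >= thresh:
            return w
    return 0

def agatston_and_volume(slice_hu: np.ndarray, lesion_labels: np.ndarray,
                        pixel_area_mm2: float, slice_thickness_mm: float):
    """Score one axial slice given a labeled map of candidate calcifications."""
    agatston, volume = 0.0, 0.0
    for label in np.unique(lesion_labels):
        if label == 0:                       # background
            continue
        mask = lesion_labels == label
        calcified = mask & (slice_hu >= 130)
        area = calcified.sum() * pixel_area_mm2
        if area == 0:
            continue
        agatston += area * density_weight(float(slice_hu[calcified].max()))
        volume += area * slice_thickness_mm  # contribution to VS in mm^3
    return agatston, volume

hu = np.zeros((4, 4)); hu[1:3, 1:3] = 450    # one dense 4-voxel lesion
labels = (hu > 0).astype(int)
print(agatston_and_volume(hu, labels, pixel_area_mm2=0.25,
                          slice_thickness_mm=3.0))  # (4.0, 3.0)
```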

Deep learning-assisted comparison of different models for predicting maxillary canine impaction on panoramic radiography.

Zhang C, Zhu H, Long H, Shi Y, Guo J, You M

pubmed · Jul 16, 2025
The panoramic radiograph is the most commonly used imaging modality for predicting maxillary canine impaction, and several prediction models have been constructed based on it. This study aimed to compare the prediction accuracy of existing models in an external validation facilitated by a deep learning-based automatic landmark detection system. Patients aged 7-14 years who underwent panoramic radiographic examination and received a diagnosis of impacted canines were included. An automatic landmark localization system was employed to assist the measurement of geometric parameters on the panoramic radiographs, from which each model's prediction of canine impaction was calculated. Three prediction models, constructed by Arnautska, Alqerban et al., and Margot et al., were evaluated. Accuracy, sensitivity, specificity, precision, and area under the receiver operating characteristic curve (AUC) were used to compare the performance of the models. A total of 102 panoramic radiographs with 102 impacted canines and 102 nonimpacted canines were analyzed. The model by Margot et al. achieved the highest performance, with a sensitivity of 95% and a specificity of 86% (AUC, 0.97), followed by the model by Arnautska, with a sensitivity of 93% and a specificity of 71% (AUC, 0.94). The model by Alqerban et al. performed poorly, with an AUC of only 0.20. Two of the existing predictive models exhibited good diagnostic accuracy, whereas the third demonstrated suboptimal performance. Nonetheless, even the most effective model is constrained by several limitations, such as logical and computational challenges, which necessitate further refinement.
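
A minimal sketch of the head-to-head evaluation described above, assuming each model outputs a probability of impaction for the same labeled radiograph set; scikit-learn is assumed.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5):
    """Sensitivity, specificity, precision, accuracy, and AUC for one model."""
    y_pred = (y_prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "auc": roc_auc_score(y_true, y_prob),
    }

y_true = np.array([1, 0, 1, 1, 0, 0])           # toy labels: 1 = impacted
y_prob = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6])
print(evaluate(y_true, y_prob))

# Note: an AUC of 0.20, well below chance (0.5), as reported for one model,
# implies its score was inversely related to impaction in this cohort.
```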

Comparative study of 2D vs. 3D AI-enhanced ultrasound for fetal crown-rump length evaluation in the first trimester.

Zhang Y, Huang Y, Chen C, Hu X, Pan W, Luo H, Huang Y, Wang H, Cao Y, Yi Y, Xiong Y, Ni D

pubmed · Jul 16, 2025
Accurate fetal growth evaluation is crucial for monitoring fetal health, with crown-rump length (CRL) being the gold standard for estimating gestational age and assessing growth during the first trimester. To enhance the accuracy and efficiency of CRL evaluation, we developed an artificial intelligence (AI)-based model (3DCRL-Net) using the 3D U-Net architecture, which performs automatic landmark detection to achieve CRL plane localization and measurement in 3D ultrasound. We then compared its performance with that of experienced radiologists using both 2D and 3D ultrasound for fetal growth assessment. This prospective study collected data from 1,326 consecutive fetal ultrasound screenings conducted at 11-14 weeks of gestation (June 2021 to June 2023). Three experienced radiologists performed fetal screening using 2D video (2D-RAD) and 3D volume (3D-RAD) to obtain the CRL plane and measurement. The 3DCRL-Net model automatically outputs the landmark positions, CRL plane localization, and measurement. Three specialists audited the planes obtained by the radiologists and 3DCRL-Net, grading each as standard or non-standard. Landmark detection, plane localization, measurement performance, and time efficiency were evaluated on the internal testing dataset against 3D-RAD; on the external dataset, plane localization, measurement accuracy, and time efficiency were compared among the three groups. The internal dataset consisted of 126 cases in the testing set (training:validation:testing = 8:1:1), and the external dataset included 245 cases. On the internal testing set, 3DCRL-Net achieved a mean absolute distance error of 1.81 mm for the nine landmarks, higher accuracy in standard plane localization than 3D-RAD (91.27% vs. 80.16%), and strong consistency in CRL measurements (mean absolute error [MAE]: 1.26 mm; mean difference: 0.37 mm, P = 0.70). The average time required per fetal case was 2.02 s for 3DCRL-Net versus 2 min for 3D-RAD (P < 0.001). On the external testing dataset, 3DCRL-Net demonstrated high performance in standard plane localization, achieving results comparable to 2D-RAD and 3D-RAD (accuracy: 91.43% vs. 93.06% vs. 86.12%), with strong consistency in CRL measurements compared to 2D-RAD (MAE: 1.58 mm; mean difference: 1.12 mm, P = 0.25). For 2D-RAD vs. 3DCRL-Net, the Pearson correlation and R² were 0.96 and 0.93, respectively, with an MAE of 0.11 ± 0.12 weeks. The average time required per fetal case was 5 s for 3DCRL-Net, compared to 2 min for 3D-RAD and 35 s for 2D-RAD (P < 0.001). The 3DCRL-Net model provides a rapid, accurate, and fully automated solution for CRL measurement in 3D ultrasound, achieving expert-level performance and significantly improving the efficiency and reliability of first-trimester fetal growth assessment.
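
A minimal sketch of the measurement step downstream of landmark detection: CRL as the 3D distance between predicted crown and rump landmarks, scaled by voxel spacing, with gestational age from the Robinson formula. Whether 3DCRL-Net uses this GA formula is an assumption; the landmark-to-length step follows from the description above.

```python
import numpy as np

def crl_mm(crown_xyz: np.ndarray, rump_xyz: np.ndarray,
           voxel_spacing_mm: np.ndarray) -> float:
    """Euclidean distance between two landmark coordinates (voxel indices),
    scaled to millimetres by the volume's per-axis voxel spacing."""
    delta = (crown_xyz - rump_xyz) * voxel_spacing_mm
    return float(np.linalg.norm(delta))

def gestational_age_days(crl: float) -> float:
    # Robinson formula: GA (days) ~ 8.052 * sqrt(CRL in mm) + 23.73
    return 8.052 * np.sqrt(crl) + 23.73

crl = crl_mm(np.array([40, 62, 31]), np.array([88, 70, 35]),
             np.array([0.5, 0.5, 0.5]))
print(f"CRL {crl:.1f} mm -> GA {gestational_age_days(crl) / 7:.1f} weeks")
```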

Deep learning for appendicitis: development of a three-dimensional localization model on CT.

Takaishi T, Kawai T, Kokubo Y, Fujinaga T, Ojio Y, Yamamoto T, Hayashi K, Owatari Y, Ito H, Hiwatashi A

pubmed · Jul 16, 2025
To develop and evaluate a deep learning model for detecting appendicitis on abdominal CT. This retrospective single-center study included 567 CTs of appendicitis patients (330 males; age range, 20-96 years) obtained between 2011 and 2020, randomly split into training (n = 517) and validation (n = 50) sets. The validation set was supplemented with 50 control CTs performed for acute abdomen. For the test dataset, 100 appendicitis CTs and 100 control CTs were consecutively collected from a separate period after 2021. Exclusion criteria were age < 20 years, perforation, an unclear appendix, and appendix tumors. Appendicitis CTs were annotated with three-dimensional bounding boxes encompassing the inflamed appendix. CT protocols were unenhanced, with a 5-mm slice thickness and a 512 × 512 pixel matrix. The deep learning algorithm was based on the faster region-based convolutional neural network (Faster R-CNN). Two board-certified radiologists visually graded model predictions on the test dataset using a 5-point Likert scale (0: no detection, 1: false, 2: poor, 3: fair, 4: good), with scores ≥ 3 considered true positives. Inter-rater agreement was assessed using weighted kappa statistics. The effects of intra-abdominal fat, periappendiceal fat-stranding, presence of an appendicolith, and appendix diameter on the model's recall were analyzed using binary logistic regression. The model showed a precision of 0.66 (87/132), a recall of 0.87 (87/100), and a false-positive rate of 0.23 per patient (45/200). Inter-rater agreement for Likert scores of 2-4 was κ = 0.76. Logistic regression showed that only intra-abdominal fat had a significant effect (p = 0.02). We developed a model capable of detecting appendicitis on CT with a three-dimensional bounding box.
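
The detector named above is Faster R-CNN; torchvision ships a 2D implementation, sketched here with two classes (background, appendicitis). The paper's model predicts three-dimensional bounding boxes, so treat this as the 2D analogue, not the authors' code.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Pretrained backbone, with the box head swapped for a 2-class predictor
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

model.eval()
with torch.no_grad():
    # One unenhanced 512 x 512 slice, replicated to 3 channels for the backbone
    slice_tensor = torch.rand(3, 512, 512)
    prediction = model([slice_tensor])[0]      # dict: boxes, labels, scores
    keep = prediction["scores"] > 0.5
    print(prediction["boxes"][keep])
```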

Fetal-Net: enhancing Maternal-Fetal ultrasound interpretation through Multi-Scale convolutional neural networks and Transformers.

Islam U, Ali YA, Al-Razgan M, Ullah H, Almaiah MA, Tariq Z, Wazir KM

pubmed · Jul 15, 2025
Ultrasound imaging plays an important role in evaluating fetal growth and maternal-fetal health, but its interpretation is challenging owing to the complicated anatomy of the fetus and fluctuations in image quality. Although deep learning approaches, including convolutional neural networks (CNNs), have been promising, they have largely been limited to single tasks, such as segmentation or detection of fetal structures, and thus lack an integrated solution that accounts for the intricate interplay between anatomical structures. To overcome these limitations, Fetal-Net, a new deep learning architecture that integrates multi-scale CNNs and transformer layers, was developed. The model was trained on a large, expertly annotated set of more than 12,000 ultrasound images across different anatomical planes for effective identification of fetal structures and anomaly detection. Fetal-Net achieved excellent performance in anomaly detection, with high precision (96.5%), accuracy (97.5%), and recall (97.8%), and showed robustness across various imaging settings, making it a potent means of augmenting prenatal care through refined ultrasound image interpretation.
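
A generic sketch of the architectural pattern the abstract names, multi-scale convolutional features pooled into tokens and refined by a Transformer encoder, in PyTorch. This illustrates the pattern only; the abstract does not specify Fetal-Net's actual layer configuration.

```python
import torch
import torch.nn as nn

class MultiScaleCNNTransformer(nn.Module):
    def __init__(self, num_classes: int = 2, dim: int = 128):
        super().__init__()
        # Three parallel branches with different receptive fields (3/5/7)
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(1, dim, k, padding=k // 2), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(8))
            for k in (3, 5, 7)
        ])
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                       # x: (B, 1, H, W) ultrasound
        tokens = torch.cat(
            [b(x).flatten(2).transpose(1, 2) for b in self.branches], dim=1
        )                                       # (B, 3 * 64, dim) scale tokens
        encoded = self.encoder(tokens)
        return self.head(encoded.mean(dim=1))   # pooled classification logits

logits = MultiScaleCNNTransformer()(torch.rand(2, 1, 256, 256))
print(logits.shape)  # torch.Size([2, 2])
```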

Motion artifacts and image quality in stroke MRI: associated factors and impact on AI and human diagnostic accuracy.

Krag CH, Müller FC, Gandrup KL, Andersen MB, Møller JM, Liu ML, Rud A, Krabbe S, Al-Farra L, Nielsen M, Kruuse C, Boesen MP

pubmed · Jul 15, 2025
To assess the prevalence of motion artifacts and the factors associated with them in a cohort of suspected stroke patients, and to determine their impact on diagnostic accuracy for both AI and radiologists. This retrospective cross-sectional study included brain MRI scans of consecutive adult suspected stroke patients from a non-comprehensive Danish stroke center between January and April 2020. An expert neuroradiologist identified acute ischemic, hemorrhagic, and space-occupying lesions as the reference standard. Two blinded radiology residents rated MRI image quality and motion artifacts. The diagnostic accuracy of a CE-marked deep learning tool was compared with that of the radiology reports. Multivariate analysis examined associations between patient characteristics and motion artifacts. 775 patients (68 years ± 16, 420 female) were included. Acute ischemic, hemorrhagic, and space-occupying lesions were found in 216 (27.9%), 12 (1.5%), and 20 (2.6%), respectively. Motion artifacts were present in 57 (7.4%). Increasing age (OR per decade, 1.60; 95% CI: 1.26, 2.09; p < 0.001) and limb motor symptoms (OR, 2.36; 95% CI: 1.32, 4.20; p = 0.003) were independently associated with motion artifacts in multivariate analysis. Motion artifacts significantly reduced the accuracy of detecting hemorrhage, and the reduction was greater for the AI tool (from 88% to 67%; p < 0.001) than for the radiology reports (from 100% to 93%; p < 0.001). Detection of ischemic and space-occupying lesions was not significantly affected. Motion artifacts are common in suspected stroke patients, particularly in the elderly and in patients with motor symptoms, and reduce the accuracy of hemorrhage detection by both AI and radiologists.
Question: Motion artifacts reduce the quality of MRI scans, but it is unclear which factors are associated with them and how they affect diagnostic accuracy.
Findings: Motion artifacts occurred in 7% of suspected stroke MRI scans, were associated with higher patient age and motor symptoms, and lowered hemorrhage detection by AI and radiologists.
Clinical relevance: Motion artifacts in stroke brain MRIs significantly reduce the diagnostic accuracy of human and AI detection of intracranial hemorrhages. Elderly patients and those with motor symptoms may benefit from a greater focus on motion artifact prevention and reduction.
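
A minimal sketch of the multivariate analysis reported above: logistic regression of motion-artifact presence on patient factors, with odds ratios and 95% CIs from exponentiated coefficients. The data here are synthetic, generated to match the reported direction and size of effect; statsmodels is assumed.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age_decades": rng.normal(6.8, 1.6, 775),          # age / 10
    "limb_motor_symptoms": rng.integers(0, 2, 775),
})
# Synthetic outcome: coefficients 0.47 and 0.86 correspond to the reported
# ORs of 1.60 per decade and 2.36 for motor symptoms
lin_pred = -6.2 + 0.47 * df["age_decades"] + 0.86 * df["limb_motor_symptoms"]
df["motion_artifact"] = (rng.random(775) < 1 / (1 + np.exp(-lin_pred))).astype(int)

X = sm.add_constant(df[["age_decades", "limb_motor_symptoms"]])
fit = sm.Logit(df["motion_artifact"], X).fit(disp=0)
odds_ratios = pd.DataFrame({
    "OR": np.exp(fit.params),
    "CI_low": np.exp(fit.conf_int()[0]),
    "CI_high": np.exp(fit.conf_int()[1]),
})
print(odds_ratios)   # OR per decade of age and for motor symptoms
```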