
Federated learning in low-resource settings: A chest imaging study in Africa -- Challenges and lessons learned

Jorge Fabila, Lidia Garrucho, Víctor M. Campello, Carlos Martín-Isla, Karim Lekadir

arXiv preprint · May 20, 2025
This study explores the use of Federated Learning (FL) for tuberculosis (TB) diagnosis using chest X-rays in low-resource settings across Africa. FL allows hospitals to collaboratively train AI models without sharing raw patient data, addressing privacy concerns and data scarcity that hinder traditional centralized models. The research involved hospitals and research centers in eight African countries. Most sites used local datasets, while Ghana and The Gambia used public ones. The study compared locally trained models with a federated model built across all institutions to evaluate FL's real-world feasibility. Despite its promise, implementing FL in sub-Saharan Africa faces challenges such as poor infrastructure, unreliable internet, limited digital literacy, and weak AI regulations. Some institutions were also reluctant to share model updates due to data control concerns. In conclusion, FL shows strong potential for enabling AI-driven healthcare in underserved regions, but broader adoption will require improvements in infrastructure, education, and regulatory support.
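
As a hedged illustration of the federated averaging idea described above, the toy sketch below trains a linear model across several simulated sites while a server averages the weights, so raw data never leaves a site. The model, optimizer, and data are invented for illustration and are not the study's implementation.

```python
import numpy as np

def local_update(weights, local_data, lr=0.01, steps=5):
    """Simplified local training: a few gradient steps on a squared loss."""
    X, y = local_data
    for _ in range(steps):
        grad = X.T @ (X @ weights - y) / len(y)   # MSE gradient
        weights = weights - lr * grad
    return weights

def fed_avg(global_w, site_datasets, rounds=10):
    """FedAvg: sites exchange only model weights, never raw patient data."""
    for _ in range(rounds):
        site_ws = [local_update(global_w.copy(), d) for d in site_datasets]
        sizes = np.array([len(d[1]) for d in site_datasets], dtype=float)
        global_w = np.average(site_ws, axis=0, weights=sizes)  # size-weighted mean
    return global_w

rng = np.random.default_rng(0)
sites = [(rng.normal(size=(50, 8)), rng.normal(size=50)) for _ in range(8)]  # 8 "hospitals"
w = fed_avg(np.zeros(8), sites)
print(w.round(3))
```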

RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection

Wenjun Hou, Yi Cheng, Kaishuai Xu, Heng Li, Yan Hu, Wenjie Li, Jiang Liu

arXiv preprint · May 20, 2025
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, including radiology report generation. Previous approaches have attempted to utilize multimodal LLMs for this task, enhancing their performance through the integration of domain-specific knowledge retrieval. However, these approaches often overlook the knowledge already embedded within the LLMs, leading to redundant information integration and inefficient utilization of learned representations. To address this limitation, we propose RADAR, a framework for enhancing radiology report generation with supplementary knowledge injection. RADAR improves report generation by systematically leveraging both the internal knowledge of an LLM and externally retrieved information. Specifically, it first extracts the model's acquired knowledge that aligns with expert image-based classification outputs. It then retrieves relevant supplementary knowledge to further enrich this information. Finally, by aggregating both sources, RADAR generates more accurate and informative radiology reports. Extensive experiments on MIMIC-CXR, CheXpert-Plus, and IU X-ray demonstrate that our model outperforms state-of-the-art LLMs in both language quality and clinical accuracy.
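
A rough sketch of the three-stage pipeline the abstract outlines (internal knowledge extraction, supplementary retrieval, aggregation); every helper below is a hypothetical stand-in, not the authors' code.

```python
def internal_knowledge(llm_findings, classifier_labels):
    """Keep only the LLM's own findings that agree with the image classifier."""
    return [f for f in llm_findings if f in classifier_labels]

def retrieve_supplementary(findings, knowledge_base):
    """Fetch external facts that enrich the retained findings."""
    return [knowledge_base[f] for f in findings if f in knowledge_base]

def build_report_prompt(image_desc, findings, extra_facts):
    """Aggregate both knowledge sources into one generation prompt."""
    return (f"Findings: {', '.join(findings)}.\n"
            f"Supplementary knowledge: {' '.join(extra_facts)}\n"
            f"Write a radiology report for: {image_desc}")

kb = {"cardiomegaly": "Cardiomegaly: cardiothoracic ratio > 0.5 on a PA film."}
kept = internal_knowledge(["cardiomegaly", "rib fracture"], {"cardiomegaly"})
prompt = build_report_prompt("frontal chest X-ray", kept,
                             retrieve_supplementary(kept, kb))
print(prompt)
```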

Mask of Truth: Model Sensitivity to Unexpected Regions of Medical Images.

Sourget T, Hestbek-Møller M, Jiménez-Sánchez A, Junchi Xu J, Cheplygina V

PubMed · May 20, 2025
The development of larger models for medical image analysis has led to increased performance. However, it has also affected our ability to explain and validate model decisions. Models can use non-relevant parts of images, also called spurious correlations or shortcuts, to obtain high performance on benchmark datasets but fail in real-world scenarios. In this work, we challenge the capacity of convolutional neural networks (CNNs) to classify chest X-rays and eye fundus images while masking out clinically relevant parts of the image. We show that all models trained on the PadChest dataset, irrespective of the masking strategy, are able to obtain an area under the curve (AUC) above random. Moreover, the models trained on full images obtain good performance on images without the region of interest (ROI), even superior to that obtained on images containing only the ROI. We also reveal a possible spurious correlation in the Chákṣu dataset, although the performances there are more closely aligned with what would be expected of an unbiased model. We go beyond the performance analysis by applying the explainability method SHAP and analyzing the learned embeddings. We also asked a radiology resident to interpret chest X-rays under the different masking strategies to complement our findings with clinical knowledge.
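
A toy version of the masking experiment might look like the following: zero out the region of interest and test whether a classifier still scores above chance, which would hint at shortcut learning. The data, mask box, and classifier are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
images = rng.normal(size=(200, 32, 32))        # stand-ins for chest X-rays
labels = rng.integers(0, 2, size=200)

def mask_roi(imgs, box=(8, 24, 8, 24)):
    """Zero out the clinically relevant box: (top, bottom, left, right)."""
    out = imgs.copy()
    t, b, l, r = box
    out[:, t:b, l:r] = 0.0
    return out

X = mask_roi(images).reshape(len(images), -1)
clf = LogisticRegression(max_iter=1000).fit(X[:150], labels[:150])
auc = roc_auc_score(labels[150:], clf.predict_proba(X[150:])[:, 1])
# Near 0.5 on this random data; a real dataset scoring well above chance
# here would suggest the model relies on regions outside the ROI.
print(f"AUC on ROI-masked images: {auc:.2f}")
```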

Advanced feature fusion of radiomics and deep learning for accurate detection of wrist fractures on X-ray images.

Saadh MJ, Hussain QM, Albadr RJ, Doshi H, Rekha MM, Kundlas M, Pal A, Rizaev J, Taher WM, Alwan M, Jawad MJ, Al-Nuaimi AMA, Farhood B

PubMed · May 20, 2025
The aim of this study was to develop a hybrid diagnostic framework integrating radiomic and deep features for accurate and reproducible detection and classification of wrist fractures using X-ray images. A total of 3,537 X-ray images, including 1,871 fracture and 1,666 non-fracture cases, were collected from three healthcare centers. Radiomic features were extracted using the PyRadiomics library, and deep features were derived from the bottleneck layer of an autoencoder. Both feature modalities underwent reliability assessment via Intraclass Correlation Coefficient (ICC) and cosine similarity. Feature selection methods, including ANOVA, Mutual Information (MI), Principal Component Analysis (PCA), and Recursive Feature Elimination (RFE), were applied to optimize the feature set. Classifiers such as XGBoost, CatBoost, Random Forest, and a Voting Classifier were used to evaluate diagnostic performance. The dataset was divided into training (70%) and testing (30%) sets, and metrics such as accuracy, sensitivity, and AUC-ROC were used for evaluation. The combined radiomic and deep feature approach consistently outperformed standalone methods. The Voting Classifier paired with MI achieved the highest performance, with a test accuracy of 95%, sensitivity of 94%, and AUC-ROC of 96%. The end-to-end model achieved competitive results with an accuracy of 93% and AUC-ROC of 94%. SHAP analysis and t-SNE visualizations confirmed the interpretability and robustness of the selected features. This hybrid framework demonstrates the potential for integrating radiomic and deep features to enhance diagnostic performance for wrist and forearm fractures, providing a reliable and interpretable solution suitable for clinical applications.
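
A hedged sketch of the feature-fusion pipeline described above, substituting scikit-learn's random forest and logistic regression for the paper's XGBoost/CatBoost and using random arrays in place of PyRadiomics and autoencoder outputs:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
radiomic = rng.normal(size=(400, 100))   # stand-in for PyRadiomics features
deep = rng.normal(size=(400, 64))        # stand-in for autoencoder bottleneck
y = rng.integers(0, 2, size=400)

X = np.hstack([radiomic, deep])          # feature-level fusion
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = make_pipeline(
    SelectKBest(mutual_info_classif, k=50),          # MI feature selection
    VotingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("lr", LogisticRegression(max_iter=1000))],
        voting="soft"),
)
model.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```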

Accuracy of segment anything model for classification of vascular stenosis in digital subtraction angiography.

Navasardyan V, Katz M, Goertz L, Zohranyan V, Navasardyan H, Shahzadi I, Kröger JR, Borggrefe J

PubMed · May 19, 2025
This retrospective study evaluates the diagnostic performance of an optimized comprehensive multi-stage framework based on the Segment Anything Model (SAM), which we named Dr-SAM, for detecting and grading vascular stenosis in the abdominal aorta and iliac arteries using digital subtraction angiography (DSA). A total of 100 DSA examinations were conducted on 100 patients. The infrarenal abdominal aorta (AAI), common iliac arteries (CIA), and external iliac arteries (EIA) were independently evaluated by two experienced radiologists using a standardized 5-point grading scale. Dr-SAM analyzed the same DSA images, and its assessments were compared with the average stenosis grading provided by the radiologists. Diagnostic accuracy was evaluated using Cohen's kappa, specificity, sensitivity, and Wilcoxon signed-rank tests. Interobserver agreement between radiologists, which established the reference standard, was strong (Cohen's kappa: CIA right = 0.95, CIA left = 0.94, EIA right = 0.98, EIA left = 0.98, AAI = 0.79). Dr-SAM showed high agreement with radiologist consensus for CIA (κ = 0.93 right, 0.91 left), moderate agreement for EIA (κ = 0.79 right, 0.76 left), and fair agreement for AAI (κ = 0.70). Dr-SAM demonstrated excellent specificity (up to 1.0) and robust sensitivity (0.67-0.83). Wilcoxon tests revealed no significant differences between Dr-SAM and radiologist grading (p > 0.05). Dr-SAM proved to be an accurate and efficient tool for vascular assessment, with the potential to streamline diagnostic workflows and reduce variability in stenosis grading. Its ability to deliver rapid and consistent evaluations may contribute to earlier detection of disease and the optimization of treatment strategies. Further studies are needed to confirm these findings in prospective settings and to enhance its capabilities, particularly in the detection of occlusions.
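
The agreement statistics reported here are standard; a minimal sketch with scikit-learn and SciPy, run on made-up 5-point stenosis grades, might read:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
radiologist = rng.integers(0, 5, size=100)     # consensus grades on a 0-4 scale
model = np.clip(radiologist + rng.integers(-1, 2, size=100), 0, 4)  # noisy copy

print("Cohen's kappa:", round(cohen_kappa_score(radiologist, model), 3))
# Wilcoxon signed-rank: do the paired gradings differ systematically?
stat, p = wilcoxon(radiologist, model)         # zero differences dropped by default
print("Wilcoxon p-value:", round(p, 3))
```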

Detection of carotid artery calcifications using artificial intelligence in dental radiographs: a systematic review and meta-analysis.

Arzani S, Soltani P, Karimi A, Yazdi M, Ayoub A, Khurshid Z, Galderisi D, Devlin H

PubMed · May 19, 2025
Carotid artery calcifications are important markers of cardiovascular health, often associated with atherosclerosis and a higher risk of stroke. Recent research shows that dental radiographs can help identify these calcifications, allowing for earlier detection of vascular diseases. Advances in artificial intelligence (AI) have improved the ability to detect carotid calcifications in dental images, making AI a useful screening tool. This systematic review and meta-analysis aimed to evaluate how accurately AI methods can identify carotid calcifications in dental radiographs. A systematic search was conducted in PubMed, Scopus, Embase, and Web of Science for studies on AI algorithms used to detect carotid calcifications in dental radiographs. Two independent reviewers collected data on study aims, imaging techniques, and statistical measures such as sensitivity and specificity. A random-effects meta-analysis was performed, and the risk of bias was evaluated with the QUADAS-2 tool. Nine studies were suitable for qualitative analysis, while five provided data for quantitative analysis. These studies assessed AI algorithms using cone beam computed tomography (n = 3) and panoramic radiographs (n = 6). The sensitivity of the included studies ranged from 0.67 to 0.98, and specificity varied between 0.85 and 0.99. Pooling one AI method per study yielded an overall sensitivity of 0.92 [95% CI 0.81 to 0.97] and specificity of 0.96 [95% CI 0.92 to 0.97]. The high sensitivity and specificity indicate that AI methods could be effective screening tools, enhancing the early detection of stroke and related cardiovascular risks.
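
For readers unfamiliar with random-effects pooling, a minimal DerSimonian-Laird sketch on logit-transformed sensitivities follows; the five (TP, FN) pairs are invented, not the included studies' data.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Pool per-study effects under a random-effects model."""
    w = 1.0 / variances
    fixed = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - fixed) ** 2)     # Cochran's Q heterogeneity
    df = len(effects) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)              # between-study variance
    w_star = 1.0 / (variances + tau2)
    pooled = np.sum(w_star * effects) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, se

tp = np.array([40, 55, 30, 70, 25])            # hypothetical true positives
fn = np.array([5, 3, 10, 4, 6])                # hypothetical false negatives
logit_sens = np.log((tp + 0.5) / (fn + 0.5))   # continuity-corrected logit
var = 1.0 / (tp + 0.5) + 1.0 / (fn + 0.5)

pooled, se = dersimonian_laird(logit_sens, var)
sens = 1.0 / (1.0 + np.exp(-pooled))           # back-transform to sensitivity
lo = 1.0 / (1.0 + np.exp(-(pooled - 1.96 * se)))
hi = 1.0 / (1.0 + np.exp(-(pooled + 1.96 * se)))
print(f"Pooled sensitivity: {sens:.2f} [95% CI {lo:.2f} to {hi:.2f}]")
```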

CorBenchX: Large-Scale Chest X-Ray Error Dataset and Vision-Language Model Benchmark for Report Error Correction

Jing Zou, Qingqiu Li, Chenyu Lian, Lihao Liu, Xiaohan Yan, Shujun Wang, Jing Qin

arXiv preprint · May 17, 2025
AI-driven models have shown great promise in detecting errors in radiology reports, yet the field lacks a unified benchmark for rigorous evaluation of error detection and subsequent correction. To address this gap, we introduce CorBenchX, a comprehensive suite for automated error detection and correction in chest X-ray reports, designed to advance AI-assisted quality control in clinical practice. We first synthesize a large-scale dataset of 26,326 chest X-ray error reports by injecting clinically common errors via prompting DeepSeek-R1, with each corrupted report paired with its original text, error type, and a human-readable description. Leveraging this dataset, we benchmark both open- and closed-source vision-language models (e.g., InternVL, Qwen-VL, GPT-4o, o4-mini, and Claude-3.7) for error detection and correction under zero-shot prompting. Among these models, o4-mini achieves the best performance, with 50.6% detection accuracy and correction scores of BLEU 0.853, ROUGE 0.924, BERTScore 0.981, SembScore 0.865, and CheXbert F1 0.954, all still below clinical-level accuracy, highlighting the challenge of precise report correction. To advance the state of the art, we propose a multi-step reinforcement learning (MSRL) framework that optimizes a multi-objective reward combining format compliance, error-type accuracy, and BLEU similarity. We apply MSRL to QwenVL2.5-7B, the top open-source model in our benchmark, achieving improvements of 38.3% in single-error detection precision and 5.2% in single-error correction over the zero-shot baseline.
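
A hedged sketch of the kind of multi-objective reward MSRL describes (format compliance + error-type accuracy + BLEU similarity); the weights and the toy format check are assumptions, not the paper's exact design.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def reward(pred_report, ref_report, pred_error_type, true_error_type,
           w_format=0.2, w_type=0.3, w_bleu=0.5):
    """Weighted sum of three objectives; weights are illustrative."""
    fmt = 1.0 if pred_report.strip().endswith(".") else 0.0   # toy format check
    etype = 1.0 if pred_error_type == true_error_type else 0.0
    bleu = sentence_bleu([ref_report.split()], pred_report.split(),
                         smoothing_function=SmoothingFunction().method1)
    return w_format * fmt + w_type * etype + w_bleu * bleu

r = reward("Heart size is normal.", "The heart size is normal.",
           "omission", "omission")
print(f"reward = {r:.3f}")
```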

Prediction of cervical spondylotic myelopathy from a plain radiograph using deep learning with convolutional neural networks.

Tachi H, Kokabu T, Suzuki H, Ishikawa Y, Yabu A, Yanagihashi Y, Hyakumachi T, Shimizu T, Endo T, Ohnishi T, Ukeba D, Sudo H, Yamada K, Iwasaki N

PubMed · May 17, 2025
This study aimed to develop deep learning algorithms (DLAs) utilising convolutional neural networks (CNNs) to classify cervical spondylotic myelopathy (CSM) and cervical spondylotic radiculopathy (CSR) from plain cervical spine radiographs. Data from 300 patients (150 with CSM and 150 with CSR) were used for internal validation (IV) with a five-fold cross-validation strategy. Additionally, 100 patients (50 with CSM and 50 with CSR) were included in the external validation (EV). Two DLAs were trained on plain radiographs of C3-C6: one for the binary classification of CSM versus CSR, and one for the prediction of the spinal canal area rate derived from magnetic resonance imaging. Model performance was evaluated on external data using metrics such as area under the curve (AUC), accuracy, and likelihood ratios. For the binary classification, the AUC ranged from 0.84 to 0.96, with accuracy between 78% and 95% during IV. In the EV, the AUC and accuracy were 0.96 and 90%, respectively. For the spinal canal area rate, correlation coefficients during five-fold cross-validation ranged from 0.57 to 0.64, with a mean correlation of 0.61 observed in the EV. DLAs developed with CNNs demonstrated promising accuracy for classifying CSM and CSR from plain radiographs. These algorithms have the potential to assist non-specialists in identifying patients who require further evaluation or referral to spine specialists, thereby reducing delays in the diagnosis and treatment of CSM.
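
A minimal PyTorch sketch of a CNN classifier for a binary radiograph task such as CSM versus CSR; the architecture and input size are illustrative only, not the study's networks.

```python
import torch
import torch.nn as nn

class RadiographCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 56 * 56, 2)   # 2 classes: CSM / CSR
        )

    def forward(self, x):          # x: (batch, 1, 224, 224) grayscale films
        return self.head(self.features(x))

model = RadiographCNN()
logits = model(torch.randn(4, 1, 224, 224))
print(logits.shape)                # torch.Size([4, 2])
```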

Artificial intelligence-guided distal radius fracture detection on plain radiographs in comparison with human raters.

Ramadanov N, John P, Hable R, Schreyer AG, Shabo S, Prill R, Salzmann M

PubMed · May 16, 2025
The aim of this study was to compare the performance of artificial intelligence (AI) in detecting distal radius fractures (DRFs) on plain radiographs with the performance of human raters. We retrospectively analysed all wrist radiographs taken in our hospital since the introduction of AI-guided fracture detection, from 11 September 2023 to 10 September 2024. The ground truth was defined by the radiological report of a board-certified radiologist based solely on conventional radiographs. The following parameters were calculated: true positives (TP), true negatives (TN), false positives (FP), false negatives (FN), accuracy, Cohen's kappa coefficient, F1 score, sensitivity, specificity, and the Youden index (J statistic). In total, 1,145 plain radiographs of the wrist were taken between 11 September 2023 and 10 September 2024. The mean age of the included patients was 46.6 years (± 27.3), ranging from 2 to 99 years, and 59.0% were female. According to the ground truth, 225 of the 556 anteroposterior (AP) radiographs (40.5%) showed a DRF, and 240 of the 589 lateral view radiographs (40.7%) showed a DRF. On AP radiographs, the AI system achieved an accuracy of 95.90%, Cohen's kappa of 0.913, F1 score of 0.947, sensitivity of 92.02%, specificity of 98.45%, and Youden index of 90.47%. The orthopedic surgeon achieved a sensitivity of 91.5%, specificity of 97.8%, an overall accuracy of 95.1%, F1 score of 0.943, and Cohen's kappa of 0.901. These results were comparable to those of the AI model. AI-guided detection of DRFs demonstrated diagnostic performance nearly identical to that of an experienced orthopedic surgeon across all key metrics. The marginal differences observed in sensitivity and specificity suggest that AI can reliably support clinical fracture assessment based solely on conventional radiographs.
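
All of the reported metrics follow from the confusion matrix; the sketch below derives them, with hypothetical AP counts chosen to approximately reproduce the figures above.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    n = tp + tn + fp + fn
    sens = tp / (tp + fn)                      # sensitivity (recall)
    spec = tn / (tn + fp)                      # specificity
    acc = (tp + tn) / n
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    youden = sens + spec - 1                   # Youden's J statistic
    p_chance = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (acc - p_chance) / (1 - p_chance)  # Cohen's kappa
    return dict(sensitivity=sens, specificity=spec, accuracy=acc,
                f1=f1, youden=youden, kappa=kappa)

# Hypothetical AP counts (225 fractures, 331 non-fractures) consistent
# with the reported sensitivity and specificity
print(diagnostic_metrics(tp=207, tn=326, fp=5, fn=18))
```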

Artificial intelligence in dentistry: awareness among dentists and computer scientists.

Costa ED, Vieira MA, Ambrosano GMB, Gaêta-Araujo H, Carneiro JA, Zancan BAG, Scaranti A, Macedo AA, Tirapelli C

PubMed · May 16, 2025
For clinical application of artificial intelligence (AI) in dentistry, collaboration with computer scientists is necessary. This study aims to evaluate the knowledge of dentists and computer scientists regarding the utilization of AI in dentistry, especially in dentomaxillofacial radiology. A total of 610 participants (374 dentists and 236 computer scientists) took part in a survey about AI in dentistry and radiographic imaging. Response options used a Likert agreement/disagreement scale. Descriptive analyses of agreement scores were performed using quartiles (minimum value, first quartile, median, third quartile, and maximum value). The non-parametric Mann-Whitney test was used to compare response scores between the two groups (α = 5%). Academic dentists had higher agreement scores for the questions "knowing the applications of AI in dentistry", "dentists taking the lead in AI research", "AI education should be part of teaching", "AI can increase the price of dental services", "AI can lead to errors in radiographic diagnosis", "AI can negatively interfere with the choice of Radiology specialty", "AI can cause a reduction in the employment of radiologists", and "patient data can be hacked using AI" (p < 0.05). Computer scientists had higher agreement scores for the questions "having knowledge in AI" and "AI's potential to speed up and improve radiographic diagnosis". Although dentists acknowledge the potential benefits of AI in dentistry, they remain skeptical about its use and consider it important to integrate the topic of AI into the dental education curriculum. Computer scientists, on the other hand, report technical expertise in AI and recognize its potential in dentomaxillofacial radiology.
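
A small sketch of the described analysis (quartile summaries plus a Mann-Whitney U test on Likert scores); the responses here are simulated, not the survey data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(7)
dentists = rng.integers(1, 6, size=374)              # simulated 1-5 Likert scores
computer_scientists = rng.integers(2, 6, size=236)

for name, scores in [("dentists", dentists),
                     ("computer scientists", computer_scientists)]:
    q = np.percentile(scores, [0, 25, 50, 75, 100])  # min, Q1, median, Q3, max
    print(name, q)

u, p = mannwhitneyu(dentists, computer_scientists)   # two-sided by default
print(f"Mann-Whitney U = {u:.0f}, p = {p:.4g}")
```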