
Artificial intelligence-guided distal radius fracture detection on plain radiographs in comparison with human raters.

Ramadanov N, John P, Hable R, Schreyer AG, Shabo S, Prill R, Salzmann M

PubMed · May 16 2025
The aim of this study was to compare the performance of artificial intelligence (AI) in detecting distal radius fractures (DRFs) on plain radiographs with the performance of human raters. We retrospectively analysed all wrist radiographs taken in our hospital since the introduction of AI-guided fracture detection, from 11 September 2023 to 10 September 2024. The ground truth was defined by the radiological report of a board-certified radiologist based solely on conventional radiographs. The following parameters were calculated: true positives (TP), true negatives (TN), false positives (FP), false negatives (FN), accuracy (%), Cohen's kappa coefficient, F1 score, sensitivity (%), specificity (%), and Youden Index (J statistic). In total, 1145 plain radiographs of the wrist were taken during this period. The mean age of the included patients was 46.6 years (± 27.3), ranging from 2 to 99 years, and 59.0% were female. According to the ground truth, of the 556 anteroposterior (AP) radiographs, 225 (40.5%) showed a DRF, and of the 589 lateral view radiographs, 240 (40.7%) showed a DRF. The AI system achieved the following results on AP radiographs: accuracy 95.90%; Cohen's kappa 0.913; F1 score 0.947; sensitivity 92.02%; specificity 98.45%; Youden Index 90.47. The orthopedic surgeon achieved a sensitivity of 91.5%, specificity of 97.8%, overall accuracy of 95.1%, F1 score of 0.943, and Cohen's kappa of 0.901, results comparable to those of the AI model. AI-guided detection of DRFs demonstrated diagnostic performance nearly identical to that of an experienced orthopedic surgeon across all key metrics. The marginal differences observed in sensitivity and specificity suggest that AI can reliably support clinical fracture assessment based solely on conventional radiographs.
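All of the metrics listed in this abstract derive from the four confusion-matrix counts. A minimal stdlib-only sketch, using hypothetical counts (the paper reports only the derived percentages, not the underlying TP/TN/FP/FN values):

```python
def fracture_detection_metrics(tp, tn, fp, fn):
    """Compute the abstract's metrics from raw confusion-matrix counts."""
    total = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / total
    f1 = 2 * tp / (2 * tp + fp + fn)
    youden = sensitivity + specificity - 1  # Youden Index (J statistic)
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_obs = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (p_obs - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity, "specificity": specificity,
        "accuracy": accuracy, "f1": f1, "youden": youden, "kappa": kappa,
    }

# Hypothetical example: 90 TP, 10 FN, 95 TN, 5 FP
m = fracture_detection_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these illustrative counts, sensitivity is 0.90, specificity 0.95, and the Youden Index 0.85.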

Artificial intelligence algorithm improves radiologists' bone age assessment accuracy.

Chang TY, Chou TY, Jen IA, Yuh YS

PubMed · May 15 2025
Artificial intelligence (AI) algorithms can provide rapid and precise radiographic bone age (BA) assessment. This study assessed the effects of an AI algorithm on the BA assessment performance of radiologists and evaluated how automation bias could affect radiologists. In this prospective randomized crossover study, six radiologists with varying levels of experience (senior, mid-level, and junior) assessed cases from a test set of 200 standard BA radiographs. The test set was equally divided into two subsets: datasets A and B. Each radiologist assessed BA independently without AI assistance (A- B-) and with AI assistance (A+ B+). We used the mean of assessments made by two experts as the ground truth for accuracy assessment; subsequently, we calculated the mean absolute difference (MAD) between the radiologists' BA predictions and ground-truth BA and evaluated the proportion of estimates for which the MAD exceeded one year. Additionally, we compared the radiologists' performance under conditions of early AI assistance with their performance under conditions of delayed AI assistance; the radiologists were allowed to reject AI interpretations. The overall accuracy of senior, mid-level, and junior radiologists improved significantly with AI assistance compared with no AI assistance (MAD: 0.74 vs. 0.46 years, p < 0.001; proportion of assessments for which MAD exceeded 1 year: 24.0% vs. 8.4%, p < 0.001). The proportion of BA predictions improved by AI assistance (16.8%) was significantly higher than the proportion made less accurate by it (2.3%; p < 0.001). No consistent timing effect was observed between conditions of early and delayed AI assistance. Most disagreements between radiologists and AI occurred over images for patients aged ≤ 8 years. Senior radiologists had more disagreements than other radiologists. The AI algorithm improved the BA assessment accuracy of radiologists with varying experience levels. Automation bias was more likely to affect less experienced radiologists.
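The study's two accuracy measures, MAD and the proportion of estimates off by more than one year, can be sketched in a few lines of stdlib Python (the values below are illustrative, not the study's data):

```python
def bone_age_accuracy(predicted, ground_truth):
    """Mean absolute difference (years) and fraction of errors > 1 year."""
    errors = [abs(p - g) for p, g in zip(predicted, ground_truth)]
    mad = sum(errors) / len(errors)
    over_one_year = sum(e > 1.0 for e in errors) / len(errors)
    return mad, over_one_year

# Illustrative predictions vs. expert ground truth (years)
mad, frac = bone_age_accuracy([5.0, 6.5, 10.2], [5.5, 5.0, 10.0])
```

Here the absolute errors are 0.5, 1.5, and 0.2 years, giving a MAD of about 0.73 years with one of three estimates exceeding a year.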

CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs

Raman Dutt, Pedro Sanchez, Yongchen Yao, Steven McDonagh, Sotirios A. Tsaftaris, Timothy Hospedales

arXiv preprint · May 15 2025
We introduce CheXGenBench, a rigorous and multifaceted evaluation framework for synthetic chest radiograph generation that simultaneously assesses fidelity, privacy risks, and clinical utility across state-of-the-art text-to-image generative models. Despite rapid advancements in generative AI for real-world imagery, medical domain evaluations have been hindered by methodological inconsistencies, outdated architectural comparisons, and disconnected assessment criteria that rarely address the practical clinical value of synthetic samples. CheXGenBench overcomes these limitations through standardised data partitioning and a unified evaluation protocol comprising over 20 quantitative metrics that systematically analyse generation quality, potential privacy vulnerabilities, and downstream clinical applicability across 11 leading text-to-image architectures. Our results reveal critical inefficiencies in the existing evaluation protocols, particularly in assessing generative fidelity, leading to inconsistent and uninformative comparisons. Our framework establishes a standardised benchmark for the medical AI community, enabling objective and reproducible comparisons while facilitating seamless integration of both existing and future generative models. Additionally, we release a high-quality, synthetic dataset, SynthCheX-75K, comprising 75K radiographs generated by the top-performing model (Sana 0.6B) in our benchmark to support further research in this critical domain. Through CheXGenBench, we establish a new state-of-the-art and release our framework, models, and SynthCheX-75K dataset at https://raman1121.github.io/CheXGenBench/

Energy-Efficient AI for Medical Diagnostics: Performance and Sustainability Analysis of ResNet and MobileNet.

Rehman ZU, Hassan U, Islam SU, Gallos P, Boudjadar J

PubMed · May 15 2025
Artificial intelligence (AI) has transformed medical diagnostics by enhancing the accuracy of disease detection, particularly through deep learning models that analyze medical imaging data. However, the energy demands of training such models, including ResNet and MobileNet, are substantial and often overlooked, as researchers mainly focus on improving model accuracy. This study compares the energy use of these two models for classifying thoracic diseases using the well-known CheXpert dataset. We calculate power and energy consumption during training using the EnergyEfficientAI library. Results demonstrate that MobileNet outperforms ResNet by consuming less power and completing training faster, resulting in lower overall energy costs. This study highlights the importance of prioritizing energy efficiency in AI model development, promoting sustainable, eco-friendly approaches to advancing medical diagnosis.
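The core accounting behind such comparisons is simple: energy is average power draw multiplied by training time. A minimal sketch of that calculation (not the EnergyEfficientAI API; the wattages and durations below are hypothetical):

```python
def training_energy_kwh(avg_power_watts, train_seconds):
    """Energy in kWh = watts * seconds / 3.6e6 (1 kWh = 3.6 MJ)."""
    return avg_power_watts * train_seconds / 3.6e6

# Hypothetical comparison: a lighter model drawing less power for less time
resnet_kwh = training_energy_kwh(250.0, 7200)     # assumed 250 W for 2 h
mobilenet_kwh = training_energy_kwh(180.0, 4500)  # assumed 180 W for 75 min
```

Under these assumed figures the lighter model uses 0.225 kWh versus 0.5 kWh, illustrating how lower draw and shorter runs compound into the energy savings the study reports.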

Explainability Through Human-Centric Design for XAI in Lung Cancer Detection

Amy Rafferty, Rishi Ramaesh, Ajitha Rajan

arXiv preprint · May 14 2025
Deep learning models have shown promise in lung pathology detection from chest X-rays, but widespread clinical adoption remains limited due to opaque model decision-making. In prior work, we introduced ClinicXAI, a human-centric, expert-guided concept bottleneck model (CBM) designed for interpretable lung cancer diagnosis. We now extend that approach and present XpertXAI, a generalizable expert-driven model that preserves human-interpretable clinical concepts while scaling to detect multiple lung pathologies. Using a high-performing InceptionV3-based classifier and a public dataset of chest X-rays with radiology reports, we compare XpertXAI against leading post-hoc explainability methods and an unsupervised CBM, XCBs. We assess explanations through comparison with expert radiologist annotations and medical ground truth. Although XpertXAI is trained for multiple pathologies, our expert validation focuses on lung cancer. We find that existing techniques frequently fail to produce clinically meaningful explanations, omitting key diagnostic features and disagreeing with radiologist judgments. XpertXAI not only outperforms these baselines in predictive accuracy but also delivers concept-level explanations that better align with expert reasoning. While our focus remains on explainability in lung cancer detection, this work illustrates how human-centric model design can be effectively extended to broader diagnostic contexts - offering a scalable path toward clinically meaningful explainable AI in medical diagnostics.

Total radius BMD correlates with the hip and lumbar spine BMD among post-menopausal patients with fragility wrist fracture in a machine learning model.

Ruotsalainen T, Panfilov E, Thevenot J, Tiulpin A, Saarakkala S, Niinimäki J, Lehenkari P, Valkealahti M

PubMed · May 14 2025
Osteoporosis screening should be systematic in the group of over-50-year-old females with a radius fracture. We tested a custom phantom combined with a machine learning model and studied osteoporosis-related variables. This machine learning model for screening osteoporosis using plain radiographs requires further investigation in larger cohorts to assess its potential as a replacement for DXA measurements in settings where DXA is not available. The main purpose of this study was to improve osteoporosis screening, especially in post-menopausal patients with fragility wrist fractures. The secondary objective was to increase understanding of the connection between osteoporosis and aging, as well as other risk factors. We collected data on 83 females > 50 years old with a distal radius fracture treated at Oulu University Hospital in 2019-2020. The data included basic patient information, the WHO FRAX tool, blood tests, X-ray imaging of the fractured wrist, and DXA scanning of the non-fractured forearm, both hips, and the lumbar spine. Machine learning was used in combination with a custom phantom. Eighty-five percent of the study population had osteopenia or osteoporosis. Only 28.4% of patients had increased bone resorption activity as measured by ICTP values. Total radius BMD correlated with other osteoporosis-related variables (age r = -0.494, BMI r = 0.273, FRAX osteoporotic fracture risk r = -0.419, FRAX hip fracture risk r = -0.433, hip BMD r = 0.435, and lumbar spine BMD r = 0.645), but ultra-distal (UD) radius BMD did not. Our custom phantom combined with a machine learning model showed potential for screening osteoporosis, with class-wise accuracies of 76% and 75% for "osteoporotic" versus "osteopenic & normal bone", respectively. We suggest osteoporosis screening for all females over 50 years old with wrist fractures. We found that total radius BMD correlates with central BMD. Due to the limited sample size in the phantom and machine learning parts of the study, further research is needed to develop a clinically useful tool for screening osteoporosis.
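The r values reported above are Pearson correlation coefficients. A stdlib-only sketch of the computation, with illustrative (not study) data:

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance over the product of std deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Perfectly linear toy data gives r = 1.0; an inverse relationship
# (as with total radius BMD vs. age above) gives a negative r.
r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_neg = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

In practice one would also report a p-value for each coefficient (e.g. via `scipy.stats.pearsonr`), since with n = 83 the weaker correlations may not be significant.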

Synthetic Data-Enhanced Classification of Prevalent Osteoporotic Fractures Using Dual-Energy X-Ray Absorptiometry-Based Geometric and Material Parameters.

Quagliato L, Seo J, Hong J, Lee T, Chung YS

PubMed · May 14 2025
Bone fracture risk assessment for osteoporotic patients is essential for implementing early countermeasures and preventing discomfort and hospitalization. Current methodologies, such as the Fracture Risk Assessment Tool (FRAX), provide a risk assessment over a 5- to 10-year period rather than evaluating the bone's current health status. The database was collected by Ajou University Medical Center from 2017 to 2021. It included 9,260 patients, aged 55 to 99, comprising 242 femur fracture (FX) cases and 9,018 non-fracture (NFX) cases. To model the association of the bone's current health status with prevalent FXs, three prediction algorithms (extreme gradient boosting (XGB), support vector machine, and multilayer perceptron) were trained using two-dimensional dual-energy X-ray absorptiometry (2D-DXA) analysis results and subsequently benchmarked. The XGB classifier, which proved most effective, was then further refined using synthetic data generated by the adaptive synthetic oversampler to balance the FX and NFX classes and enhance boundary sharpness for better classification accuracy. The XGB model trained on raw data demonstrated good prediction capabilities, with an area under the curve (AUC) of 0.78 and an F1 score of 0.71 on test cases. The inclusion of synthetic data improved classification accuracy in terms of both specificity and sensitivity, resulting in an AUC of 0.99 and an F1 score of 0.98. The proposed methodology demonstrates that current bone health can be assessed through post-processed results from 2D-DXA analysis. Moreover, it was also shown that synthetic data can help stabilize uneven databases by balancing majority and minority classes, thereby significantly improving classification performance.
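The AUC reported here is equivalent to the Mann-Whitney statistic: the probability that a randomly chosen FX case receives a higher classifier score than a randomly chosen NFX case (ties counted as half). A stdlib-only sketch with toy scores, not the study's classifier outputs:

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count half
    return wins / (len(pos_scores) * len(neg_scores))

perfect = auc([0.9, 0.8], [0.1, 0.2])   # fully separated scores
partial = auc([0.3, 0.9], [0.5, 0.1])   # one misordered pair
```

Fully separated scores give AUC 1.0; the second example misranks one of four pairs, giving 0.75. This pairwise view also makes clear why AUC is insensitive to the severe FX/NFX class imbalance the oversampling step addresses.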

Evaluation of an artificial intelligence noise reduction tool for conventional X-ray imaging - a visual grading study of pediatric chest examinations at different radiation dose levels using anthropomorphic phantoms.

Hultenmo M, Pernbro J, Ahlin J, Bonnier M, Båth M

PubMed · May 13 2025
Noise reduction tools developed with artificial intelligence (AI) may be implemented to improve image quality and reduce radiation dose, which is of special interest in the more radiosensitive pediatric population. The aim of the present study was to examine the effect of the AI-based intelligent noise reduction (INR) on image quality at different dose levels in pediatric chest radiography. Anteroposterior and lateral images of two anthropomorphic phantoms were acquired with both standard noise reduction and INR at different dose levels. In total, 300 anteroposterior and 420 lateral images were included. Image quality was evaluated by three experienced pediatric radiologists. Gradings were analyzed with visual grading characteristics (VGC), resulting in area under the VGC curve (AUC_VGC) values and associated confidence intervals (CI). Image quality of different anatomical structures and overall clinical image quality were statistically significantly better in the anteroposterior INR images than in the corresponding standard noise reduced images at each dose level. Compared with reference anteroposterior images at a dose level of 100% with standard noise reduction, the image quality of the anteroposterior INR images was graded as significantly better at dose levels of ≥ 80%. Statistical significance was also achieved at lower dose levels for some structures. The assessments of the lateral images showed similar trends but with fewer significant results. The results of the present study indicate that the AI-based INR may potentially be used to improve image quality at a specific dose level or to reduce dose and maintain the image quality in pediatric chest radiography.

A deep learning sex-specific body composition ageing biomarker using dual-energy X-ray absorptiometry scan.

Lian J, Cai P, Huang F, Huang J, Vardhanabhuti V

PubMed · May 13 2025
Chronic diseases are closely linked to alterations in body composition, yet there is a need for reliable biomarkers to assess disease risk and progression. This study aimed to develop and validate a biological age indicator based on body composition derived from dual-energy X-ray absorptiometry (DXA) scans, offering a novel approach to evaluating health status and predicting disease outcomes. A deep learning model was trained on a reference population from the UK Biobank to estimate body composition biological age (BCBA). The model's performance was assessed across various groups, including individuals with typical and atypical body composition, those with pre-existing diseases, and those who developed diseases after DXA imaging. Key metrics such as c-index were employed to examine BCBA's diagnostic and prognostic potential for type 2 diabetes, major adverse cardiovascular events (MACE), atherosclerotic cardiovascular disease (ASCVD), and hypertension. Here we show that BCBA strongly correlates with chronic disease diagnoses and risk prediction. BCBA demonstrated significant associations with type 2 diabetes (odds ratio 1.08 for females and 1.04 for males, p < 0.0005), MACE (odds ratio 1.10 for females and 1.11 for males, p < 0.0005), ASCVD (odds ratio 1.07 for females and 1.10 for males, p < 0.0005), and hypertension (odds ratio 1.06 for females and 1.04 for males, p < 0.0005). It outperformed standard cardiovascular risk profiles in predicting MACE and ASCVD. BCBA is a promising biomarker for assessing chronic disease risk and progression, with potential to improve clinical decision-making. Its integration into routine health assessments could aid early disease detection and personalised interventions.
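The odds ratios reported per unit of BCBA come from regression models; as a simpler illustration of the same quantity, the 2×2-table odds ratio with a Wald 95% CI can be computed directly. A stdlib-only sketch using hypothetical counts, not the UK Biobank data:

```python
import math

def odds_ratio_ci(a, b, c, d):
    """OR and Wald 95% CI from a 2x2 table:
    a/b = exposed with/without outcome, c/d = unexposed with/without."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - 1.96 * se_log_or)
    hi = math.exp(math.log(or_) + 1.96 * se_log_or)
    return or_, lo, hi

# Hypothetical table: 10/5 cases vs. 2/8 in the comparison group
or_, lo, hi = odds_ratio_ci(10, 5, 2, 8)
```

An OR with a CI excluding 1.0 indicates a significant association, which is the form in which the abstract's p < 0.0005 results would typically be reported alongside the point estimates.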

A Deep Learning-Driven Inhalation Injury Grading Assistant Using Bronchoscopy Images

Yifan Li, Alan W Pang, Jo Woon Chong

arXiv preprint · May 13 2025
Inhalation injuries present a challenge in clinical diagnosis and grading because conventional grading methods, such as the Abbreviated Injury Score (AIS), are subjective and lack robust correlation with clinical parameters like mechanical ventilation duration and patient mortality. This study introduces a novel deep learning-based diagnosis assistant tool for grading inhalation injuries using bronchoscopy images, intended to overcome subjective variability and enhance consistency in severity assessment. Our approach leverages data augmentation techniques, including graphic transformations, Contrastive Unpaired Translation (CUT), and CycleGAN, to address the scarcity of medical imaging data. We evaluate the classification performance of two deep learning models, GoogLeNet and Vision Transformer (ViT), across a dataset significantly expanded through these augmentation methods. The results demonstrate that GoogLeNet combined with CUT is the most effective configuration for grading inhalation injuries from bronchoscopy images, achieving a classification accuracy of 97.8%. Histogram and frequency analyses reveal variations introduced by CUT, with distribution changes in the histograms and in the texture details of the frequency spectrum. PCA visualizations underscore that CUT substantially enhances class separability in the feature space. Moreover, Grad-CAM analyses provide insight into the decision-making process; the mean intensity of CUT heatmaps is 119.6, significantly exceeding the 98.8 of the original datasets. Our proposed tool leverages mechanical ventilation periods as a novel grading standard, providing comprehensive diagnostic support.