Latest Papers on Radiology AI. Sources: pubmed, Tags: X-Ray, Order: Best Match, Limit: 10.

Application of Artificial Intelligence in rheumatic disease classification: an example of ankylosing spondylitis severity inspection model.

Chen CW, Tsai HH, Yeh CY, Yang CK, Tsou HK, Leong PY, Wei JC

•papers•Dec 1 2025

An Artificial Intelligence (AI)-based severity inspection model for ankylosing spondylitis (AS) could help health professionals rapidly assess the severity of the disease, enhance proficiency, and reduce the demands on human resources. This paper aims to develop an AI-based severity inspection model for AS using patients' X-ray images and the modified Stoke Ankylosing Spondylitis Spinal Score (mSASSS). The numerical simulation with AI is developed following the process of data preprocessing, building and testing the model, and then the model. The training data is preprocessed by inviting three experts to check the X-ray images of 222 patients following the Gold Standard. The model is then developed in two stages: keypoint detection and mSASSS evaluation. The resulting two-stage model automatically detects spine points and evaluates mSASSS scores. Finally, the data obtained from the developed model was compared with the experts' assessments to analyse the accuracy of the model. The study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki. The spine point detection at the first stage achieved a mean error distance of 1.57 micrometres from the ground truth, and the classification network at the second stage reached a mean accuracy of 0.81. The model correctly identified 97.4% of patches belonging to mSASSS score 3, whereas patches belonging to score 0 could still be misclassified as score 1 or 2. The automatic severity inspection model for AS developed in this paper is accurate and can support health professionals in rapidly assessing the severity of AS, enhancing assessment proficiency, and reducing the demands on human resources.
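
The abstract reports the first-stage result as a mean error distance from the ground truth but does not define the metric. A minimal sketch, assuming it is the mean Euclidean distance between predicted and expert-annotated keypoints; the function name and the coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_error_distance(predicted, ground_truth):
    """Mean Euclidean distance between paired keypoints (assumed definition)."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty keypoint lists of equal length")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical keypoints in arbitrary units, for illustration only.
predicted = [(10.0, 20.0), (13.0, 24.0)]
ground_truth = [(10.0, 21.0), (10.0, 20.0)]
med = mean_error_distance(predicted, ground_truth)
print(med)
```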

X-Ray Segmentation Musculoskeletal Methodology In Silico Academic Lab

Enhancing Diagnostic Accuracy of Fresh Vertebral Compression Fractures With Deep Learning Models.

Li KY, Ye HB, Zhang YL, Huang JW, Li HL, Tian NF

•papers•Aug 15 2025

Retrospective study. The study aimed to develop and validate a deep learning model based on X-ray images to accurately diagnose fresh thoracolumbar vertebral compression fractures. In clinical practice, diagnosing fresh vertebral compression fractures often requires MRI. However, due to the scarcity of MRI resources and the high time and economic costs involved, some patients may not receive timely diagnosis and treatment. Using a deep learning model combined with X-rays for diagnostic assistance could potentially serve as an alternative to MRI. In this study, X-ray images of suspected thoracolumbar vertebral compression fractures were collected from the municipal shared database between December 2012 and February 2024. Deep learning models were constructed using the EfficientNet, MobileNet, and MnasNet frameworks, respectively. We conducted a preliminary evaluation of the deep learning models using the validation set. The diagnostic performance of the models was evaluated using metrics such as AUC value, accuracy, sensitivity, specificity, F1 score, precision, and ROC curve. Finally, the deep learning models were compared with evaluations from two spine surgeons of different experience levels on the control set. This study included a total of 3025 lateral X-ray images from 2224 patients. The data set was divided into a training set of 2388 cases, a validation set of 482 cases, and a control set of 155 cases. In the validation set, the three DL models had accuracies of 83.0%, 82.4%, and 82.2%, respectively. The AUC values were 0.861, 0.852, and 0.865, respectively. In the control set, the accuracies of the three DL models were 78.1%, 78.1%, and 80.7%, respectively, all higher than those of the spine surgeons and significantly higher than that of the junior spine surgeon. This study developed deep learning models for detecting fresh vertebral compression fractures, demonstrating high accuracy.
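
The threshold-based metrics named in the abstract (accuracy, sensitivity, specificity, precision, F1 score) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts are made up for illustration and are not the study's data, and the function name is hypothetical. The final line only checks that the reported split sizes add up to the reported image total.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (fresh fracture = positive)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, NOT taken from the paper.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Consistency check on figures quoted in the abstract.
split_total = 2388 + 482 + 155  # training + validation + control
print(split_total == 3025)
```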

X-Ray Detection Musculoskeletal Retrospective Clinical In Silico

Comparative evaluation of CAM methods for enhancing explainability in veterinary radiography.

Dusza P, Banzato T, Burti S, Bendazzoli M, Müller H, Wodzinski M

•papers•Aug 13 2025

Explainable Artificial Intelligence (XAI) encompasses a broad spectrum of methods that aim to enhance the transparency of deep learning models, with Class Activation Mapping (CAM) methods widely used for visual interpretability. However, systematic evaluations of these methods in veterinary radiography remain scarce. This study presents a comparative analysis of eleven CAM methods, including GradCAM, XGradCAM, ScoreCAM, and EigenCAM, on a dataset of 7362 canine and feline X-ray images. A ResNet18 model was chosen based on the specificity of the dataset and preliminary results in which it outperformed other models. Quantitative and qualitative evaluations were performed to determine how well each CAM method produced interpretable heatmaps relevant to clinical decision-making. Among the techniques evaluated, EigenGradCAM achieved the highest mean score, 2.571 (standard deviation, SD = 1.256), closely followed by EigenCAM at 2.519 (SD = 1.228) and GradCAM++ at 2.512 (SD = 1.277), while FullGrad and XGradCAM achieved the lowest scores of 2.000 (SD = 1.300) and 1.858 (SD = 1.198), respectively. Despite variations in saliency visualization, no single method universally improved veterinarians' diagnostic confidence. While certain CAM methods provided better visual cues for some pathologies, they generally offered limited explainability and did not substantially improve veterinarians' diagnostic confidence.
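
For readers unfamiliar with CAM methods, the sketch below shows the core Grad-CAM computation in plain Python: each feature map is weighted by the spatial mean of its gradient, the weighted maps are summed, and negative values are clipped. It is an illustration of the general technique only, not the study's pipeline; the toy activations and gradients are made up, and the final rescaling to [0, 1] is a common visualization step added here as an assumption.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heatmap from K feature maps and their gradients (H x W lists)."""
    # Channel weights: global average of each gradient map.
    weights = []
    for grad_map in gradients:
        values = [g for row in grad_map for g in row]
        weights.append(sum(values) / len(values))

    height, width = len(activations[0]), len(activations[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, act_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * act_map[i][j]

    # ReLU keeps only regions that support the target class.
    cam = [[max(0.0, v) for v in row] for row in cam]

    # Optional rescaling for display (assumption, not part of the core definition).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy 2-channel, 2x2 example with made-up values.
activations = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1, -1], [-1, -1]]]
heatmap = grad_cam(activations, gradients)
print(heatmap)
```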

X-Ray Classification Methodology In Silico Reproducibility

PPEA: Personalized positioning and exposure assistant based on multi-task shared pose estimation transformer.

Zhao J, Liu J, Yang C, Tang H, Chen Y, Zhang Y

•papers•Aug 13 2025

Hand and foot digital radiography (DR) is an indispensable tool in medical imaging, with varying diagnostic requirements necessitating different hand and foot positionings. Accurate positioning is crucial for obtaining diagnostically valuable images. Furthermore, adjusting exposure parameters such as exposure area based on patient conditions helps minimize the likelihood of image retakes. We propose a personalized positioning and exposure assistant capable of automatically recognizing hand and foot positionings and recommending appropriate exposure parameters to achieve these objectives. The assistant comprises three modules: (1) Progressive Iterative Hand-Foot Tracker (PIHFT) to iteratively locate hands or feet in RGB images, providing the foundation for accurate pose estimation; (2) Multi-Task Shared Pose Estimation Transformer (MTSPET), a Transformer-based model that encompasses hand and foot estimation branches with similar network architectures, sharing a common backbone. MTSPET outperformed MediaPipe in the hand pose estimation task and successfully transferred this capability to the foot pose estimation task; (3) Domain Expertise-embedded Positioning and Exposure Assistant (DEPEA), which combines the key-point coordinates of hands and feet with specific positioning and exposure parameter requirements, capable of checking patient positioning and inferring exposure areas and Regions of Interest (ROIs) of Digital Automatic Exposure Control (DAEC). Additionally, two datasets were collected and used to train MTSPET. A preliminary clinical trial showed strong agreement between PPEA's outputs and manual annotations, indicating the system's effectiveness in typical clinical scenarios. The contributions of this study lay the foundation for personalized, patient-specific imaging strategies, ultimately enhancing diagnostic outcomes and minimizing the risk of errors in clinical settings.

X-Ray Detection Musculoskeletal Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

The performance of large language models in dentomaxillofacial radiology: a systematic review.

Liu Z, Nalley A, Hao J, H Ai QY, Kan Yeung AW, Tanaka R, Hung KF

•papers•Aug 12 2025

This study aimed to systematically review the current performance of large language models (LLMs) in dento-maxillofacial radiology (DMFR). Five electronic databases were used to identify studies that developed, fine-tuned, or evaluated LLMs for DMFR-related tasks. Data extracted included study purpose, LLM type, images/text source, applied language, dataset characteristics, input and output, performance outcomes, evaluation methods, and reference standards. Customized assessment criteria adapted from the TRIPOD-LLM reporting guideline were used to evaluate the risk-of-bias in the included studies specifically regarding the clarity of dataset origin, the robustness of performance evaluation methods, and the validity of the reference standards. The initial search yielded 1621 titles, and nineteen studies were included. These studies investigated the use of LLMs for tasks including the production and answering of DMFR-related qualification exams and educational questions (n = 8), diagnosis and treatment recommendations (n = 7), and radiology report generation and patient communication (n = 4). LLMs demonstrated varied performance in diagnosing dental conditions, with accuracy ranging from 37-92.5% and expert ratings for differential diagnosis and treatment planning between 3.6-4.7 on a 5-point scale. For DMFR-related qualification exams and board-style questions, LLMs achieved correctness rates between 33.3-86.1%. Automated radiology report generation showed moderate performance with accuracy ranging from 70.4-81.3%. LLMs demonstrate promising potential in DMFR, particularly for diagnostic, educational, and report generation tasks. However, their current accuracy, completeness, and consistency remain variable. Further development, validation, and standardization are needed before LLMs can be reliably integrated as supportive tools in clinical workflows and educational settings.

X-Ray LLM Radiology Report Review Prototype GenAI

Exploring GPT-4o's multimodal reasoning capabilities with panoramic radiograph: the role of prompt engineering.

Xiong YT, Lian WJ, Sun YN, Liu W, Guo JX, Tang W, Liu C

•papers•Aug 12 2025

The aim of this study was to evaluate GPT-4o's multimodal reasoning ability to review panoramic radiographs (PRs) and verify their radiologic findings, while exploring the role of prompt engineering in enhancing its performance. The study included 230 PRs from West China Hospital of Stomatology in 2024, which were interpreted to generate the PR findings. A total of 300 interpretation errors were manually inserted into the PR findings. An ablation study was conducted to assess whether GPT-4o can perform reasoning on PRs under a zero-shot prompt. Prompt engineering was employed to enhance the reasoning capabilities of GPT-4o in identifying interpretation errors with PRs. The prompt strategies included chain-of-thought, self-consistency, in-context learning, multimodal in-context learning, and their systematic integration into a meta-prompt. Recall, accuracy, and F1 score were employed to evaluate the outputs. Subsequently, the localization capability of GPT-4o and its influence on reasoning capability were evaluated. In the ablation study, GPT-4o's recall increased significantly from 2.67% to 43.33% upon acquiring PRs (P < 0.001). GPT-4o with the meta-prompt demonstrated improvements in recall (43.33% vs. 52.67%, P = 0.022), accuracy (39.95% vs. 68.75%, P < 0.001), and F1 score (0.42 vs. 0.60, P < 0.001) compared to the zero-shot prompt and other prompt strategies. The localization accuracy of GPT-4o was 45.67% (137 out of 300, 95% CI: 40.00 to 51.34). A significant correlation was observed between its localization accuracy and reasoning capability under the meta-prompt (φ coefficient = 0.33, P < 0.001). The model's recall increased by 5.49% (P = 0.031) when accurate localization cues were provided within the meta-prompt. GPT-4o demonstrated a certain degree of multimodal capability for PRs, with performance enhanced through prompt engineering. Nevertheless, its performance remains inadequate for clinical requirements.
Future efforts will be necessary to identify additional factors influencing the model's reasoning capability or to develop more advanced models. The study evaluates GPT-4o's capability to interpret and reason through PRs and explores potential methods to enhance its performance before clinical application in assisting radiological assessments.
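
Two of the figures quoted in the abstract can be approximately reproduced with a few lines of arithmetic. The sketch below makes two assumptions that the abstract does not state: that the confidence interval is a normal-approximation (Wald) interval, and that the reported "accuracy" plays the role of precision in the F1 formula. Under those assumptions the interval comes out close to the reported 40.00 to 51.34, and the F1 values round to the reported 0.42 and 0.60. The function names are hypothetical.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Localization accuracy: 137 correct out of 300 (figures from the abstract).
low, high = wald_interval(137, 300)
print(f"{137 / 300:.4f} ({low:.4f} to {high:.4f})")

# F1 check, ASSUMING the reported "accuracy" is used as precision.
f1_zero_shot = f1_score(precision=0.3995, recall=0.4333)
f1_meta_prompt = f1_score(precision=0.6875, recall=0.5267)
print(round(f1_zero_shot, 2), round(f1_meta_prompt, 2))
```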

X-Ray LLM Radiology Report Methodology In Silico Academic Lab GenAI

Leveraging an Image-Enhanced Cross-Modal Fusion Network for Radiology Report Generation.

Guo Y, Hou X, Liu Z, Zhang Y

•papers•Aug 11 2025

Radiology report generation (RRG) tasks leverage computer-aided technology to automatically produce descriptive text reports for medical images, aiming to ease radiologists' workload, reduce misdiagnosis rates, and lessen the pressure on medical resources. However, previous works have yet to focus on enhancing feature extraction of low-quality images, incorporating cross-modal interaction information, and mitigating latency in report generation. We propose an Image-Enhanced Cross-Modal Fusion Network (IFNet) for automatic RRG to tackle these challenges. IFNet includes three key components. First, the image enhancement module enhances the detailed representation of typical and atypical structures in X-ray images, thereby boosting detection success rates. Second, the cross-modal fusion networks efficiently and comprehensively capture the interactions of cross-modal features. Finally, a more efficient transformer report generation module is designed to optimize report generation efficiency while being suitable for low-resource devices. Experimental results on public datasets IU X-ray and MIMIC-CXR demonstrate that IFNet significantly outperforms the current state-of-the-art methods.

X-Ray Report Generation Chest Methodology In Silico

Pulmonary diseases accurate recognition using adaptive multiscale feature fusion in chest radiography.

Zhou M, Gao L, Bian K, Wang H, Wang N, Chen Y, Liu S

•papers•Aug 10 2025

Pulmonary disease can severely impair respiratory function and be life-threatening. Accurately recognizing pulmonary diseases in chest X-ray images is challenging due to overlapping body structures and the complex anatomy of the chest. We propose an Adaptive Multiscale Fusion Network (AMFNet) for classifying chest X-ray images of pneumonia, tuberculosis, and COVID-19, which are common pulmonary diseases. AMFNet consists of a lightweight Multiscale Fusion Network (MFNet) and ResNet50 as the secondary feature extraction network. MFNet employs Fusion Blocks with self-calibrated convolution (SCConv) and Attention Feature Fusion (AFF) to capture multiscale semantic features, and integrates a custom activation function, MFReLU, to reduce the model's memory access time. A fusion module adaptively combines features from both networks. Experimental results show that AMFNet achieves 97.48% accuracy and an F1 score of 0.9781 on public datasets, outperforming models like ResNet50, DenseNet121, ConvNeXt-Tiny, and Vision Transformer while using fewer parameters.

X-Ray Classification Chest Methodology In Silico Academic Lab

Parental and carer views on the use of AI in imaging for children: a national survey.

Agarwal G, Salami RK, Lee L, Martin H, Shantharam L, Thomas K, Ashworth E, Allan E, Yung KW, Pauling C, Leyden D, Arthurs OJ, Shelmerdine SC

•papers•Aug 9 2025

Although the use of artificial intelligence (AI) in healthcare is increasing, stakeholder engagement remains poor, particularly relating to understanding parent/carer acceptance of AI tools in paediatric imaging. We explore these perceptions and compare them to the opinions of children and young people (CYAP). A UK national online survey was conducted, inviting parents, carers and guardians of children to participate. The survey was "live" from June 2022 to 2023. The survey included questions asking about respondents' views of AI in general, as well as in specific circumstances (e.g. fractures) with respect to children's healthcare. One hundred forty-six parents/carers (mean age = 45; range = 21-80) from all four nations of the UK responded. Most respondents (93/146, 64%) believed that AI would be more accurate at interpreting paediatric musculoskeletal radiographs than healthcare professionals, but had a strong preference for human supervision (66%). Whilst male respondents were more likely to believe that AI would be more accurate (55/72, 76%), they were twice as likely as female parents/carers to believe that AI use could result in their child's data falling into the wrong hands. Most respondents would like to be asked permission before AI is used for the interpretation of their child's scans (104/146, 71%). Notably, 79% of parents/carers prioritised accuracy over speed compared to 66% of CYAP. Parents/carers feel positively about AI for paediatric imaging but strongly discourage autonomous use. Acknowledging the diverse opinions of the patient population is vital in aiding the successful integration of AI for paediatric imaging. Parents/carers demonstrate a preference for AI use with human supervision that prioritises accuracy, transparency and institutional accountability. AI is welcomed as a supportive tool, but not as a substitute for human expertise. Parents/carers are accepting of AI use, with human supervision. 
Over half believe AI would replace doctors/nurses looking at bone X-rays within 5 years. Parents/carers are more likely than CYAP to trust AI's accuracy. Parents/carers are also more sceptical about AI data misuse.

X-Ray Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab Ethics

Deep Learning Chest X-Ray Age, Epigenetic Aging Clocks and Associations with Age-Related Subclinical Disease in the Project Baseline Health Study.

Chandra J, Short S, Rodriguez F, Maron DJ, Pagidipati N, Hernandez AF, Mahaffey KW, Shah SH, Kiel DP, Lu MT, Raghu VK

•papers•Aug 8 2025

Chronological age is an important component of medical risk scores and decision-making. However, there is considerable variability in how individuals age. We recently published an open-source deep learning model to assess biological age from chest radiographs (CXR-Age), which predicts all-cause and cardiovascular mortality better than chronological age. Here, we compare CXR-Age to two established epigenetic aging clocks (First generation-Horvath Age; Second generation-DNAm PhenoAge) to test which is more strongly associated with cardiopulmonary disease and frailty. Our cohort consisted of 2,097 participants from the Project Baseline Health Study, a prospective cohort study of individuals from four US sites. We compared the association between the different aging clocks and measures of cardiopulmonary disease, frailty, and protein abundance collected at the participant's first annual visit using linear regression models adjusted for common confounders. We found that CXR-Age was associated with coronary calcium, cardiovascular risk factors, worsening pulmonary function, increased frailty, and abundance in plasma of two proteins implicated in neuroinflammation and aging. Associations with DNAm PhenoAge were weaker for pulmonary function and all metrics in middle-age adults. We identified thirteen proteins that were associated with DNAm PhenoAge, one (CDH13) of which was also associated with CXR-Age. No associations were found with Horvath Age. These results suggest that CXR-Age may serve as a better metric of cardiopulmonary aging than epigenetic aging clocks, especially in midlife adults.

X-Ray Classification Chest Retrospective Clinical In Silico Academic Lab Open Code