Latest Papers on Radiology AI. Tags: Benchmark SOTA

Integrating Machine Learning into Myositis Research: a Systematic Review.

Juarez-Gomez C, Aguilar-Vazquez A, Gonzalez-Gauna E, Garcia-Ordoñez GP, Martin-Marquez BT, Gomez-Rios CA, Becerra-Jimenez J, Gaspar-Ruiz A, Vazquez-Del Mercado M

•papers•Jul 8 2025

Idiopathic inflammatory myopathies (IIM) are a group of autoimmune rheumatic diseases characterized by proximal muscle weakness and extramuscular manifestations. Since 1975, IIM have been classified into different clinical phenotypes. Each clinical phenotype is associated with a better or worse prognosis and a particular physiopathology. Machine learning (ML) is a field with applications across many domains. In IIM, ML is an emerging tool assessed in very specific clinical contexts as a complementary tool for research purposes, including transcriptome profiles in muscle biopsies, differential diagnosis using magnetic resonance imaging (MRI), and ultrasound (US). With the cancer-associated risk and predisposing factors for interstitial lung disease (ILD) development, this systematic review evaluates 23 original studies using supervised learning models, including logistic regression (LR), random forest (RF), support vector machines (SVM), and convolutional neural networks (CNN), with performance assessed primarily through the area under the receiver operating characteristic curve (AUC-ROC).

Mixed Modality Classification Musculoskeletal Review Concept Academic Lab Benchmark SOTA
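As a point of reference for the AUC-ROC metric named in the review above, the following is a minimal sketch of how the area under the ROC curve can be computed from labels and scores through pairwise comparison (the Mann-Whitney formulation). The function name `auc_roc` and the toy data are illustrative assumptions and are not taken from any of the reviewed studies.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Returns the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc_roc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

The quadratic pairwise loop is chosen for readability; a rank-based version gives the same value more efficiently on large datasets.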

A Meta-Analysis of the Diagnosis of Condylar and Mandibular Fractures Based on 3-dimensional Imaging and Artificial Intelligence.

Wang F, Jia X, Meiling Z, Oscandar F, Ghani HA, Omar M, Li S, Sha L, Zhen J, Yuan Y, Zhao B, Abdullah JY

•papers•Jul 8 2025

This article reviews the literature on the use of 3D imaging and artificial intelligence-assisted methods for rapid and accurate classification and diagnosis of condylar fractures, and conducts a meta-analysis of mandibular fractures. Mandibular condyle fracture is a common fracture type in maxillofacial surgery. Accurate classification and diagnosis of condylar fractures are critical to developing an effective treatment plan. With the rapid development of 3-dimensional imaging technology and artificial intelligence (AI), traditional x-ray diagnosis is gradually being replaced by more accurate technologies such as 3-dimensional computed tomography (CT). These emerging technologies provide more detailed anatomic information and significantly improve the accuracy and efficiency of condylar fracture diagnosis, especially in the evaluation and surgical planning of complex fractures. The application of artificial intelligence in medical imaging is further analyzed, especially the successful cases of fracture detection and classification through deep learning models. Although AI technology has demonstrated great potential in condylar fracture diagnosis, it still faces challenges such as data quality, model interpretability, and clinical validation. This article evaluates the accuracy and practicality of AI in diagnosing mandibular fractures through a systematic review and meta-analysis of the existing literature. The results show that AI-assisted diagnosis has high prediction accuracy in detecting condylar fractures and significantly improves diagnostic efficiency. However, more multicenter studies are still needed to verify the application of AI in different clinical settings to promote its widespread application in maxillofacial surgery.

CT Classification Meta Analysis In Silico Academic Lab Benchmark SOTA

A novel UNet-SegNet and vision transformer architectures for efficient segmentation and classification in medical imaging.

Tongbram S, Shimray BA, Singh LS

•papers•Jul 8 2025

Medical imaging has become an essential tool in the diagnosis and treatment of various diseases, and provides critical insights through ultrasound, MRI, and X-ray modalities. Despite its importance, challenges remain in the accurate segmentation and classification of complex structures owing to factors such as low contrast, noise, and irregular anatomical shapes. This study addresses these challenges by proposing a novel hybrid deep learning model that integrates the strengths of Convolutional Autoencoders (CAE), UNet, and SegNet architectures. In the preprocessing phase, a Convolutional Autoencoder is used to effectively reduce noise while preserving essential image details, ensuring that the images used for segmentation and classification are of high quality. The ability of CAE to denoise images while retaining critical features enhances the accuracy of the subsequent analysis. The developed model employs UNet for multiscale feature extraction and SegNet for precise boundary reconstruction, with Dynamic Feature Fusion integrated at each skip connection to dynamically weight and combine the feature maps from the encoder and decoder. This ensures that both global and local features are effectively captured, while emphasizing the critical regions for segmentation. To further enhance the model's performance, the Hybrid Emperor Penguin Optimizer (HEPO) was employed for feature selection, while the Hybrid Vision Transformer with Convolutional Embedding (HyViT-CE) was used for the classification task. This hybrid approach allows the model to maintain high accuracy across different medical imaging tasks. The model was evaluated using three major datasets: brain tumor MRI, breast ultrasound, and chest X-rays. The results demonstrate exceptional performance, achieving an accuracy of 99.92% for brain tumor segmentation, 99.67% for breast cancer detection, and 99.93% for chest X-ray classification. 
These outcomes highlight the ability of the model to deliver reliable and accurate diagnostics across various medical contexts, underscoring its potential as a valuable tool in clinical settings. The findings of this study will contribute to advancing deep learning applications in medical imaging, addressing existing research gaps, and offering a robust solution for improved patient care.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA

MTMedFormer: multi-task vision transformer for medical imaging with federated learning.

Nath A, Shukla S, Gupta P

•papers•Jul 8 2025

Deep learning has revolutionized medical imaging, improving tasks like image segmentation, detection, and classification, often surpassing human accuracy. However, the training of effective diagnostic models is hindered by two major challenges: the need for large datasets for each task and privacy laws restricting the sharing of medical data. Multi-task learning (MTL) addresses the first challenge by enabling a single model to perform multiple tasks, though convolution-based MTL models struggle with contextualizing global features. Federated learning (FL) helps overcome the second challenge by allowing models to train collaboratively without sharing data, but traditional methods struggle to aggregate stable feature maps due to the permutation-invariant nature of neural networks. To tackle these issues, we propose MTMedFormer, a transformer-based multi-task medical imaging model. We leverage the transformers' ability to learn task-agnostic features using a shared encoder and utilize task-specific decoders for robust feature extraction. By combining MTL with a hybrid loss function, MTMedFormer learns distinct diagnostic tasks in a synergistic manner. Additionally, we introduce a novel Bayesian federation method for aggregating multi-task imaging models. Our results show that MTMedFormer outperforms traditional single-task and MTL models on mammogram and pneumonia datasets, while our Bayesian federation method surpasses traditional methods in image segmentation.

Mixed Modality Segmentation Chest Methodology In Silico Academic Lab Benchmark SOTA
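For context on the "traditional" federated aggregation that the MTMedFormer abstract contrasts against, here is a minimal sketch of sample-size-weighted parameter averaging in the style of FedAvg. It is not the paper's Bayesian federation method, which the abstract does not specify; the function name `federated_average` and the toy client values are assumptions for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Sample-size-weighted, coordinate-wise average of client parameters.

    client_weights: list of equal-length parameter vectors, one per client.
    client_sizes:   number of local training samples held by each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            # Coordinate i is averaged with coordinate i of every other client.
            averaged[i] += w * size / total
    return averaged


# Hypothetical two-client example: no raw data leaves a client, only parameters.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [100, 300]
global_weights = federated_average(clients, sizes)
```

Because the average is taken coordinate by coordinate, it implicitly assumes that the same parameter index plays the same role on every client, which is the kind of assumption the abstract's remark about permutation invariance calls into question.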

Vision Transformers-Based Deep Feature Generation Framework for Hydatid Cyst Classification in Computed Tomography Images.

Sagik M, Gumus A

•papers•Jul 8 2025

Hydatid cysts, caused by Echinococcus granulosus, form progressively enlarging fluid-filled cysts in organs like the liver and lungs, posing significant public health risks through severe complications or death. This study presents a novel deep feature generation framework utilizing vision transformer models (ViT-DFG) to enhance the classification accuracy of hydatid cyst types. The proposed framework consists of four phases: image preprocessing, feature extraction using vision transformer models, feature selection through iterative neighborhood component analysis, and classification, where the performance of the ViT-DFG model was evaluated and compared across different classifiers such as k-nearest neighbor and multi-layer perceptron (MLP). Both methods were evaluated independently to assess classification performance from different approaches. The dataset, comprising five cyst types, was analyzed for both five-class and three-class classification by grouping the cyst types into active, transition, and inactive categories. Experimental results showed that the proposed ViT-DFG method achieves higher accuracy than existing methods. Specifically, the ViT-DFG framework attained an overall classification accuracy of 98.10% for the three-class and 95.12% for the five-class classifications using 5-fold cross-validation. Statistical analysis through one-way analysis of variance (ANOVA), conducted to evaluate significant differences between models, confirmed significant differences between the proposed framework and individual vision transformer models (p < 0.05). These results highlight the effectiveness of combining multiple vision transformer architectures with advanced feature selection techniques in improving classification performance.
The findings underscore the ViT-DFG framework's potential to advance medical image analysis, particularly in hydatid cyst classification, while offering clinical promise through automated diagnostics and improved decision-making.

CT Classification Abdominal Methodology In Silico Academic Lab Benchmark SOTA

Noise-inspired diffusion model for generalizable low-dose CT reconstruction.

Gao Q, Chen Z, Zeng D, Zhang J, Ma J, Shan H

•papers•Jul 8 2025

The generalization of deep learning-based low-dose computed tomography (CT) reconstruction models to doses unseen in the training data is important and remains challenging. Previous efforts rely heavily on paired data to improve generalization performance and robustness, collecting either diverse CT data for re-training or a few test data for fine-tuning. Recently, diffusion models have shown promising and generalizable performance in low-dose CT (LDCT) reconstruction; however, they may produce unrealistic structures because the CT image noise deviates from a Gaussian distribution and because the guidance from noisy LDCT images provides imprecise prior information. In this paper, we propose a noise-inspired diffusion model for generalizable LDCT reconstruction, termed NEED, which tailors diffusion models for noise characteristics of each domain. First, we propose a novel shifted Poisson diffusion model to denoise projection data, which aligns the diffusion process with the noise model in pre-log LDCT projections. Second, we devise a doubly guided diffusion model to refine reconstructed images, which leverages LDCT images and initial reconstructions to more accurately locate prior information and enhance reconstruction fidelity. By cascading these two diffusion models for dual-domain reconstruction, our NEED requires only normal-dose data for training and can be effectively extended to various unseen dose levels during testing via a time step matching strategy. Extensive qualitative, quantitative, and segmentation-based evaluations on two datasets demonstrate that our NEED consistently outperforms state-of-the-art methods in reconstruction and generalization performance. Source code is made available at https://github.com/qgao21/NEED.

CT Reconstruction Methodology In Silico Academic Lab Open Code Benchmark SOTA

External Validation of an Upgraded AI Model for Screening Ileocolic Intussusception Using Pediatric Abdominal Radiographs: Multicenter Retrospective Study.

Lee JH, Kim PH, Son NH, Han K, Kang Y, Jeong S, Kim EK, Yoon H, Gatidis S, Vasanawala S, Yoon HM, Shin HJ

•papers•Jul 8 2025

Artificial intelligence (AI) is increasingly used in radiology, but its development in pediatric imaging remains limited, particularly for emergent conditions. Ileocolic intussusception is an important cause of acute abdominal pain in infants and toddlers and requires timely diagnosis to prevent complications such as bowel ischemia or perforation. While ultrasonography is the diagnostic standard due to its high sensitivity and specificity, its accessibility may be limited, especially outside tertiary centers. Abdominal radiographs (AXRs), despite their limited sensitivity, are often the first-line imaging modality in clinical practice. In this context, AI could support early screening and triage by analyzing AXRs and identifying patients who require further ultrasonography evaluation. This study aimed to upgrade and externally validate an AI model for screening ileocolic intussusception using pediatric AXRs with multicenter data and to assess the diagnostic performance of the model in comparison with radiologists of varying experience levels with and without AI assistance. This retrospective study included pediatric patients (≤5 years) who underwent both AXRs and ultrasonography for suspected intussusception. Based on the preliminary study from hospital A, the AI model was retrained using data from hospital B and validated with external datasets from hospitals C and D. Diagnostic performance of the upgraded AI model was evaluated using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). A reader study was conducted with 3 radiologists, including 2 trainees and 1 pediatric radiologist, to evaluate diagnostic performance with and without AI assistance. Based on the previously developed AI model trained on 746 patients from hospital A, an additional 431 patients from hospital B (including 143 intussusception cases) were used for further training to develop an upgraded AI model. 
External validation was conducted using data from hospital C (n=68; 19 intussusception cases) and hospital D (n=90; 30 intussusception cases). The upgraded AI model achieved a sensitivity of 81.7% (95% CI 68.6%-90%) and a specificity of 81.7% (95% CI 73.3%-87.8%), with an AUC of 86.2% (95% CI 79.2%-92.1%) in the external validation set. Without AI assistance, radiologists showed lower performance (overall AUC 64%; sensitivity 49.7%; specificity 77.1%). With AI assistance, radiologists' specificity improved to 93% (difference +15.9%; P<.001), and AUC increased to 79.2% (difference +15.2%; P=.05). The least experienced reader showed the largest improvement in specificity (+37.6%; P<.001) and AUC (+14.7%; P=.08). The upgraded AI model improved diagnostic performance for screening ileocolic intussusception on pediatric AXRs. It effectively enhanced the specificity and overall accuracy of radiologists, particularly those with less experience in pediatric radiology. A user-friendly software platform was introduced to support broader clinical validation and underscores the potential of AI as a screening and triage tool in pediatric emergency settings.

X-Ray Detection Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA
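To make the sensitivity, specificity, and 95% CI figures in the intussusception study above easier to interpret, the following sketch computes the two rates from confusion-matrix counts and attaches a Wilson score interval. The counts are hypothetical, and the choice of the Wilson interval is an assumption: the abstract does not state which interval method the authors used.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical counts for illustration; not the study's confusion matrix.
tp, fn, tn, fp = 45, 5, 80, 20
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)
```

With small external validation sets, the interval width matters as much as the point estimate, which is why the abstract reports both.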

Development and International Validation of a Deep Learning Model for Predicting Acute Pancreatitis Severity from CT Scans

Xu, Y., Teutsch, B., Zeng, W., Hu, Y., Rastogi, S., Hu, E. Y., DeGregorio, I. M., Fung, C. W., Richter, B. I., Cummings, R., Goldberg, J. E., Mathieu, E., Appiah Asare, B., Hegedus, P., Gurza, K.-B., Szabo, I. V., Tarjan, H., Szentesi, A., Borbely, R., Molnar, D., Faluhelyi, N., Vincze, A., Marta, K., Hegyi, P., Lei, Q., Gonda, T., Huang, C., Shen, Y.

•preprint•Jul 7 2025

Background and aims: Acute pancreatitis (AP) is a common gastrointestinal disease with rising global incidence. While most cases are mild, severe AP (SAP) carries high mortality. Early and accurate severity prediction is crucial for optimal management. However, existing severity prediction models, such as BISAP and mCTSI, have modest accuracy and often rely on data unavailable at admission. This study proposes a deep learning (DL) model to predict AP severity using abdominal contrast-enhanced CT (CECT) scans acquired within 24 hours of admission. Methods: We collected 10,130 studies from 8,335 patients across a multi-site U.S. health system. The model was trained in two stages: (1) self-supervised pretraining on large-scale unlabeled CT studies and (2) fine-tuning on 550 labeled studies. Performance was evaluated against mCTSI and BISAP on a hold-out internal test set (n=100 patients) and externally validated on a Hungarian AP registry (n=518 patients). Results: On the internal test set, the model achieved AUROCs of 0.888 (95% CI: 0.800-0.960) for SAP and 0.888 (95% CI: 0.819-0.946) for mild AP (MAP), outperforming mCTSI (p = 0.002). External validation showed robust AUROCs of 0.887 (95% CI: 0.825-0.941) for SAP and 0.858 (95% CI: 0.826-0.888) for MAP, surpassing mCTSI (p = 0.024) and BISAP (p = 0.002). Retrospective simulation suggested the model's potential to support admission triage and serve as a second reader during CECT interpretation. Conclusions: The proposed DL model outperformed standard scoring systems for AP severity prediction, generalized well to external data, and shows promise for providing early clinical decision support and improving resource allocation.

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Sequential Attention-based Sampling for Histopathological Analysis

Tarun G, Naman Malpani, Gugan Thoppe, Sridharan Devarajan

•preprint•Jul 7 2025

Deep neural networks are increasingly applied for automated histopathology. Yet, whole-slide images (WSIs) are often acquired at gigapixel sizes, rendering it computationally infeasible to analyze them entirely at high resolution. Diagnostic labels are largely available only at the slide-level, because expert annotation of images at a finer (patch) level is both laborious and expensive. Moreover, regions with diagnostic information typically occupy only a small fraction of the WSI, making it inefficient to examine the entire slide at full resolution. Here, we propose SASHA (Sequential Attention-based Sampling for Histopathological Analysis), a deep reinforcement learning approach for efficient analysis of histopathological images. First, SASHA learns informative features with a lightweight hierarchical, attention-based multiple instance learning (MIL) model. Second, SASHA samples intelligently and zooms selectively into a small fraction (10-20%) of high-resolution patches, to achieve reliable diagnosis. We show that SASHA matches state-of-the-art methods that analyze the WSI fully at high-resolution, albeit at a fraction of their computational and memory costs. In addition, it significantly outperforms competing, sparse sampling methods. We propose SASHA as an intelligent sampling model for medical imaging challenges that involve automated diagnosis with exceptionally large images containing sparsely informative features.

OCT Classification Methodology In Silico Academic Lab Benchmark SOTA
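The attention-based multiple instance learning idea mentioned in the SASHA abstract can be illustrated with a small sketch: per-patch scores are turned into softmax weights, patch features are pooled into one slide-level vector, and the highest-weighted patches are candidates for closer inspection. This is a simplified illustration only. It uses fixed, hypothetical scores in place of a learned attention network and a greedy top-k selection in place of the paper's reinforcement learning policy; all function names are assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patch_features, patch_scores):
    """Pool patch features into one slide-level vector using attention weights."""
    weights = softmax(patch_scores)
    dim = len(patch_features[0])
    pooled = [0.0] * dim
    for feat, w in zip(patch_features, weights):
        for i, v in enumerate(feat):
            pooled[i] += w * v
    return pooled, weights


def top_k_patches(weights, k):
    """Indices of the k patches with the largest attention weight."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]


# Hypothetical features and scores for three patches of one slide.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 2.0]
slide_vector, attn = attention_pool(features, scores)
selected = top_k_patches(attn, k=1)
```

In a trained MIL model the scores would come from a small network applied to each patch feature, and the slide-level vector would feed a classifier trained on slide-level labels.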

External validation of an artificial intelligence tool for fracture detection in children with osteogenesis imperfecta: a multireader study.

Pauling C, Laidlow-Singh H, Evans E, Garbera D, Williamson R, Fernando R, Thomas K, Martin H, Arthurs OJ, Shelmerdine SC

•papers•Jul 7 2025

This study aimed to determine the performance of a commercially available AI tool for fracture detection when used in children with osteogenesis imperfecta (OI). All appendicular and pelvic radiographs from an OI clinic at a single centre from 48 patients were included. Seven radiologists evaluated anonymised images in two rounds, first without, then with AI assistance. Differences in diagnostic accuracy between the rounds were analysed. 48 patients (mean 12 years) provided 336 images, containing 206 fractures established by consensus opinion of two radiologists. AI produced a per-examination accuracy of 74.8% [95% CI: 65.4%, 82.7%], compared to average radiologist performance at 83.4% [95% CI: 75.2%, 89.8%]. AI assistance improved average radiologist accuracy per examination to 90.7% [95% CI: 83.5%, 95.4%]. AI gave more false negatives than radiologists, with 80 missed fractures versus 41, respectively. Radiologists were more likely (74.6%) to alter their original decision to agree with AI at the per-image level, 82.8% of which led to a correct result, 64.0% of which were changing from a false positive to a true negative. Despite inferior standalone performance, AI assistance can still improve radiologist fracture detection in a rare disease paediatric population. AI-assisted reads typically led to more accurate diagnostic outcomes through reduced false positives. Future studies focusing on the real-world application of AI tools in a larger population of children with bone fragility disorders will help better evaluate whether these improvements in accuracy translate into improved patient outcomes. Question: How well does a commercially available artificial intelligence (AI) tool identify fractures on appendicular radiographs of children with osteogenesis imperfecta (OI), and can it also improve radiologists' identification of fractures in this population?
Findings: Specialist human radiologists outperformed the AI fracture detection tool when acting alone; however, their diagnostic performance overall improved with AI assistance. Clinical relevance: AI assistance improves specialist radiologist fracture detection in children with osteogenesis imperfecta, even though the AI alone performs worse than the radiologists alone. This was due to the AI moderating the number of false positives generated by the radiologists.

X-Ray Detection Musculoskeletal Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA
