MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation

Peiting Tian, Xi Chen, Haixia Bi, Fan Li

arXiv preprint · Jun 30, 2025
Medical image segmentation plays a crucial role in clinical diagnosis and treatment planning, where accurate boundary delineation is essential for precise lesion localization, organ identification, and quantitative assessment. In recent years, deep learning-based methods have significantly advanced segmentation accuracy. However, two major challenges remain. First, the performance of these methods heavily relies on large-scale annotated datasets, which are often difficult to obtain in medical scenarios due to privacy concerns and high annotation costs. Second, clinically challenging scenarios, such as low contrast in certain imaging modalities and blurry lesion boundaries caused by malignancy, still pose obstacles to precise segmentation. To address these challenges, we propose MedSAM-CA, an architecture-level fine-tuning approach that mitigates reliance on extensive manual annotations by adapting the pretrained foundation model, Medical Segment Anything (MedSAM). MedSAM-CA introduces two key components: the Convolutional Attention-Enhanced Boundary Refinement Network (CBR-Net) and the Attention-Enhanced Feature Fusion Block (Atte-FFB). CBR-Net operates in parallel with the MedSAM encoder to recover boundary information potentially overlooked by long-range attention mechanisms, leveraging hierarchical convolutional processing. Atte-FFB, embedded in the MedSAM decoder, fuses multi-level fine-grained features from skip connections in CBR-Net with global representations upsampled within the decoder to enhance boundary delineation accuracy. Experiments on publicly available datasets covering dermoscopy, CT, and MRI imaging modalities validate the effectiveness of MedSAM-CA. On the dermoscopy dataset, MedSAM-CA achieves 94.43% Dice with only 2% of the full training data, reaching 97.25% of full-data training performance and demonstrating strong effectiveness in low-resource clinical settings.
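
As a rough illustration of the fusion idea described for Atte-FFB, the PyTorch sketch below gates concatenated fine (CNN skip) and coarse (upsampled decoder) features with channel attention. The block name comes from the abstract, but the internal design here is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtteFFB(nn.Module):
    """Attention-weighted fusion of fine (skip) and coarse (decoder) features (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        # Channel-attention gate computed from the concatenated features.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.project = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        # fine: fine-grained feature from a CBR-Net skip connection
        # coarse: global feature upsampled within the MedSAM decoder
        coarse = F.interpolate(coarse, size=fine.shape[-2:], mode="bilinear", align_corners=False)
        merged = torch.cat([fine, coarse], dim=1)
        weights = self.gate(merged)                     # per-channel attention weights in (0, 1)
        return self.project(merged) * weights + coarse  # gated fusion with a residual path

fused = AtteFFB(64)(torch.randn(1, 64, 128, 128), torch.randn(1, 64, 32, 32))
```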

GUSL: A Novel and Efficient Machine Learning Model for Prostate Segmentation on MRI

Jiaxin Yang, Vasileios Magoulianitis, Catherine Aurelia Christie Alexander, Jintang Xue, Masatomo Kaneko, Giovanni Cacciamani, Andre Abreu, Vinay Duddalwar, C. -C. Jay Kuo, Inderbir S. Gill, Chrysostomos Nikias

arXiv preprint · Jun 30, 2025
Prostate and zonal segmentation is a crucial step for clinical diagnosis of prostate cancer (PCa). Computer-aided diagnosis tools for prostate segmentation are based on the deep learning (DL) paradigm. However, deep neural networks are perceived as "black-box" solutions by physicians, making them less practical for deployment in the clinical setting. In this paper, we introduce a feed-forward machine learning model, named Green U-shaped Learning (GUSL), suitable for medical image segmentation without backpropagation. GUSL introduces a multi-layer regression scheme for coarse-to-fine segmentation. Its feature extraction is based on a linear model, which enables seamless interpretability. GUSL also introduces a mechanism for attention on the prostate boundaries, an error-prone region, by employing regression to refine the predictions through residue correction. In addition, a two-step pipeline approach is used to mitigate the class imbalance inherent in medical imaging problems. After conducting experiments on two publicly available datasets and one private dataset, in both prostate gland and zonal segmentation tasks, GUSL achieves state-of-the-art performance compared with DL-based models. Notably, GUSL features a very energy-efficient pipeline, with a model size several times smaller and lower complexity than competing solutions. In all datasets, GUSL achieved a Dice Similarity Coefficient (DSC) greater than 0.9 for gland segmentation. Considering also its lightweight model size and transparency in feature extraction, it offers a competitive and practical package for medical imaging applications.
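
A minimal sketch of the coarse-to-fine, backpropagation-free idea: a first regressor produces a coarse per-voxel prediction, and a second regressor corrects its residue near the decision boundary. The feature vectors, the Ridge regressors, and the boundary definition below are illustrative assumptions, not GUSL's actual components.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
features = rng.normal(size=(5000, 32))          # per-voxel feature vectors (placeholder)
labels = rng.random(5000)                       # soft ground-truth membership in [0, 1]

coarse = Ridge(alpha=1.0).fit(features, labels)              # stage 1: coarse regression
residue = labels - coarse.predict(features)                  # prediction error
boundary = np.abs(coarse.predict(features) - 0.5) < 0.2      # error-prone boundary region
refiner = Ridge(alpha=1.0).fit(features[boundary], residue[boundary])  # stage 2: residue model

refined = coarse.predict(features)
refined[boundary] += refiner.predict(features[boundary])     # residue-corrected prediction
```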

VAP-Diffusion: Enriching Descriptions with MLLMs for Enhanced Medical Image Generation

Peng Huang, Junhu Fu, Bowen Guo, Zeju Li, Yuanyuan Wang, Yi Guo

arXiv preprint · Jun 30, 2025
As the appearance of medical images is influenced by multiple underlying factors, generative models require rich attribute information beyond labels to produce realistic and diverse images. For instance, generating an image of a skin lesion with specific patterns demands descriptions that go beyond diagnosis, such as shape, size, texture, and color. However, such detailed descriptions are not always accessible. To address this, we explore a framework, termed Visual Attribute Prompts (VAP)-Diffusion, to leverage external knowledge from pre-trained Multi-modal Large Language Models (MLLMs) to improve the quality and diversity of medical image generation. First, to derive descriptions from MLLMs without hallucination, we design a series of prompts following Chain-of-Thoughts for common medical imaging tasks, including dermatologic, colorectal, and chest X-ray images. Generated descriptions are utilized during training and stored across different categories. During testing, descriptions are randomly retrieved from the corresponding category for inference. Moreover, to make the generator robust to unseen combinations of descriptions at test time, we propose a Prototype Condition Mechanism that restricts test embeddings to be similar to those from training. Experiments on three common types of medical imaging across four datasets verify the effectiveness of VAP-Diffusion.
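
The description bank behind this scheme can be pictured as below: attribute descriptions are generated once per training image, stored by class, and sampled at inference. The prompt wording and the `query_mllm` stub are hypothetical placeholders, not the paper's prompts or API.

```python
import random
from collections import defaultdict

PROMPT = (
    "Describe this {category} image step by step: "
    "first the shape, then the size, then the texture, then the color."
)

def query_mllm(image_path: str, category: str) -> str:
    # Placeholder for a call to a pretrained multi-modal LLM.
    return f"description of {image_path} ({category})"

description_bank = defaultdict(list)

def add_training_image(image_path: str, category: str) -> None:
    # Descriptions are generated during training and stored per category.
    description_bank[category].append(query_mllm(image_path, category))

def sample_condition(category: str) -> str:
    # At test time, retrieve a stored description from the same category.
    return random.choice(description_bank[category])

add_training_image("lesion_001.png", "melanoma")
print(sample_condition("melanoma"))
```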

Diffusion Model-based Data Augmentation Method for Fetal Head Ultrasound Segmentation

Fangyijie Wang, Kevin Whelan, Félix Balado, Guénolé Silvestre, Kathleen M. Curran

arXiv preprint · Jun 30, 2025
Medical image data is less accessible than in other domains due to privacy and regulatory constraints. In addition, labeling requires costly, time-intensive manual image annotation by clinical experts. To overcome these challenges, synthetic medical data generation offers a promising solution. Generative AI (GenAI), employing generative deep learning models, has proven effective at producing realistic synthetic images. This study proposes a novel mask-guided GenAI approach using diffusion models to generate synthetic fetal head ultrasound images paired with segmentation masks. These synthetic pairs augment real datasets for supervised fine-tuning of the Segment Anything Model (SAM). Our results show that the synthetic data captures real image features effectively, and this approach reaches state-of-the-art fetal head segmentation, especially when trained with a limited number of real image-mask pairs. In particular, the segmentation reaches Dice scores of 94.66% and 94.38% using a handful of ultrasound images from the Spanish and African cohorts, respectively. Our code, models, and data are available on GitHub.
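
The augmentation recipe could look roughly like this: each real mask seeds several synthetic image-mask pairs, and the combined pool is used for supervised fine-tuning of SAM. The `sample_synthetic_pair` stub stands in for the mask-guided diffusion sampler and is an assumption, as is the 10x augmentation factor.

```python
import random
from dataclasses import dataclass

@dataclass
class Pair:
    image: str   # path to an ultrasound image
    mask: str    # path to the fetal-head segmentation mask

def sample_synthetic_pair(mask: str) -> Pair:
    # Placeholder: a mask-guided diffusion model would synthesize an image
    # consistent with this mask.
    return Pair(image=f"synthetic_for_{mask}", mask=mask)

real_pairs = [Pair(f"us_{i}.png", f"mask_{i}.png") for i in range(20)]   # handful of real pairs
synthetic_pairs = [sample_synthetic_pair(p.mask) for p in real_pairs for _ in range(10)]

training_set = real_pairs + synthetic_pairs
random.shuffle(training_set)
# training_set would then feed supervised fine-tuning of SAM.
```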

Uncertainty-aware Diffusion and Reinforcement Learning for Joint Plane Localization and Anomaly Diagnosis in 3D Ultrasound

Yuhao Huang, Yueyue Xu, Haoran Dou, Jiaxiao Deng, Xin Yang, Hongyu Zheng, Dong Ni

arXiv preprint · Jun 30, 2025
Congenital uterine anomalies (CUAs) can lead to infertility, miscarriage, preterm birth, and an increased risk of pregnancy complications. Compared to traditional 2D ultrasound (US), 3D US can reconstruct the coronal plane, providing a clear visualization of the uterine morphology for accurate assessment of CUAs. In this paper, we propose an intelligent system for simultaneous automated plane localization and CUA diagnosis. Our highlights are: 1) we develop a denoising diffusion model with local (plane) and global (volume/text) guidance, using an adaptive weighting strategy to optimize attention allocation to different conditions; 2) we introduce a reinforcement learning-based framework with unsupervised rewards to extract the key slice summary from redundant sequences, fully integrating information across multiple planes to reduce learning difficulty; 3) we provide text-driven uncertainty modeling for coarse prediction, and leverage it to adjust the classification probability for overall performance improvement. Extensive experiments on a large 3D uterine US dataset show the efficacy of our method in both plane localization and CUA diagnosis. Code is available at https://github.com/yuhoo0302/CUA-US.
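
One plausible reading of the adaptive weighting over local (plane) and global (volume/text) guidance is sketched below for a classifier-free-guidance-style sampler; the softmax-over-guidance-norms rule is purely an assumption, not the released mechanism.

```python
import torch

def adaptive_guidance(eps_uncond, eps_plane, eps_global, base_scale=2.0):
    """Combine two conditional noise predictions with adaptive per-sample weights (assumed rule)."""
    g_plane = eps_plane - eps_uncond            # local (plane) guidance direction
    g_global = eps_global - eps_uncond          # global (volume/text) guidance direction
    norms = torch.stack([g_plane.flatten(1).norm(dim=1),
                         g_global.flatten(1).norm(dim=1)], dim=1)
    w = torch.softmax(norms, dim=1)             # adaptive weights, summing to 1 per sample
    w_plane = w[:, 0].view(-1, 1, 1, 1)
    w_global = w[:, 1].view(-1, 1, 1, 1)
    return eps_uncond + base_scale * (w_plane * g_plane + w_global * g_global)

eps = adaptive_guidance(torch.randn(2, 1, 64, 64),
                        torch.randn(2, 1, 64, 64),
                        torch.randn(2, 1, 64, 64))
```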

FD-DiT: Frequency Domain-Directed Diffusion Transformer for Low-Dose CT Reconstruction

Qiqing Liu, Guoquan Wei, Zekun Zhou, Yiyang Wen, Liu Shi, Qiegen Liu

arXiv preprint · Jun 30, 2025
Low-dose computed tomography (LDCT) reduces radiation exposure but suffers from image artifacts and loss of detail due to quantum and electronic noise, potentially impacting diagnostic accuracy. Transformers combined with diffusion models have been a promising approach for image generation. Nevertheless, existing methods exhibit limitations in preserving fine-grained image details. To address this issue, a frequency domain-directed diffusion transformer (FD-DiT) is proposed for LDCT reconstruction. FD-DiT centers on a diffusion strategy that progressively introduces noise until the distribution statistically aligns with that of LDCT data, followed by denoising processing. Furthermore, we employ a frequency decoupling technique to concentrate noise primarily in the high-frequency domain, thereby facilitating effective capture of essential anatomical structures and fine details. A hybrid denoising network is then utilized to optimize the overall data reconstruction process. To enhance the capability of recognizing high-frequency noise, we incorporate sliding sparse local attention to leverage the sparsity and locality of shallow-layer information, propagating it via skip connections to improve feature representation. Finally, we propose a learnable dynamic fusion strategy for optimal component integration. Experimental results demonstrate that, at identical dose levels, LDCT images reconstructed by FD-DiT exhibit superior noise and artifact suppression compared to state-of-the-art methods.
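
The frequency decoupling idea, injecting noise mainly into high-frequency bins so that low-frequency anatomy is preserved, can be sketched with an FFT mask as below; the radial cutoff and noise level are arbitrary assumptions, not the paper's schedule.

```python
import torch

def add_high_frequency_noise(image: torch.Tensor, cutoff: float = 0.1, sigma: float = 0.5) -> torch.Tensor:
    """Add Gaussian noise only to frequency bins above a radial cutoff."""
    h, w = image.shape[-2:]
    fy = torch.fft.fftfreq(h).view(-1, 1)
    fx = torch.fft.fftfreq(w).view(1, -1)
    high_pass = ((fy ** 2 + fx ** 2).sqrt() > cutoff).float()   # 1 for high-frequency bins
    spectrum = torch.fft.fft2(image)
    noise_spectrum = torch.fft.fft2(sigma * torch.randn_like(image))
    spectrum = spectrum + noise_spectrum * high_pass             # perturb high frequencies only
    return torch.fft.ifft2(spectrum).real

noisy = add_high_frequency_noise(torch.randn(1, 1, 256, 256))
```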

Contrastive Learning with Diffusion Features for Weakly Supervised Medical Image Segmentation

Dewen Zeng, Xinrong Hu, Yu-Jen Chen, Yawen Wu, Xiaowei Xu, Yiyu Shi

arXiv preprint · Jun 30, 2025
Weakly supervised semantic segmentation (WSSS) methods using class labels often rely on class activation maps (CAMs) to localize objects. However, traditional CAM-based methods struggle with partial activations and imprecise object boundaries due to optimization discrepancies between classification and segmentation. Recently, the conditional diffusion model (CDM) has been used as an alternative for generating segmentation masks in WSSS, leveraging its strong image generation capabilities tailored to specific class distributions. By modifying or perturbing the condition during diffusion sampling, the related objects can be highlighted in the generated images. Yet, the saliency maps generated by CDMs are prone to noise from background alterations during reverse diffusion. To alleviate this problem, we introduce Contrastive Learning with Diffusion Features (CLDF), a novel method that uses contrastive learning to train a pixel decoder to map the diffusion features from a frozen CDM to a low-dimensional embedding space for segmentation. Specifically, we integrate gradient maps generated from the CDM's external classifier with CAMs to identify foreground and background pixels with fewer false positives/negatives for contrastive learning, enabling robust pixel embedding learning. Experimental results on four segmentation tasks from two public medical datasets demonstrate that our method significantly outperforms existing baselines.
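
A bare-bones version of pixel-level contrastive learning over foreground/background (pseudo-)labels is shown below; the loss form and the random labels are illustrative, whereas the paper derives its labels from CAMs and classifier gradient maps.

```python
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(embeddings: torch.Tensor, labels: torch.Tensor, temperature: float = 0.1):
    # embeddings: (N, D) pixel embeddings; labels: (N,) with 1 = foreground, 0 = background
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                          # pairwise cosine similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)      # positive pairs share a label
    eye = torch.eye(len(labels), dtype=torch.bool)
    logits = sim.masked_fill(eye, float("-inf"))           # exclude self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    positives = same & ~eye
    return -(log_prob[positives]).mean()                   # average over positive pairs

loss = pixel_contrastive_loss(torch.randn(64, 32), torch.randint(0, 2, (64,)))
```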

Efficient Cerebral Infarction Segmentation Using U-Net and U-Net3+ Models.

Yuce E, Sahin ME, Ulutas H, Erkoç MF

PubMed paper · Jun 30, 2025
Cerebral infarction remains a leading cause of mortality and long-term disability globally, underscoring the critical importance of early diagnosis and timely intervention to enhance patient outcomes. This study introduces a novel approach to cerebral infarction segmentation using a new dataset comprising MRI scans of 110 patients, retrospectively collected from Yozgat Bozok University Research Hospital. Two convolutional neural network architectures, the basic U-Net and the advanced U-Net3+, are employed to segment infarction regions with high precision. Ground-truth annotations are generated under the supervision of an experienced radiologist, and data augmentation techniques are applied to address dataset limitations, resulting in 6732 balanced images for training, validation, and testing. Performance evaluation is conducted using metrics such as the Dice score, Intersection over Union (IoU), pixel accuracy, and specificity. The basic U-Net achieved superior performance with a Dice score of 0.8947, a mean IoU of 0.8798, a pixel accuracy of 0.9963, and a specificity of 0.9984, outperforming U-Net3+ despite its simpler architecture. U-Net3+, with its complex structure and advanced features, delivered competitive results, highlighting the potential trade-off between model complexity and performance in medical imaging tasks. This study underscores the significance of leveraging deep learning for precise and efficient segmentation of cerebral infarction. The results demonstrate the capability of CNN-based architectures to support medical decision-making, offering a promising pathway for advancing stroke diagnosis and treatment planning.
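
For reference, the overlap metrics reported above (Dice score and IoU) for binary masks reduce to the short computation below; the smoothing constant is a conventional choice, not taken from the paper.

```python
import numpy as np

def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Dice score and IoU for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, target).sum() + eps)
    return dice, iou

d, j = dice_and_iou(np.random.rand(256, 256) > 0.5, np.random.rand(256, 256) > 0.5)
```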

Evaluation of Cone-Beam Computed Tomography Images with Artificial Intelligence.

Arı T, Bayrakdar IS, Çelik Ö, Bilgir E, Kuran A, Orhan K

PubMed paper · Jun 30, 2025
This study aims to evaluate the success of artificial intelligence models developed using convolutional neural network-based algorithms on CBCT images. Labeling was performed with the segmentation method for 15 different conditions, namely caries, restorative filling material, root-canal filling material, dental implant, implant supported crown, crown, pontic, impacted tooth, supernumerary tooth, residual root, osteosclerotic area, periapical lesion, radiolucent jaw lesion, radiopaque jaw lesion, and mixed appearing jaw lesion, on a dataset consisting of 300 CBCT images. In model development, the Mask R-CNN architecture and the ResNet-101 model were used as a transfer learning method. The success metrics of the models were calculated with the confusion matrix method. When the F1 scores of the developed models were evaluated, the highest score (1.00) was obtained for dental implant and the lowest for mixed appearing jaw lesion. The F1 scores were, respectively: dental implant 1.00, root canal filling material 0.99, implant supported crown 0.98, restorative filling material 0.98, radiopaque jaw lesion 0.97, crown 0.96, pontic 0.96, impacted tooth 0.95, caries 0.94, residual tooth root 0.94, radiolucent jaw lesion 0.94, osteosclerotic area 0.90, periapical lesion 0.90, supernumerary tooth 0.87, and mixed appearing jaw lesion 0.80. Interpreting CBCT images is a time-consuming process that requires expertise. In the era of digital transformation, artificial intelligence-based systems that can automatically evaluate images and convert them into report format as a decision support mechanism will contribute to reducing the workload of physicians, thus increasing the time allocated to the interpretation of pathologies.
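
The per-class F1 scores above follow directly from confusion-matrix counts, as in the sketch below; the example counts are arbitrary, not values from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from confusion-matrix counts for a single class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(f1_score(tp=95, fp=3, fn=4))   # ~0.96 for a hypothetical class
```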

Deep learning for automated, motion-resolved tumor segmentation in radiotherapy.

Sarkar S, Teo PT, Abazeed ME

PubMed paper · Jun 30, 2025
Accurate tumor delineation is foundational to radiotherapy. In the era of deep learning, the automation of this labor-intensive and variation-prone process is increasingly tractable. We developed a deep neural network model to segment gross tumor volumes (GTVs) in the lung and propagate them across 4D CT images to generate an internal target volume (ITV), capturing tumor motion during respiration. Using a multicenter cohort-based registry from 9 clinics across 2 health systems, we trained a 3D UNet model (iSeg) on pre-treatment CT images and corresponding GTV masks (n = 739, 5-fold cross-validation) and validated it on two independent cohorts (n = 161; n = 102). The internal cohort achieved a median Dice similarity coefficient (DSC) of 0.73 [IQR: 0.62-0.80], with comparable performance in the external cohorts (DSC = 0.70 [0.52-0.78] and 0.71 [0.59-0.79]), indicating successful multi-site validation. iSeg matched human inter-observer variability and was robust to image quality and tumor motion (DSC = 0.77 [0.68-0.86]). Machine-generated ITVs were significantly smaller than physician-delineated contours (p < 0.0001), indicating more precise delineation. Notably, higher false-positive voxel rates (regions segmented by the machine but not the human) were associated with increased local failure (HR: 1.01 per voxel, p = 0.03), suggesting the clinical relevance of these discordant regions. These results mark a leap in automated target volume segmentation and suggest that machine delineation can enhance the accuracy, reproducibility, and efficiency of this core task in radiotherapy.
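
A common way to derive an ITV from per-phase GTV masks is the voxel-wise union across the respiratory cycle, sketched below; how iSeg propagates the GTV between phases is abstracted away here, so treat this only as an illustration of the ITV definition.

```python
import numpy as np

def itv_from_phases(phase_masks: list[np.ndarray]) -> np.ndarray:
    # phase_masks: one binary GTV mask per respiratory phase of the 4D CT
    itv = np.zeros_like(phase_masks[0], dtype=bool)
    for mask in phase_masks:
        itv |= mask.astype(bool)        # union captures the tumor motion envelope
    return itv

phases = [np.random.rand(64, 64, 64) > 0.97 for _ in range(10)]
itv = itv_from_phases(phases)
```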