
Chen J, Li J, Zhang C, Zhi X, Wang L, Zhang Q, Yu P, Tang F, Zha X, Wang L, Dai W, Xiong H, Sun J

PubMed · Aug 19, 2025
Convex probe endobronchial ultrasound (CP-EBUS) ultrasonographic features are important for diagnosing intrathoracic lymphadenopathy. Conventional methods for CP-EBUS imaging analysis rely heavily on physician expertise. To overcome this obstacle, we propose a deep learning-aided diagnostic system (AI-CEMA) to automatically select representative images, identify lymph nodes (LNs), and differentiate benign from malignant LNs based on CP-EBUS multimodal videos. AI-CEMA is first trained on 1,006 LNs from a single center, validated in a retrospective study, and then evaluated in a prospective multi-center study of 267 LNs. AI-CEMA achieves an area under the curve (AUC) of 0.8490 (95% confidence interval [CI], 0.8000-0.8980), comparable to experienced experts (AUC, 0.7847 [95% CI, 0.7320-0.8373]; p = 0.080). Additionally, AI-CEMA transfers successfully to a pulmonary lesion diagnosis task, obtaining a commendable AUC of 0.8192 (95% CI, 0.7676-0.8709). In conclusion, AI-CEMA shows great potential for the clinical diagnosis of intrathoracic lymphadenopathy and pulmonary lesions by providing automated, noninvasive, expert-level diagnosis.
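The headline results above are AUCs with 95% confidence intervals. As a point of reference, a minimal sketch of one standard way to obtain such an interval, a percentile bootstrap over the validation set, is below; the labels and scores are synthetic stand-ins, not study data.

```python
# Minimal sketch: AUC with a 95% bootstrap confidence interval, as reported
# for AI-CEMA vs. expert readers. Labels/scores below are synthetic stand-ins.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=267)  # benign (0) vs malignant (1) LNs
y_score = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, 267), 0, 1)  # model scores

def bootstrap_auc_ci(y, s, n_boot=2000, alpha=0.05):
    """Percentile bootstrap CI for the ROC AUC."""
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        if len(np.unique(y[idx])) < 2:  # resample must contain both classes
            continue
        aucs.append(roc_auc_score(y[idx], s[idx]))
    lo, hi = np.percentile(aucs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return roc_auc_score(y, s), lo, hi

auc, lo, hi = bootstrap_auc_ci(y_true, y_score)
print(f"AUC {auc:.4f} (95% CI, {lo:.4f}-{hi:.4f})")
```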

Wei N, Mathy RM, Chang DH, Mayer P, Liermann J, Springfeld C, Dill MT, Longerich T, Lurje G, Kauczor HU, Wielpütz MO, Öcal O

PubMed · Aug 19, 2025
Accurate prediction of tumor response after drug-eluting beads transarterial chemoembolization (DEB-TACE) remains challenging in hepatocellular carcinoma (HCC), given tumor heterogeneity and dynamic changes over time. Existing prediction models based on single-timepoint imaging do not capture dynamic treatment-induced changes. This study aims to develop and validate a predictive model that integrates deep learning and machine learning algorithms on longitudinal contrast-enhanced MRI (CE-MRI) to predict treatment response in HCC patients undergoing DEB-TACE. This retrospective study included 202 HCC patients (62.67 ± 9.25 years old) treated with DEB-TACE from 2004 to 2023, divided into a training cohort (n = 141) and a validation cohort (n = 61). Radiomics and deep learning features were extracted from standardized longitudinal CE-MRI to capture dynamic tumor changes. Feature selection involved correlation analysis, minimum redundancy maximum relevance (mRMR), and least absolute shrinkage and selection operator (LASSO) regression. The patients were categorized into two groups: the objective response group (n = 123, 60.9%; complete response = 35, 28.5%; partial response = 88, 71.5%) and the non-response group (n = 79, 39.1%; stable disease = 62, 78.5%; progressive disease = 17, 21.5%). Predictive models were constructed using radiomics, deep learning, and integrated features, and evaluated by the area under the receiver operating characteristic curve (AUC). A total of 7,182 radiomics features and 4,096 deep learning features were extracted from the longitudinal CE-MRI images. The integrated model, built from 13 quantitative radiomics features and 4 deep learning features, demonstrated robust performance with an AUC of 0.941 (95% CI: 0.893-0.989) in the training cohort, and an AUC of 0.925 (95% CI: 0.850-0.998) with an accuracy of 86.9%, sensitivity of 83.7%, and specificity of 94.4% in the validation set. This study presents a predictive model based on longitudinal CE-MRI data to estimate tumor response to DEB-TACE in HCC patients. By capturing tumor dynamics and integrating radiomics with deep learning features, the model has the potential to guide individualized treatment strategies and inform clinical decision-making regarding patient management. The online version contains supplementary material available at 10.1186/s40644-025-00926-5.
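As a rough illustration of the feature-selection chain named above (correlation analysis followed by sparsity-inducing regression), here is a hedged scikit-learn sketch; the mRMR step would typically require a dedicated package and is omitted, and all data are synthetic placeholders.

```python
# Sketch of a radiomics feature-selection chain: (1) drop highly correlated
# features, (2) keep features with non-zero LASSO logistic-regression weights.
# Synthetic data stands in for the paper's radiomics + deep feature matrix.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = pd.DataFrame(rng.normal(size=(141, 200)),  # training cohort, 200 features
                 columns=[f"feat_{i}" for i in range(200)])
y = rng.integers(0, 2, size=141)               # objective response vs. none

# 1) correlation filter: keep one feature from each highly correlated pair
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
X = X.drop(columns=[c for c in upper.columns if (upper[c] > 0.9).any()])

# 2) LASSO (L1) logistic regression; non-zero coefficients = selected features
Xs = StandardScaler().fit_transform(X)
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(Xs, y)
selected = X.columns[np.abs(lasso.coef_).ravel() > 0]
print(f"{len(selected)} features selected:", list(selected[:5]), "...")
```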

Endo H, Hirata K, Magota K, Yoshimura T, Katoh C, Kudo K

PubMed · Aug 19, 2025
Positron emission tomography (PET) is a valuable tool for cancer diagnosis but generally has a lower spatial resolution than computed tomography (CT) or magnetic resonance imaging (MRI). High-resolution PET scanners that use silicon photomultipliers and time-of-flight measurements are expensive, so cost-effective software-based super-resolution methods are required. This study proposes a novel approach for enhancing whole-body PET image resolution by applying a 2.5-dimensional Super-Resolution Convolutional Neural Network (2.5D-SRCNN) combined with logarithmic transformation preprocessing. This method aims to improve image quality and maintain quantitative accuracy, particularly for standardized uptake value (SUV) measurements, while providing a memory-efficient alternative to full three-dimensional processing and managing the wide dynamic range of tracer uptake in PET images. We analyzed data from 90 patients who underwent whole-body FDG-PET/CT examinations and reconstructed low-resolution slices with a voxel size of 4 × 4 × 4 mm and corresponding high-resolution (HR) slices with a voxel size of 2 × 2 × 2 mm. The proposed 2.5D-SRCNN model, based on the conventional 2D-SRCNN structure, incorporates information from adjacent slices to generate a high-resolution output. Logarithmic transformation of the voxel values was applied to manage the large dynamic range caused by physiological tracer accumulation in the bladder. Performance was assessed using the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), and the quantitative accuracy of SUVs was validated in a phantom study. The results demonstrated that the 2.5D-SRCNN with logarithmic transformation significantly outperformed the conventional 2D-SRCNN in terms of PSNR and SSIM (p < 0.0001). The proposed method also showed an improved depiction of small spheres in the phantom while maintaining SUV accuracy. Our method may therefore be effective in generating whole-body PET super-resolution images with lower spatial error and better quantitative accuracy. The online version contains supplementary material available at 10.1186/s40658-025-00791-y.
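A minimal PyTorch sketch of the 2.5D idea, adjacent slices entering as input channels of an SRCNN-style stack with logarithmic preprocessing to compress the SUV dynamic range, might look as follows; the 9-1-5 kernel layout follows the classic SRCNN and is an assumption, not the authors' exact configuration.

```python
# Hedged sketch: three adjacent low-resolution PET slices are stacked as
# channels, an SRCNN-style network predicts the centre high-resolution slice,
# and voxels are log-transformed to tame the wide SUV dynamic range.
import torch
import torch.nn as nn

class SRCNN25D(nn.Module):
    def __init__(self, n_slices: int = 3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_slices, 64, kernel_size=9, padding=4),  # feature extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),                   # non-linear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),         # reconstruction
        )

    def forward(self, x):  # x: (B, n_slices, H, W), already on the upsampled grid
        return self.body(x)

def log_transform(vox, eps=1e-3):
    """Compress PET dynamic range (e.g., bladder hot spots) before the CNN."""
    return torch.log(vox + eps)

def inverse_log_transform(vox, eps=1e-3):
    return torch.exp(vox) - eps

model = SRCNN25D()
lr_stack = torch.rand(2, 3, 128, 128) * 20.0  # synthetic SUV-like values
pred_hr = inverse_log_transform(model(log_transform(lr_stack)))
print(pred_hr.shape)  # torch.Size([2, 1, 128, 128])
```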

Sang S, Sun Z, Zheng W, Wang W, Islam MT, Chen Y, Yuan Q, Cheng C, Xi S, Han Z, Zhang T, Wu L, Li W, Xie J, Feng W, Chen Y, Xiong W, Yu J, Li G, Li Z, Jiang Y

PubMed · Aug 19, 2025
Adjuvant chemotherapy and immune checkpoint blockade can produce durable anti-tumor responses, but the lack of effective biomarkers limits their therapeutic benefit. Using multiple cohorts comprising 3,095 patients with gastric cancer, we propose an attention-enhanced residual Swin Transformer network to predict chemotherapy response (the main task), with two prediction subtasks (ImmunoScore and periostin [POSTN]) used as intermediate tasks to improve the model's performance. Furthermore, we assess whether the model can identify which patients would benefit from immunotherapy. The deep learning model achieves high accuracy in predicting chemotherapy response and the tumor microenvironment (ImmunoScore and POSTN). We further find that the model can identify which patients may benefit from checkpoint blockade immunotherapy. This approach offers precise chemotherapy and immunotherapy response predictions, opening avenues for personalized treatment options. Prospective studies are warranted to validate its clinical utility.
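The intermediate-task design described above amounts to multi-task learning with auxiliary heads on a shared encoder. A hedged sketch follows, with a small ResNet backbone standing in for the attention-enhanced residual Swin Transformer and assumed loss weights.

```python
# Hedged sketch: a shared image encoder feeds a main head (chemotherapy
# response) plus two intermediate heads (ImmunoScore, POSTN) whose losses
# regularize the shared features. Weights and dimensions are illustrative.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()             # expose 512-d features
        self.encoder = backbone
        self.response_head = nn.Linear(512, 1)  # main task
        self.immuno_head = nn.Linear(512, 1)    # intermediate task 1
        self.postn_head = nn.Linear(512, 1)     # intermediate task 2

    def forward(self, x):
        z = self.encoder(x)
        return self.response_head(z), self.immuno_head(z), self.postn_head(z)

model = MultiTaskNet()
bce = nn.BCEWithLogitsLoss()
x = torch.rand(4, 3, 224, 224)
y_resp, y_imm, y_postn = (torch.randint(0, 2, (4, 1)).float() for _ in range(3))
r, i, p = model(x)
loss = bce(r, y_resp) + 0.5 * bce(i, y_imm) + 0.5 * bce(p, y_postn)  # assumed weights
loss.backward()
```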

Zahra TehraniNasab, Amar Kumar, Tal Arbel

arXiv preprint · Aug 19, 2025
Advancements in diffusion-based foundation models have improved text-to-image generation, yet most efforts have been limited to low-resolution settings. As high-resolution image synthesis becomes increasingly essential for various applications, particularly in medical imaging domains, fine-tuning emerges as a crucial mechanism for adapting these powerful pre-trained models to task-specific requirements and data distributions. In this work, we present a systematic study examining the impact of various fine-tuning techniques on image generation quality when scaling to a high resolution of 512×512 pixels. We benchmark a diverse set of fine-tuning methods, including full fine-tuning strategies and parameter-efficient fine-tuning (PEFT). We dissect how different fine-tuning methods influence key quality metrics, including Fréchet Inception Distance (FID), Vendi score, and prompt-image alignment. We also evaluate the utility of generated images in a downstream classification task under data-scarce conditions, demonstrating that specific fine-tuning strategies improve both generation fidelity and downstream performance when synthetic images are used for classifier training and evaluation on real images. Our code is accessible through the project website: https://tehraninasab.github.io/PixelUPressure/.
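Of the quality metrics named above, FID is the most mechanical to reproduce. A minimal torchmetrics sketch is below; random tensors stand in for real 512×512 images and diffusion samples, and the reduced Inception feature dimension is chosen only to suit the tiny batch.

```python
# Hedged sketch: Fréchet Inception Distance between real and generated
# batches via torchmetrics (requires torchmetrics[image]). Random uint8
# tensors stand in for real images and diffusion-model samples.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=64)  # 2048 is standard; 64 suits a tiny batch
real = torch.randint(0, 256, (32, 3, 512, 512), dtype=torch.uint8)
fake = torch.randint(0, 256, (32, 3, 512, 512), dtype=torch.uint8)
fid.update(real, real=True)
fid.update(fake, real=False)
print(f"FID: {fid.compute():.2f}")  # lower is better
```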

Niklas Bubeck, Suprosanna Shit, Chen Chen, Can Zhao, Pengfei Guo, Dong Yang, Georg Zitzlsberger, Daguang Xu, Bernhard Kainz, Daniel Rueckert, Jiazhen Pan

arXiv preprint · Aug 19, 2025
Cardiac Magnetic Resonance (CMR) imaging is a critical tool for diagnosing and managing cardiovascular disease, yet its utility is often limited by the sparse acquisition of 2D short-axis slices, resulting in incomplete volumetric information. Accurate 3D reconstruction from these sparse slices is essential for comprehensive cardiac assessment, but existing methods face challenges, including reliance on predefined interpolation schemes (e.g., linear or spherical), computational inefficiency, and dependence on additional semantic inputs such as segmentation labels or motion data. To address these limitations, we propose a novel Cardiac Latent Interpolation Diffusion (CaLID) framework that introduces three key innovations. First, we present a data-driven interpolation scheme based on diffusion models, which can capture complex, non-linear relationships between sparse slices and improves reconstruction accuracy. Second, we design a computationally efficient method that operates in the latent space and speeds up 3D whole-heart upsampling time by a factor of 24, reducing computational overhead compared to previous methods. Third, with only sparse 2D CMR images as input, our method achieves state-of-the-art (SOTA) performance against baseline methods, eliminating the need for auxiliary input such as morphological guidance, thus simplifying workflows. We further extend our method to 2D+T data, enabling the effective modeling of spatiotemporal dynamics and ensuring temporal coherence. Extensive volumetric evaluations and downstream segmentation tasks demonstrate that CaLID achieves superior reconstruction quality and efficiency. By addressing the fundamental limitations of existing approaches, our framework advances the state of the art for spatial and spatiotemporal whole-heart reconstruction, offering a robust and clinically practical solution for cardiovascular imaging.
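To make the data-driven interpolation idea concrete, here is a heavily simplified toy sketch: a denoiser, conditioned on the latents of two neighboring short-axis slices, iteratively refines noise into the intermediate slice's latent. The tiny MLP denoiser, linear noise schedule, and DDPM-style update are illustrative stand-ins, not the paper's actual latent diffusion model.

```python
# Toy sketch of latent-space slice interpolation with a conditional denoiser.
# Everything here (denoiser, schedule, latent size) is an assumption.
import torch
import torch.nn as nn

LATENT = 64
denoiser = nn.Sequential(  # predicts noise from (z_t, z_prev, z_next, t)
    nn.Linear(3 * LATENT + 1, 256), nn.SiLU(), nn.Linear(256, LATENT)
)

@torch.no_grad()
def interpolate_latent(z_prev, z_next, steps=50):
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    z = torch.randn_like(z_prev)  # start from pure noise
    for t in reversed(range(steps)):
        t_in = torch.full((z.shape[0], 1), t / steps)
        eps = denoiser(torch.cat([z, z_prev, z_next, t_in], dim=-1))
        a, ab = 1.0 - betas[t], alphas_bar[t]
        z = (z - betas[t] / (1 - ab).sqrt() * eps) / a.sqrt()  # DDPM mean update
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return z

z_prev, z_next = torch.randn(2, 1, LATENT).unbind(0)  # latents of adjacent slices
z_mid = interpolate_latent(z_prev, z_next)
print(z_mid.shape)  # torch.Size([1, 64])
```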

Justin Yiu, Kushank Arora, Daniel Steinberg, Rohit Ghiya

arXiv preprint · Aug 19, 2025
Magnetic Resonance Imaging (MRI) is an essential diagnostic tool for assessing knee injuries. However, manual interpretation of MRI slices remains time-consuming and prone to inter-observer variability. This study presents a systematic evaluation of various deep learning architectures combined with explainable AI (xAI) techniques for automated region of interest (ROI) detection in knee MRI scans. We investigate both supervised and self-supervised approaches, including ResNet50, InceptionV3, Vision Transformers (ViT), and multiple U-Net variants augmented with multi-layer perceptron (MLP) classifiers. To enhance interpretability and clinical relevance, we integrate xAI methods such as Grad-CAM and Saliency Maps. Model performance is assessed using AUC for classification and PSNR/SSIM for reconstruction quality, along with qualitative ROI visualizations. Our results demonstrate that ResNet50 consistently excels in classification and ROI identification, outperforming transformer-based models under the constraints of the MRNet dataset. While hybrid U-Net + MLP approaches show potential for leveraging spatial features in reconstruction and interpretability, their classification performance remains lower. Grad-CAM consistently provided the most clinically meaningful explanations across architectures. Overall, CNN-based transfer learning emerges as the most effective approach for this dataset, while future work with larger-scale pretraining may better unlock the potential of transformer models.
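Grad-CAM, which the study found most clinically meaningful, is straightforward to sketch with forward/backward hooks on a torchvision ResNet50; the target layer and class index below are illustrative choices, and a random tensor stands in for an MRI slice.

```python
# Hedged Grad-CAM sketch: capture activations and gradients at the last
# conv block, weight activations by channel-averaged gradients, ReLU, upsample.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

model = resnet50(weights=None).eval()
feats, grads = {}, {}
layer = model.layer4[-1]  # last conv block (illustrative target layer)
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in MRI slice
score = model(x)[0, 1]                              # logit of target class 1
score.backward()

w = grads["a"].mean(dim=(2, 3), keepdim=True)            # channel-wise weights
cam = F.relu((w * feats["a"]).sum(dim=1, keepdim=True))  # weighted activation map
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
print(cam.shape)  # torch.Size([1, 1, 224, 224])
```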

Tohyama, T., Han, A., Yoon, D., Paik, K., Gow, B., Izath, N., Kpodonu, J., Celi, L. A.

medRxiv preprint · Aug 19, 2025
Background and Aims: Echocardiography serves as a cornerstone of cardiovascular diagnostics through multiple standardized imaging views. While recent AI foundation models demonstrate superior capabilities across cardiac imaging tasks, their massive computational requirements and reliance on large-scale datasets create accessibility barriers, limiting AI development to well-resourced institutions. Vector embedding approaches offer promising solutions by leveraging compact representations of the original medical images for downstream applications. Furthermore, demographic fairness remains critical, as AI models may incorporate biases that confound clinically relevant features. We developed a multi-view encoder framework to address computational accessibility while investigating demographic fairness challenges.

Methods: We utilized the MIMIC-IV-ECHO dataset (7,169 echocardiographic studies) to develop a transformer-based multi-view encoder that aggregates view-level representations into study-level embeddings. The framework incorporated adversarial learning to suppress demographic information while maintaining clinical performance. We evaluated performance across 21 binary classification tasks encompassing echocardiographic measurements and clinical diagnoses, comparing against foundation model baselines with varying adversarial weights.

Results: The multi-view encoder achieved a mean improvement of 9.0 AUC points (12.0% relative improvement) across clinical tasks compared to foundation model embeddings. Performance remained robust with limited echocardiographic views compared to the conventional approach. However, adversarial learning showed limited effectiveness in reducing demographic shortcuts, with stronger weighting substantially compromising diagnostic performance.

Conclusions: Our framework democratizes advanced cardiac AI capabilities, enabling substantial diagnostic improvements without massive computational infrastructure. While algorithmic approaches to demographic fairness showed limitations, the multi-view encoder provides a practical pathway for broader AI adoption in cardiovascular medicine with enhanced efficiency in real-world clinical settings.

Key Question: Can multi-view encoder frameworks achieve superior diagnostic performance compared to foundation model embeddings while reducing computational requirements and maintaining robust performance with fewer echocardiographic views for cardiac AI applications?

Key Finding: The multi-view encoder achieved a 12.0% relative improvement (9.0 AUC points) across 21 cardiac tasks compared to foundation model baselines, with efficient 512-dimensional vector embeddings and robust performance using fewer echocardiographic views.

Take-home Message: Vector embedding approaches with attention-based multi-view integration significantly improve cardiac diagnostic performance while reducing computational requirements, offering a pathway toward more efficient AI implementation in clinical settings.

Translational Perspective: Our proposed multi-view encoder framework overcomes critical barriers to the widespread adoption of artificial intelligence in echocardiography. By dramatically reducing computational requirements, the multi-view encoder approach allows smaller healthcare institutions to develop sophisticated AI models locally. The framework maintains robust performance with fewer echocardiographic examinations, which addresses real-world clinical constraints where comprehensive imaging is not feasible due to patient factors or time limitations. This technology provides a practical way to democratize advanced cardiac AI capabilities, which could improve access to cardiovascular care across diverse healthcare settings while reducing dependence on proprietary datasets and massive computational resources.
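A hedged PyTorch sketch of the two mechanisms described above, attention-based aggregation of per-view embeddings into a study-level vector and a gradient-reversal adversary for demographic suppression, might look like this; dimensions and the adversarial weight are illustrative assumptions.

```python
# Hedged sketch: a transformer encoder attends over per-view echo embeddings
# to form a study-level embedding; a gradient-reversal head penalizes
# demographic information during training.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x
    @staticmethod
    def backward(ctx, g):
        return -ctx.lam * g, None  # flip gradients flowing into the encoder

class MultiViewEncoder(nn.Module):
    def __init__(self, dim=512, n_heads=8, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))  # study-level token
        self.task_head = nn.Linear(dim, 1)               # clinical label
        self.demo_head = nn.Linear(dim, 2)               # adversary (e.g., sex)

    def forward(self, views, lam=0.1):  # views: (B, n_views, 512)
        tokens = torch.cat([self.cls.expand(len(views), -1, -1), views], dim=1)
        study = self.encoder(tokens)[:, 0]  # study-level embedding
        return self.task_head(study), self.demo_head(GradReverse.apply(study, lam))

model = MultiViewEncoder()
task_logit, demo_logit = model(torch.randn(4, 6, 512))  # 6 views per study
```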

Boschenriedter C, Rubbert C, Vach M, Caspers J

PubMed · Aug 18, 2025
Selection of appropriate imaging sequence protocols for cranial magnetic resonance imaging (MRI) is crucial to address the medical question and adequately support patient care. Inappropriate protocol selection can compromise diagnostic accuracy, extend scan duration, and increase the risk of misdiagnosis. Typically, radiologists determine scanning protocols based on their expertise, a process that can be time-consuming and subject to variability. Language models offer the potential to streamline this process. This study investigates the capability of bidirectional encoder representations from transformers (BERT)-based models to suggest appropriate MRI protocols based on referral information. A total of 410 anonymized electronic referrals for cranial MRI from a local order-entry system were categorized into nine protocol classes by an experienced neuroradiologist. Locally hosted instances of four different pre-trained BERT-based classifiers (BERT, ModernBERT, GottBERT, and medBERT.de) were trained to classify protocols based on referral entries, including preliminary diagnoses, prior treatment history, and clinical questions. Each model was additionally fine-tuned for the local language on a large dataset of electronic referrals. The model based on medBERT.de with local-language fine-tuning performed best, correctly predicting 81% of all protocols and achieving a macro-F1 score of 0.71, with macro-precision and macro-recall of 0.73 and 0.71, respectively. Moreover, local-language fine-tuning led to performance improvements across all models. These results demonstrate the potential of language models to predict MRI protocols, even with limited training data. This approach could accelerate and standardize radiological protocol selection, offering significant benefits for clinical workflows.
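In HuggingFace terms, the setup described above reduces to sequence classification over referral text with nine labels. A hedged sketch follows; the checkpoint name and example referral are illustrative, not the study's exact configuration.

```python
# Hedged sketch: a German-language BERT checkpoint set up to map referral
# text to one of nine MRI protocol classes (head would still need training).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "bert-base-german-cased"  # stand-in; the study also used GottBERT, medBERT.de
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=9)

referral = ("V.a. Schlaganfall, Z.n. TIA vor 2 Jahren. "
            "Frage: frische Ischämie? Blutung?")  # example referral text
inputs = tok(referral, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
print("Predicted protocol class:", logits.argmax(-1).item())
```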

Ruru Xu, Ilkay Oksuz

arXiv preprint · Aug 18, 2025
Deep learning-based cardiac MRI reconstruction faces significant domain shift challenges when deployed across multiple clinical centers with heterogeneous scanner configurations and imaging protocols. We propose HierAdaptMR, a hierarchical feature adaptation framework that addresses multi-level domain variations through parameter-efficient adapters. Our method employs Protocol-Level Adapters for sequence-specific characteristics and Center-Level Adapters for scanner-dependent variations, built upon a variational unrolling backbone. A Universal Adapter enables generalization to entirely unseen centers through stochastic training that learns center-invariant adaptations. The framework utilizes multi-scale SSIM loss with frequency domain enhancement and contrast-adaptive weighting for robust optimization. Comprehensive evaluation on the CMRxRecon2025 dataset spanning 5+ centers, 10+ scanners, and 9 modalities demonstrates superior cross-center generalization while maintaining reconstruction quality. code: https://github.com/Ruru-Xu/HierAdaptMR
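The parameter-efficient adapter idea at the heart of HierAdaptMR can be sketched as a small bottleneck module with a residual connection, inserted per protocol or per center while the reconstruction backbone stays frozen; the sizes below are illustrative assumptions.

```python
# Hedged sketch: bottleneck adapter (down-project, non-linearity, up-project,
# residual) attached to a frozen stand-in backbone block.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter initialized as an identity mapping."""
    def __init__(self, channels=64, bottleneck=8):
        super().__init__()
        self.down = nn.Conv2d(channels, bottleneck, kernel_size=1)
        self.up = nn.Conv2d(bottleneck, channels, kernel_size=1)
        nn.init.zeros_(self.up.weight)  # start as identity: adapter adds nothing
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

backbone = nn.Conv2d(64, 64, 3, padding=1)  # stand-in for one unrolled recon block
for p in backbone.parameters():
    p.requires_grad = False                 # only adapters are trained

center_adapter = Adapter()                  # one per center/protocol in the paper
x = torch.randn(1, 64, 128, 128)
out = center_adapter(backbone(x))
print(out.shape)  # torch.Size([1, 64, 128, 128])
```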