
CT-based auto-segmentation of multiple target volumes for all-in-one radiotherapy in rectal cancer patients.

Li X, Wang L, Yang M, Li X, Zhao T, Wang M, Lu S, Ji Y, Zhang W, Jia L, Peng R, Wang J, Wang H

PubMed | Aug 19, 2025
This study aimed to evaluate the clinical feasibility and performance of CT-based auto-segmentation models integrated into an All-in-One (AIO) radiotherapy workflow for rectal cancer. The study included 312 rectal cancer patients, with 272 used to train three nnU-Net models for CTV45, CTV50, and GTV segmentation, and 40 used for evaluation across one internal cohort (n = 10), one clinical AIO cohort (n = 10), and two external cohorts (n = 10 each). Segmentation accuracy (DSC, HD, HD95, ASSD, ASD) and time efficiency were assessed. In the internal testing set, mean DSC for CTV45, CTV50, and GTV were 0.90, 0.86, and 0.71; HD were 17.08, 25.48, and 79.59 mm; HD95 were 4.89, 7.33, and 56.49 mm; ASSD were 1.23, 1.90, and 6.69 mm; and ASD were 1.24, 1.58, and 11.61 mm. Auto-segmentation reduced manual delineation time by 63.3–88.3% (p < 0.0001). In clinical practice, mean DSC for CTV45, CTV50, and GTV were 0.93, 0.88, and 0.78; HD were 13.56, 23.84, and 35.38 mm; HD95 were 3.33, 6.46, and 21.34 mm; ASSD were 0.78, 1.49, and 3.30 mm; and ASD were 0.74, 1.18, and 2.13 mm. Multi-center testing also demonstrated the applicability of the models, with mean DSC of 0.84 for CTV45 and 0.80 for GTV. The models demonstrated high accuracy and clinical utility, effectively streamlining target volume delineation and reducing manual workload in routine practice. The study protocol was approved by the Institutional Review Board of Peking University Third Hospital (Approval No. (2024) Medical Ethics Review No. 182-01).
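The overlap and surface-distance metrics reported here (DSC, HD95, ASSD) can be reproduced from binary masks and voxel spacing; below is a minimal sketch using NumPy/SciPy on 3D boolean arrays, offered as an illustration rather than the study's actual evaluation code.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def surface_distances(pred: np.ndarray, gt: np.ndarray, spacing) -> np.ndarray:
    """Distances (mm) from each surface voxel of `pred` to the surface of `gt`."""
    pred_surf = pred ^ binary_erosion(pred)
    gt_surf = gt ^ binary_erosion(gt)
    # Distance map to the ground-truth surface, respecting voxel spacing.
    dt_gt = distance_transform_edt(~gt_surf, sampling=spacing)
    return dt_gt[pred_surf]

def hd95_and_assd(pred, gt, spacing=(1.0, 1.0, 1.0)):
    """HD95 and average symmetric surface distance; `spacing` is the CT voxel size in mm."""
    d_pg = surface_distances(pred, gt, spacing)
    d_gp = surface_distances(gt, pred, spacing)
    hd95 = max(np.percentile(d_pg, 95), np.percentile(d_gp, 95))
    assd = (d_pg.sum() + d_gp.sum()) / (len(d_pg) + len(d_gp))
    return hd95, assd
```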

Longitudinal CE-MRI-based Siamese network with machine learning to predict tumor response in HCC after DEB-TACE.

Wei N, Mathy RM, Chang DH, Mayer P, Liermann J, Springfeld C, Dill MT, Longerich T, Lurje G, Kauczor HU, Wielpütz MO, Öcal O

PubMed | Aug 19, 2025
Accurate prediction of tumor response after drug-eluting beads transarterial chemoembolization (DEB-TACE) remains challenging in hepatocellular carcinoma (HCC), given tumor heterogeneity and dynamic changes over time, and existing prediction models based on single-timepoint imaging do not capture treatment-induced changes. This study aimed to develop and validate a predictive model that integrates deep learning and machine learning algorithms on longitudinal contrast-enhanced MRI (CE-MRI) to predict treatment response in HCC patients undergoing DEB-TACE. This retrospective study included 202 HCC patients (62.67 ± 9.25 years old) treated with DEB-TACE from 2004 to 2023, divided into a training cohort (n = 141) and a validation cohort (n = 61). Radiomics and deep learning features were extracted from standardized longitudinal CE-MRI to capture dynamic tumor changes; feature selection involved correlation analysis, minimum redundancy maximum relevance, and least absolute shrinkage and selection operator (LASSO) regression. Patients were categorized into an objective response group (n = 123, 60.9%; complete response = 35, 28.5%; partial response = 88, 71.5%) and a non-response group (n = 79, 39.1%; stable disease = 62, 78.5%; progressive disease = 17, 21.5%). Predictive models were constructed using radiomics, deep learning, and integrated features, and the area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. A total of 7,182 radiomics features and 4,096 deep learning features were extracted from the longitudinal CE-MRI images. The integrated model, built from 13 quantitative radiomics features and 4 deep learning features, demonstrated robust performance with an AUC of 0.941 (95% CI: 0.893–0.989) in the training cohort, and an AUC of 0.925 (95% CI: 0.850–0.998), accuracy of 86.9%, sensitivity of 83.7%, and specificity of 94.4% in the validation set. This study presents a predictive model based on longitudinal CE-MRI data to estimate tumor response to DEB-TACE in HCC patients. By capturing tumor dynamics and integrating radiomics with deep learning features, the model has the potential to guide individualized treatment strategies and inform clinical decision-making regarding patient management. The online version contains supplementary material available at 10.1186/s40644-025-00926-5.
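The correlation-filter and LASSO stages of such a feature-selection cascade are easy to prototype with scikit-learn; the sketch below is a generic illustration (the mRMR step usually needs a separate package and is omitted, and feature matrices, thresholds, and the regularization strength are assumptions, not the authors' settings).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop one feature of every pair whose absolute Pearson correlation exceeds `threshold`."""
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)

def fit_response_model(X: pd.DataFrame, y: np.ndarray):
    """X: rows = patients, columns = radiomics + deep features; y: 1 = objective response."""
    X_red = drop_correlated(X)
    # L1-penalised logistic regression plays the role of LASSO selection + classifier.
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=0.1, max_iter=5000),
    )
    auc = cross_val_score(model, X_red, y, cv=5, scoring="roc_auc").mean()
    return model.fit(X_red, y), auc
```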

Development and validation of 3D super-resolution convolutional neural network for ¹⁸F-FDG-PET images.

Endo H, Hirata K, Magota K, Yoshimura T, Katoh C, Kudo K

PubMed | Aug 19, 2025
Positron emission tomography (PET) is a valuable tool for cancer diagnosis but generally has a lower spatial resolution compared to computed tomography (CT) or magnetic resonance imaging (MRI). High-resolution PET scanners that use silicon photomultipliers and time-of-flight measurements are expensive. Therefore, cost-effective software-based super-resolution methods are required. This study proposes a novel approach for enhancing whole-body PET image resolution by applying a 2.5-dimensional Super-Resolution Convolutional Neural Network (2.5D-SRCNN) combined with logarithmic transformation preprocessing. This method aims to improve image quality and maintain quantitative accuracy, particularly for standardized uptake value measurements, while providing a memory-efficient alternative to full three-dimensional processing and managing the wide dynamic range of tracer uptake in PET images. We analyzed data from 90 patients who underwent whole-body FDG-PET/CT examinations and reconstructed low-resolution slices with a voxel size of 4 × 4 × 4 mm and corresponding high-resolution (HR) slices with a voxel size of 2 × 2 × 2 mm. The proposed 2.5D-SRCNN model, based on the conventional 2D-SRCNN structure, incorporates information from adjacent slices to generate a high-resolution output. Logarithmic transformation of the voxel values was applied to manage the large dynamic range caused by physiological tracer accumulation in the bladder. Performance was assessed using the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). The quantitative accuracy of standardized uptake values (SUV) was validated using a phantom study. The results demonstrated that the 2.5D-SRCNN with logarithmic transformation significantly outperformed the conventional 2D-SRCNN in terms of PSNR and SSIM (p < 0.0001). The proposed method also showed an improved depiction of small spheres in the phantom while maintaining the accuracy of the SUV. Our proposed method for whole-body PET images using a super-resolution model with the 2.5D approach and logarithmic transformation may be effective in generating super-resolution images with a lower spatial error and better quantitative accuracy. The online version contains supplementary material available at 10.1186/s40658-025-00791-y.
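A 2.5D SRCNN takes a stack of adjacent low-resolution slices as input channels and predicts the central high-resolution slice; the PyTorch sketch below follows the classic 9-1-5 SRCNN layout with a log1p/expm1 wrapper for the wide SUV dynamic range. Layer widths, slice count, and the exact preprocessing are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SRCNN25D(nn.Module):
    """2.5D SRCNN: three adjacent PET slices in, one super-resolved central slice out."""
    def __init__(self, n_slices: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_slices, 64, kernel_size=9, padding=4),  # feature extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),                   # non-linear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),         # reconstruction
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # log1p compresses the dynamic range (e.g. high bladder uptake);
        # expm1 maps the prediction back to SUV-like values.
        return torch.expm1(self.net(torch.log1p(x.clamp(min=0))))

# Usage on a batch of interpolated low-resolution slice stacks (N, 3, H, W):
model = SRCNN25D()
hr_pred = model(torch.rand(2, 3, 192, 192))
```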

TME-guided deep learning predicts chemotherapy and immunotherapy response in gastric cancer with attention-enhanced residual Swin Transformer.

Sang S, Sun Z, Zheng W, Wang W, Islam MT, Chen Y, Yuan Q, Cheng C, Xi S, Han Z, Zhang T, Wu L, Li W, Xie J, Feng W, Chen Y, Xiong W, Yu J, Li G, Li Z, Jiang Y

PubMed | Aug 19, 2025
Adjuvant chemotherapy and immune checkpoint blockade can elicit durable anti-tumor responses, but the lack of effective biomarkers limits their therapeutic benefit. Using multiple cohorts comprising 3,095 patients with gastric cancer, we propose an attention-enhanced residual Swin Transformer network to predict chemotherapy response (the main task), with two prediction subtasks (ImmunoScore and periostin [POSTN]) used as intermediate tasks to improve the model's performance. Furthermore, we assess whether the model can identify which patients would benefit from immunotherapy. The deep learning model achieves high accuracy in predicting chemotherapy response and the tumor microenvironment markers (ImmunoScore and POSTN). We further find that the model can identify which patients may benefit from checkpoint blockade immunotherapy. This approach offers precise chemotherapy and immunotherapy response predictions, opening avenues for personalized treatment options. Prospective studies are warranted to validate its clinical utility.
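Using ImmunoScore and POSTN prediction as auxiliary heads alongside the main response head amounts to a weighted multi-task objective on shared backbone features; the sketch below is a generic illustration of that idea (head shapes and the auxiliary weight are assumptions, not the paper's values).

```python
import torch
import torch.nn as nn

class MultiTaskHead(nn.Module):
    """Shared backbone features feed one main and two auxiliary prediction heads."""
    def __init__(self, feat_dim: int = 768):
        super().__init__()
        self.response_head = nn.Linear(feat_dim, 1)  # main task: chemotherapy response
        self.immuno_head = nn.Linear(feat_dim, 1)    # auxiliary: ImmunoScore
        self.postn_head = nn.Linear(feat_dim, 1)     # auxiliary: POSTN

    def forward(self, feats: torch.Tensor):
        return self.response_head(feats), self.immuno_head(feats), self.postn_head(feats)

def multitask_loss(outputs, targets, aux_weight: float = 0.3) -> torch.Tensor:
    """Main binary cross-entropy plus down-weighted auxiliary losses."""
    bce = nn.BCEWithLogitsLoss()
    resp, immuno, postn = outputs
    y_resp, y_immuno, y_postn = targets
    return bce(resp, y_resp) + aux_weight * (bce(immuno, y_immuno) + bce(postn, y_postn))
```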

Pixels Under Pressure: Exploring Fine-Tuning Paradigms for Foundation Models in High-Resolution Medical Imaging

Zahra TehraniNasab, Amar Kumar, Tal Arbel

arXiv preprint | Aug 19, 2025
Advancements in diffusion-based foundation models have improved text-to-image generation, yet most efforts have been limited to low-resolution settings. As high-resolution image synthesis becomes increasingly essential for various applications, particularly in medical imaging domains, fine-tuning emerges as a crucial mechanism for adapting these powerful pre-trained models to task-specific requirements and data distributions. In this work, we present a systematic study examining the impact of various fine-tuning techniques on image generation quality when scaling to a high resolution of 512x512 pixels. We benchmark a diverse set of fine-tuning methods, including full fine-tuning strategies and parameter-efficient fine-tuning (PEFT). We dissect how different fine-tuning methods influence key quality metrics, including Fréchet Inception Distance (FID), Vendi score, and prompt-image alignment. We also evaluate the utility of generated images in a downstream classification task under data-scarce conditions, demonstrating that specific fine-tuning strategies improve both generation fidelity and downstream performance when synthetic images are used for classifier training and evaluation on real images. Our code is accessible through the project website: https://tehraninasab.github.io/PixelUPressure/.
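Among common PEFT methods, LoRA-style adapters add a trainable low-rank update W + (alpha/r)·BA on top of frozen pretrained weights; the self-contained sketch below shows the idea for a single linear layer (it is not the paper's training code, and applying it to a diffusion backbone's attention projections is an assumption about where such adapters are typically placed).

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank (LoRA) update."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze pretrained weights
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # start as an identity-preserving update
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Example: adapt a 320-dim projection; only ~2 * r * 320 parameters are trainable.
proj = LoRALinear(nn.Linear(320, 320))
out = proj(torch.randn(4, 320))
```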

Multi-View Echocardiographic Embedding for Accessible AI Development

Tohyama, T., Han, A., Yoon, D., Paik, K., Gow, B., Izath, N., Kpodonu, J., Celi, L. A.

medRxiv preprint | Aug 19, 2025
Background and Aims: Echocardiography serves as a cornerstone of cardiovascular diagnostics through multiple standardized imaging views. While recent AI foundation models demonstrate superior capabilities across cardiac imaging tasks, their massive computational requirements and reliance on large-scale datasets create accessibility barriers, limiting AI development to well-resourced institutions. Vector embedding approaches offer promising solutions by leveraging compact representations from original medical images for downstream applications. Furthermore, demographic fairness remains critical, as AI models may incorporate biases that confound clinically relevant features. We developed a multi-view encoder framework to address computational accessibility while investigating demographic fairness challenges.
Methods: We utilized the MIMIC-IV-ECHO dataset (7,169 echocardiographic studies) to develop a transformer-based multi-view encoder that aggregates view-level representations into study-level embeddings. The framework incorporated adversarial learning to suppress demographic information while maintaining clinical performance. We evaluated performance across 21 binary classification tasks encompassing echocardiographic measurements and clinical diagnoses, comparing against foundation model baselines with varying adversarial weights.
Results: The multi-view encoder achieved a mean improvement of 9.0 AUC points (12.0% relative improvement) across clinical tasks compared to foundation model embeddings. Performance remained robust with limited echocardiographic views compared to the conventional approach. However, adversarial learning showed limited effectiveness in reducing demographic shortcuts, with stronger weighting substantially compromising diagnostic performance.
Conclusions: Our framework democratizes advanced cardiac AI capabilities, enabling substantial diagnostic improvements without massive computational infrastructure. While algorithmic approaches to demographic fairness showed limitations, the multi-view encoder provides a practical pathway for broader AI adoption in cardiovascular medicine with enhanced efficiency in real-world clinical settings.
Key Question: Can multi-view encoder frameworks achieve superior diagnostic performance compared to foundation model embeddings while reducing computational requirements and maintaining robust performance with fewer echocardiographic views for cardiac AI applications?
Key Finding: The multi-view encoder achieved a 12.0% relative improvement (9.0 AUC points) across 21 cardiac tasks compared to foundation model baselines, with efficient 512-dimensional vector embeddings and robust performance using fewer echocardiographic views.
Take-home Message: Vector embedding approaches with attention-based multi-view integration significantly improve cardiac diagnostic performance while reducing computational requirements, offering a pathway toward more efficient AI implementation in clinical settings.
Translational Perspective: Our proposed multi-view encoder framework overcomes critical barriers to the widespread adoption of artificial intelligence in echocardiography.
By dramatically reducing computational requirements, the multi-view encoder approach allows smaller healthcare institutions to develop sophisticated AI models locally. The framework maintains robust performance with fewer echocardiographic examinations, which addresses real-world clinical constraints where comprehensive imaging is not feasible due to patient factors or time limitations. This technology provides a practical way to democratize advanced cardiac AI capabilities, which could improve access to cardiovascular care across diverse healthcare settings while reducing dependence on proprietary datasets and massive computational resources.
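The two ingredients described here, attention-based aggregation of per-view embeddings into a study-level embedding and adversarial suppression of demographics through a gradient-reversal branch, can be sketched as follows; dimensions, the number of demographic classes, and the adversarial weight are illustrative assumptions rather than the study's configuration.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class MultiViewEncoder(nn.Module):
    def __init__(self, view_dim: int = 512, n_tasks: int = 21, n_demo: int = 2, lam: float = 0.1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=view_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.cls = nn.Parameter(torch.zeros(1, 1, view_dim))  # learnable study token
        self.task_head = nn.Linear(view_dim, n_tasks)         # clinical labels
        self.demo_head = nn.Linear(view_dim, n_demo)          # adversarial demographic head
        self.lam = lam

    def forward(self, views: torch.Tensor):
        # views: (batch, n_views, view_dim) per-view foundation-model embeddings
        tokens = torch.cat([self.cls.expand(views.size(0), -1, -1), views], dim=1)
        study_emb = self.encoder(tokens)[:, 0]                # study-level embedding
        demo_logits = self.demo_head(GradReverse.apply(study_emb, self.lam))
        return self.task_head(study_emb), demo_logits, study_emb
```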

Automated Protocol Suggestions for Cranial MRI Examinations Using Locally Fine-tuned BERT Models.

Boschenriedter C, Rubbert C, Vach M, Caspers J

PubMed | Aug 18, 2025
Selection of appropriate imaging sequence protocols for cranial magnetic resonance imaging (MRI) is crucial to address the medical question and adequately support patient care. Inappropriate protocol selection can compromise diagnostic accuracy, extend scan duration, and increase the risk of misdiagnosis. Typically, radiologists determine scanning protocols based on their expertise, a process that can be time-consuming and subject to variability. Language models offer the potential to streamline this process. This study investigates the capability of bidirectional encoder representations from transformers (BERT)-based models to suggest appropriate MRI protocols based on referral information. A total of 410 anonymized electronic referrals for cranial MRI from a local order-entry system were categorized into nine protocol classes by an experienced neuroradiologist. Locally hosted instances of four different pre-trained BERT-based classifiers (BERT, ModernBERT, GottBERT, and medBERT.de) were trained to classify protocols based on referral entries, including preliminary diagnoses, prior treatment history, and clinical questions. Each model was additionally fine-tuned for the local language on a large dataset of electronic referrals. The model based on medBERT.de with local-language fine-tuning performed best, correctly predicting 81% of all protocols and achieving a macro-F1 score of 0.71, with macro-precision and macro-recall of 0.73 and 0.71, respectively. Moreover, local-language fine-tuning led to performance improvements across all models. These results demonstrate the potential of language models to predict MRI protocols, even with limited training data. This approach could accelerate and standardize radiological protocol selection, offering significant benefits for clinical workflows.
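Fine-tuning a BERT-style encoder for nine protocol classes is a standard sequence-classification setup; a hedged Hugging Face sketch follows, where the checkpoint name, example referral, label index, and hyperparameters are illustrative assumptions rather than the study's configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "some-org/german-medical-bert"  # hypothetical checkpoint; substitute the actual medBERT.de model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=9)

referrals = ["Suspected ischemia, status post thrombectomy; question of acute infarction"]
labels = torch.tensor([3])  # index of the assigned protocol class (illustrative)

batch = tokenizer(referrals, truncation=True, padding=True, max_length=256, return_tensors="pt")
out = model(**batch, labels=labels)   # cross-entropy loss over the nine protocol classes
out.loss.backward()                   # an optimizer step would follow in a training loop
print(out.logits.argmax(dim=-1))      # predicted protocol class
```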

Multi-Phase Automated Segmentation of Dental Structures in CBCT Using a Lightweight Auto3DSeg and SegResNet Implementation

Dominic LaBella, Keshav Jha, Jared Robbins, Esther Yu

arXiv preprint | Aug 18, 2025
Cone-beam computed tomography (CBCT) has become an invaluable imaging modality in dentistry, enabling 3D visualization of teeth and surrounding structures for diagnosis and treatment planning. Automated segmentation of dental structures in CBCT can efficiently assist in identifying pathology (e.g., pulpal or periapical lesions) and facilitate radiation therapy planning in head and neck cancer patients. We describe the DLaBella29 team's approach for the MICCAI 2025 ToothFairy3 Challenge, which involves a deep learning pipeline for multi-class tooth segmentation. We utilized the MONAI Auto3DSeg framework with a 3D SegResNet architecture, trained on a subset of the ToothFairy3 dataset (63 CBCT scans) with 5-fold cross-validation. Key preprocessing steps included image resampling to 0.6 mm isotropic resolution and intensity clipping. We applied an ensemble fusion using Multi-Label STAPLE on the 5-fold predictions to infer a Phase 1 segmentation and then conducted tight cropping around the easily segmented Phase 1 mandible to perform Phase 2 segmentation on the smaller nerve structures. Our method achieved an average Dice of 0.87 on the ToothFairy3 challenge out-of-sample validation set. This paper details the clinical context, data preparation, model development, and results of our approach, and discusses the relevance of automated dental segmentation for improving patient care in radiation oncology.
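The described pipeline (0.6 mm isotropic resampling, intensity clipping, a 3D SegResNet) maps directly onto MONAI components; the minimal sketch below uses assumed channel counts and an assumed clipping window, and is not the team's exact Auto3DSeg configuration.

```python
import torch
from monai.networks.nets import SegResNet
from monai.transforms import (Compose, EnsureChannelFirstd, LoadImaged,
                              ScaleIntensityRanged, Spacingd)

preprocess = Compose([
    LoadImaged(keys=["image"]),
    EnsureChannelFirstd(keys=["image"]),
    Spacingd(keys=["image"], pixdim=(0.6, 0.6, 0.6), mode="bilinear"),  # isotropic resampling
    ScaleIntensityRanged(keys=["image"], a_min=-1000, a_max=3000,       # assumed clipping window
                         b_min=0.0, b_max=1.0, clip=True),
])

# 3D SegResNet for multi-class tooth/nerve segmentation (class count is an assumption).
model = SegResNet(spatial_dims=3, in_channels=1, out_channels=43, init_filters=32)
logits = model(torch.rand(1, 1, 96, 96, 96))  # (batch, channels, D, H, W)
```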

Breaking Reward Collapse: Adaptive Reinforcement for Open-ended Medical Reasoning with Enhanced Semantic Discrimination

Yizhou Liu, Jingwei Wei, Zizhi Chen, Minghao Han, Xukun Zhang, Keliang Liu, Lihua Zhang

arXiv preprint | Aug 18, 2025
Reinforcement learning (RL) with rule-based rewards has demonstrated strong potential in enhancing the reasoning and generalization capabilities of vision-language models (VLMs) and large language models (LLMs), while reducing computational overhead. However, its application in medical imaging remains underexplored. Existing reinforcement fine-tuning (RFT) approaches in this domain primarily target closed-ended visual question answering (VQA), limiting their applicability to real-world clinical reasoning. In contrast, open-ended medical VQA better reflects clinical practice but has received limited attention. While some efforts have sought to unify both formats via semantically guided RL, we observe that model-based semantic rewards often suffer from reward collapse, where responses with significant semantic differences receive similar scores. To address this, we propose ARMed (Adaptive Reinforcement for Medical Reasoning), a novel RL framework for open-ended medical VQA. ARMed first incorporates domain knowledge through supervised fine-tuning (SFT) on chain-of-thought data, then applies reinforcement learning with textual correctness and adaptive semantic rewards to enhance reasoning quality. We evaluate ARMed on six challenging medical VQA benchmarks. Results show that ARMed consistently boosts both accuracy and generalization, achieving a 32.64% improvement on in-domain tasks and an 11.65% gain on out-of-domain benchmarks. These results highlight the critical role of reward discriminability in medical RL and the promise of semantically guided rewards for enabling robust and clinically meaningful multimodal reasoning.
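The core idea of avoiding reward collapse, blending a hard textual-correctness signal with a semantic-similarity term rescaled so that nearby scores still separate responses, can be illustrated generically; the sketch below is a plain illustration of such a blended reward, not ARMed's actual formulation, and `embed` stands in for any sentence encoder.

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def blended_reward(pred: str, ref: str, embed, w_sem: float = 0.5) -> float:
    """Exact-match correctness plus a semantic term stretched over an assumed
    'uninformative' similarity band so small differences stay discriminative."""
    correct = 1.0 if pred.strip().lower() == ref.strip().lower() else 0.0
    sim = cosine(embed(pred), embed(ref))
    # Rescale similarity from [0.6, 1.0] to [0, 1] (band is an assumption) so that
    # semantically different answers do not collapse to near-identical rewards.
    sim_adaptive = float(np.clip((sim - 0.6) / 0.4, 0.0, 1.0))
    return (1 - w_sem) * correct + w_sem * sim_adaptive
```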

CTFlow: Video-Inspired Latent Flow Matching for 3D CT Synthesis

Jiayi Wang, Hadrien Reynaud, Franciskus Xaverius Erick, Bernhard Kainz

arXiv preprint | Aug 18, 2025
Generative modelling of entire CT volumes conditioned on clinical reports has the potential to accelerate research through data augmentation, privacy-preserving synthesis, and reduced regulatory constraints on patient data, while preserving diagnostic signals. With the recent release of CT-RATE, a large-scale collection of 3D CT volumes paired with their respective clinical reports, training large text-conditioned CT volume generation models has become achievable. In this work, we introduce CTFlow, a 0.5B-parameter latent flow matching transformer model conditioned on clinical reports. We leverage the A-VAE from FLUX to define our latent space and rely on the CT-Clip text encoder to encode the clinical reports. To generate consistent whole CT volumes while keeping memory constraints tractable, we use a custom autoregressive approach: the model predicts the first sequence of slices of the volume from text only, and then conditions on the previously generated sequence of slices and the text to predict the following sequence. We evaluate our results against a state-of-the-art generative CT model and demonstrate the superiority of our approach in terms of temporal coherence, image diversity, and text-image alignment, using FID, FVD, IS, and CLIP scores.
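Latent flow matching trains a velocity field on straight-line paths between noise and data latents; the canonical training objective is simple to state, and the minimal PyTorch sketch below is a generic illustration (the model signature, conditioning, and latent shapes are assumptions, not CTFlow's architecture).

```python
import torch
import torch.nn as nn

def flow_matching_loss(model: nn.Module, x1: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
    """Conditional flow matching on a linear path x_t = (1 - t) * x0 + t * x1.

    x1:   clean latents (batch, ...), e.g. CT slices encoded by a VAE
    cond: report embeddings used as conditioning
    The model predicts the constant path velocity v = x1 - x0 at (x_t, t, cond)."""
    x0 = torch.randn_like(x1)                                  # noise endpoint
    t = torch.rand(x1.size(0), *([1] * (x1.dim() - 1)), device=x1.device)
    xt = (1 - t) * x0 + t * x1                                 # point on the straight path
    target_v = x1 - x0                                         # velocity along the path
    pred_v = model(xt, t.flatten(), cond)                      # assumed model signature
    return nn.functional.mse_loss(pred_v, target_v)
```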