Latest Papers on Radiology AI.

UniSegDiff: Boosting Unified Lesion Segmentation via a Staged Diffusion Model

Yilong Hu, Shijie Chang, Lihe Zhang, Feng Tian, Weibing Sun, Huchuan Lu

•preprint•Jul 24 2025

The Diffusion Probabilistic Model (DPM) has demonstrated remarkable performance across a variety of generative tasks. The inherent randomness in diffusion models helps address issues such as blurring at the edges of medical images and labels, positioning Diffusion Probabilistic Models (DPMs) as a promising approach for lesion segmentation. However, we find that the current training and inference strategies of diffusion models result in an uneven distribution of attention across different timesteps, leading to longer training times and suboptimal solutions. To this end, we propose UniSegDiff, a novel diffusion model framework designed to address lesion segmentation in a unified manner across multiple modalities and organs. This framework introduces a staged training and inference approach, dynamically adjusting the prediction targets at different stages, forcing the model to maintain high attention across all timesteps, and achieves unified lesion segmentation through pre-training the feature extraction network for segmentation. We evaluate performance on six different organs across various imaging modalities. Comprehensive experimental results demonstrate that UniSegDiff significantly outperforms previous state-of-the-art (SOTA) approaches. The code is available at https://github.com/HUYILONG-Z/UniSegDiff.

Mixed Modality Segmentation Methodology In Silico Open Code Benchmark SOTA

Q-Former Autoencoder: A Modern Framework for Medical Anomaly Detection

Francesco Dalmonte, Emirhan Bayar, Emre Akbas, Mariana-Iuliana Georgescu

•preprint•Jul 24 2025

Anomaly detection in medical images is an important yet challenging task due to the diversity of possible anomalies and the practical impossibility of collecting comprehensively annotated data sets. In this work, we tackle unsupervised medical anomaly detection proposing a modernized autoencoder-based framework, the Q-Former Autoencoder, that leverages state-of-the-art pretrained vision foundation models, such as DINO, DINOv2 and Masked Autoencoder. Instead of training encoders from scratch, we directly utilize frozen vision foundation models as feature extractors, enabling rich, multi-stage, high-level representations without domain-specific fine-tuning. We propose the usage of the Q-Former architecture as the bottleneck, which enables the control of the length of the reconstruction sequence, while efficiently aggregating multiscale features. Additionally, we incorporate a perceptual loss computed using features from a pretrained Masked Autoencoder, guiding the reconstruction towards semantically meaningful structures. Our framework is evaluated on four diverse medical anomaly detection benchmarks, achieving state-of-the-art results on BraTS2021, RESC, and RSNA. Our results highlight the potential of vision foundation model encoders, pretrained on natural images, to generalize effectively to medical image analysis tasks without further fine-tuning. We release the code and models at https://github.com/emirhanbayar/QFAE.

Mixed Modality Detection Methodology In Silico Academic Lab Open Code

LEAF: Latent Diffusion with Efficient Encoder Distillation for Aligned Features in Medical Image Segmentation

Qilin Huang, Tianyu Lin, Zhiguang Chen, Fudan Zheng

•preprint•Jul 24 2025

Leveraging the powerful capabilities of diffusion models has yielded quite effective results in medical image segmentation tasks. However, existing methods typically transfer the original training process directly without specific adjustments for segmentation tasks. Furthermore, the commonly used pre-trained diffusion models still have deficiencies in feature extraction. Based on these considerations, we propose LEAF, a medical image segmentation model grounded in latent diffusion models. During the fine-tuning process, we replace the original noise prediction pattern with a direct prediction of the segmentation map, thereby reducing the variance of segmentation results. We also employ a feature distillation method to align the hidden states of the convolutional layers with the features from a transformer-based vision encoder. Experimental results demonstrate that our method enhances the performance of the original diffusion model across multiple segmentation datasets for different disease types. Notably, our approach does not alter the model architecture, nor does it increase the number of parameters or computation during the inference phase, making it highly efficient.

Segmentation Methodology In Silico

Differential-UMamba: Rethinking Tumor Segmentation Under Limited Data Scenarios

Dhruv Jain, Romain Modzelewski, Romain Hérault, Clement Chatelain, Eva Torfeh, Sebastien Thureau

•preprint•Jul 24 2025

In data-scarce scenarios, deep learning models often overfit to noise and irrelevant patterns, which limits their ability to generalize to unseen samples. To address these challenges in medical image segmentation, we introduce Diff-UMamba, a novel architecture that combines the UNet framework with the mamba mechanism for modeling long-range dependencies. At the heart of Diff-UMamba is a Noise Reduction Module (NRM), which employs a signal differencing strategy to suppress noisy or irrelevant activations within the encoder. This encourages the model to filter out spurious features and enhance task-relevant representations, thereby improving its focus on clinically meaningful regions. As a result, the architecture achieves improved segmentation accuracy and robustness, particularly in low-data settings. Diff-UMamba is evaluated on multiple public datasets, including MSD (lung and pancreas) and AIIB23, demonstrating consistent performance gains of 1-3% over baseline methods across diverse segmentation tasks. To further assess performance under limited-data conditions, additional experiments are conducted on the BraTS-21 dataset by varying the proportion of available training samples. The approach is also validated on a small internal non-small cell lung cancer (NSCLC) dataset for gross tumor volume (GTV) segmentation in cone beam CT (CBCT), where it achieves a 4-5% improvement over the baseline.

CT Segmentation Chest Retrospective Clinical In Silico

TextSAM-EUS: Text Prompt Learning for SAM to Accurately Segment Pancreatic Tumor in Endoscopic Ultrasound

Pascal Spiegler, Taha Koleilat, Arash Harirpoush, Corey S. Miller, Hassan Rivaz, Marta Kersten-Oertel, Yiming Xiao

•preprint•Jul 24 2025

Pancreatic cancer carries a poor prognosis and relies on endoscopic ultrasound (EUS) for targeted biopsy and radiotherapy. However, the speckle noise, low contrast, and unintuitive appearance of EUS make segmentation of pancreatic tumors with fully supervised deep learning (DL) models both error-prone and dependent on large, expert-curated annotation datasets. To address these challenges, we present TextSAM-EUS, a novel, lightweight, text-driven adaptation of the Segment Anything Model (SAM) that requires no manual geometric prompts at inference. Our approach leverages text prompt learning (context optimization) through the BiomedCLIP text encoder in conjunction with a LoRA-based adaptation of SAM's architecture to enable automatic pancreatic tumor segmentation in EUS, tuning only 0.86% of the total parameters. On the public Endoscopic Ultrasound Database of the Pancreas, TextSAM-EUS with automatic prompts attains 82.69% Dice and 85.28% normalized surface distance (NSD), and with manual geometric prompts reaches 83.10% Dice and 85.70% NSD, outperforming both existing state-of-the-art (SOTA) supervised DL models and foundation models (e.g., SAM and its variants). As the first attempt to incorporate prompt learning in SAM-based medical image segmentation, TextSAM-EUS offers a practical option for efficient and robust automatic EUS segmentation. Our code will be publicly available upon acceptance.

Ultrasound Segmentation Abdominal Methodology In Silico GenAI Open Code

Information Entropy-Based Framework for Quantifying Tortuosity in Meibomian Gland Uneven Atrophy

Kesheng Wang, Xiaoyu Chen, Chunlei He, Fenfen Li, Xinxin Yu, Dexing Kong, Shoujun Huang, Qi Dai

•preprint•Jul 24 2025

In the medical image analysis field, precise quantification of curve tortuosity plays a critical role in the auxiliary diagnosis and pathological assessment of various diseases. In this study, we propose a novel framework for tortuosity quantification and demonstrate its effectiveness through the evaluation of meibomian gland atrophy uniformity,serving as a representative application scenario. We introduce an information entropy-based tortuosity quantification framework that integrates probability modeling with entropy theory and incorporates domain transformation of curve data. Unlike traditional methods such as curvature or arc-chord ratio, this approach evaluates the tortuosity of a target curve by comparing it to a designated reference curve. Consequently, it is more suitable for tortuosity assessment tasks in medical data where biologically plausible reference curves are available, providing a more robust and objective evaluation metric without relying on idealized straight-line comparisons. First, we conducted numerical simulation experiments to preliminarily assess the stability and validity of the method. Subsequently, the framework was applied to quantify the spatial uniformity of meibomian gland atrophy and to analyze the difference in this uniformity between \textit{Demodex}-negative and \textit{Demodex}-positive patient groups. The results demonstrated a significant difference in tortuosity-based uniformity between the two groups, with an area under the curve of 0.8768, sensitivity of 0.75, and specificity of 0.93. These findings highlight the clinical utility of the proposed framework in curve tortuosity analysis and its potential as a generalizable tool for quantitative morphological evaluation in medical diagnostics.

OCT Segmentation Methodology In Silico Academic Lab

Direct Dual-Energy CT Material Decomposition using Model-based Denoising Diffusion Model

Hang Xu, Alexandre Bousse, Alessandro Perelli

•preprint•Jul 24 2025

Dual-energy X-ray Computed Tomography (DECT) constitutes an advanced technology which enables automatic decomposition of materials in clinical images without manual segmentation using the dependency of the X-ray linear attenuation with energy. However, most methods perform material decomposition in the image domain as a post-processing step after reconstruction but this procedure does not account for the beam-hardening effect and it results in sub-optimal results. In this work, we propose a deep learning procedure called Dual-Energy Decomposition Model-based Diffusion (DEcomp-MoD) for quantitative material decomposition which directly converts the DECT projection data into material images. The algorithm is based on incorporating the knowledge of the spectral DECT model into the deep learning training loss and combining a score-based denoising diffusion learned prior in the material image domain. Importantly the inference optimization loss takes as inputs directly the sinogram and converts to material images through a model-based conditional diffusion model which guarantees consistency of the results. We evaluate the performance with both quantitative and qualitative estimation of the proposed DEcomp-MoD method on synthetic DECT sinograms from the low-dose AAPM dataset. Finally, we show that DEcomp-MoD outperform state-of-the-art unsupervised score-based model and supervised deep learning networks, with the potential to be deployed for clinical diagnosis.

CT Segmentation Methodology In Silico GenAI

Exploring the interplay of label bias with subgroup size and separability: A case study in mammographic density classification

Emma A. M. Stanley, Raghav Mehta, Mélanie Roschewitz, Nils D. Forkert, Ben Glocker

•preprint•Jul 24 2025

Systematic mislabelling affecting specific subgroups (i.e., label bias) in medical imaging datasets represents an understudied issue concerning the fairness of medical AI systems. In this work, we investigated how size and separability of subgroups affected by label bias influence the learned features and performance of a deep learning model. Therefore, we trained deep learning models for binary tissue density classification using the EMory BrEast imaging Dataset (EMBED), where label bias affected separable subgroups (based on imaging manufacturer) or non-separable "pseudo-subgroups". We found that simulated subgroup label bias led to prominent shifts in the learned feature representations of the models. Importantly, these shifts within the feature space were dependent on both the relative size and the separability of the subgroup affected by label bias. We also observed notable differences in subgroup performance depending on whether a validation set with clean labels was used to define the classification threshold for the model. For instance, with label bias affecting the majority separable subgroup, the true positive rate for that subgroup fell from 0.898, when the validation set had clean labels, to 0.518, when the validation set had biased labels. Our work represents a key contribution toward understanding the consequences of label bias on subgroup fairness in medical imaging AI.

Mammography Classification Breast Methodology In Silico Ethics

DGEAHorNet: high-order spatial interaction network with dual cross global efficient attention for medical image segmentation.

Peng H, An X, Chen X, Chen Z

•papers•Jul 24 2025

Medical image segmentation is a complex and challenging task, which aims to accurately segment various structures or abnormal regions in medical images. However, obtaining accurate segmentation results is difficult because of the great uncertainty in the shape, location, and scale of the target region. To address these challenges, we propose a higher-order spatial interaction framework with dual cross global efficient attention (DGEAHorNet), which employs a neural network architecture based on recursive gate convolution to adequately extract multi-scale contextual information from images. Specifically, a Dual Cross-Attentions (DCA) is added to the skip connection that can effectively blend multi-stage encoder features and narrow the semantic gap. In the bottleneck stage, global channel spatial attention module (GCSAM) is used to extract image global information. To obtain better feature representation, we feed the output from the GCSAM into the multi-branch dense layer (SENetV2) for excitation. Furthermore, we adopt Depthwise Over-parameterized Convolutional Layer (DO-Conv) in order to replace the common convolutional layer in the input and output part of our network, then add Efficient Attention (EA) to diminish computational complexity and enhance our model's performance. For evaluating the effectiveness of our proposed DGEAHorNet, we conduct comprehensive experiments on four publicly-available datasets, and achieving 0.9320, 0.9337, 0.9312 and 0.7799 in Dice similarity coefficient on ISIC2018, ISIC2017, CVC-ClinicDB and HRF respectively. Our results show that DGEAHorNet has better performance compared with advanced methods. The code is publicly available at https://github.com/penghaixin/mymodel .

Mixed Modality Segmentation Methodology In Silico Academic Lab Open Code

Artificial intelligence for multi-time-point arterial phase contrast-enhanced MRI profiling to predict prognosis after transarterial chemoembolization in hepatocellular carcinoma.

Yao L, Adwan H, Bernatz S, Li H, Vogl TJ

•papers•Jul 24 2025

Contrast-enhanced magnetic resonance imaging (CE-MRI) monitoring across multiple time points is critical for optimizing hepatocellular carcinoma (HCC) prognosis during transarterial chemoembolization (TACE) treatment. The aim of this retrospective study is to develop and validate an artificial intelligence (AI)-powered models utilizing multi-time-point arterial phase CE-MRI data for HCC prognosis stratification in TACE patients. A total of 543 individual arterial phase CE-MRI scans from 181 HCC patients were retrospectively collected in this study. All patients underwent TACE and longitudinal arterial phase CE-MRI assessments at three time points: prior to treatment, and following the first and second TACE sessions. Among them, 110 patients received TACE monotherapy, while the remaining 71 patients underwent TACE in combination with microwave ablation (MWA). All images were subjected to standardized preprocessing procedures. We developed an end-to-end deep learning model, ProgSwin-UNETR, based on the Swin Transformer architecture, to perform four-class prognosis stratification directly from input imaging data. The model was trained using multi-time-point arterial phase CE-MRI data and evaluated via fourfold cross-validation. Classification performance was assessed using the area under the receiver operating characteristic curve (AUC). For comparative analysis, we benchmarked performance against traditional radiomics-based classifiers and the mRECIST criteria. Prognostic utility was further assessed using Kaplan-Meier (KM) survival curves. Additionally, multivariate Cox proportional hazards regression was performed as a post hoc analysis to evaluate the independent and complementary prognostic value of the model outputs and clinical variables. GradCAM + + was applied to visualize the imaging regions contributing most to model prediction. The ProgSwin-UNETR model achieved an accuracy of 0.86 and an AUC of 0.92 (95% CI: 0.90-0.95) for the four-class prognosis stratification task, outperforming radiomic models across all risk groups. Furthermore, KM survival analyses were performed using three different approaches-AI model, radiomics-based classifiers, and mRECIST criteria-to stratify patients by risk. Of the three approaches, only the AI-based ProgSwin-UNETR model achieved statistically significant risk stratification across the entire cohort and in both TACE-alone and TACE + MWA subgroups (p < 0.005). In contrast, the mRECIST and radiomics models did not yield significant survival differences across subgroups (p > 0.05). Multivariate Cox regression analysis further demonstrated that the model was a robust independent prognostic factor (p = 0.01), effectively stratifying patients into four distinct risk groups (Class 0 to Class 3) with Log(HR) values of 0.97, 0.51, -0.53, and -0.92, respectively. Additionally, GradCAM + + visualizations highlighted critical regional features contributing to prognosis prediction, providing interpretability of the model. ProgSwin-UNETR can well predict the various risk groups of HCC patients undergoing TACE therapy and can further be applied for personalized prediction.

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Filter Papers

Tags

UniSegDiff: Boosting Unified Lesion Segmentation via a Staged Diffusion Model

Q-Former Autoencoder: A Modern Framework for Medical Anomaly Detection

LEAF: Latent Diffusion with Efficient Encoder Distillation for Aligned Features in Medical Image Segmentation

Differential-UMamba: Rethinking Tumor Segmentation Under Limited Data Scenarios

TextSAM-EUS: Text Prompt Learning for SAM to Accurately Segment Pancreatic Tumor in Endoscopic Ultrasound

Information Entropy-Based Framework for Quantifying Tortuosity in Meibomian Gland Uneven Atrophy

Direct Dual-Energy CT Material Decomposition using Model-based Denoising Diffusion Model

Exploring the interplay of label bias with subgroup size and separability: A case study in mammographic density classification

DGEAHorNet: high-order spatial interaction network with dual cross global efficient attention for medical image segmentation.

Artificial intelligence for multi-time-point arterial phase contrast-enhanced MRI profiling to predict prognosis after transarterial chemoembolization in hepatocellular carcinoma.

Ready to Sharpen Your Edge?