
Do Edges Matter? Investigating Edge-Enhanced Pre-Training for Medical Image Segmentation

Paul Zaha, Lars Böcking, Simeon Allmendinger, Leopold Müller, Niklas Kühl

arXiv preprint, Aug 4, 2025
Medical image segmentation is crucial for disease diagnosis and treatment planning, yet developing robust segmentation models often requires substantial computational resources and large datasets. Existing research shows that pre-trained and fine-tuned foundation models can boost segmentation performance. However, questions remain about how particular image preprocessing steps may influence segmentation performance across different medical imaging modalities. In particular, edges (abrupt transitions in pixel intensity) are widely acknowledged as vital cues for object boundaries but have not been systematically examined in the pre-training of foundation models. We address this gap by investigating to what extent pre-training on data processed with computationally efficient edge kernels, such as the Kirsch operator, can improve the cross-modality segmentation capabilities of a foundation model. Two versions of a foundation model are first trained on either raw or edge-enhanced data across multiple medical imaging modalities, then fine-tuned on selected raw subsets tailored to specific medical modalities. After systematic investigation across the medical domains dermoscopy, fundus, mammography, microscopy, OCT, ultrasound (US), and X-ray, we observe both increased and reduced segmentation performance across modalities under edge-focused pre-training, indicating the need for selective application of this approach. To guide such selective application, we propose a meta-learning strategy: it uses the standard deviation and image entropy of the raw image to choose between the model pre-trained on edge-enhanced data and the one pre-trained on raw data for optimal performance. Our experiments show that integrating this meta-learning layer improves overall segmentation performance across diverse medical imaging tasks by 16.42% compared to models pre-trained only on edge-enhanced data and by 19.30% compared to models pre-trained only on raw data.
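The edge enhancement and the entropy/standard-deviation routing rule described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the eight Kirsch compass kernels are standard, but the thresholds in `prefer_edge_pretraining` are hypothetical placeholders, not values from the paper.

```python
import numpy as np

# The 8 directional 3x3 Kirsch compass kernels (N, NE, E, SE, S, SW, W, NW).
KIRSCH_KERNELS = [np.array(k, dtype=float) for k in [
    [[ 5,  5,  5], [-3,  0, -3], [-3, -3, -3]],
    [[-3,  5,  5], [-3,  0,  5], [-3, -3, -3]],
    [[-3, -3,  5], [-3,  0,  5], [-3, -3,  5]],
    [[-3, -3, -3], [-3,  0,  5], [-3,  5,  5]],
    [[-3, -3, -3], [-3,  0, -3], [ 5,  5,  5]],
    [[-3, -3, -3], [ 5,  0, -3], [ 5,  5, -3]],
    [[ 5, -3, -3], [ 5,  0, -3], [ 5, -3, -3]],
    [[ 5,  5, -3], [ 5,  0, -3], [-3, -3, -3]],
]]

def kirsch_edges(image):
    """Per-pixel maximum cross-correlation response over the 8 Kirsch directions."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.full((h, w), -np.inf)
    for k in KIRSCH_KERNELS:
        resp = np.zeros((h, w))
        for di in range(3):
            for dj in range(3):
                resp += k[di, dj] * padded[di:di + h, dj:dj + w]
        out = np.maximum(out, resp)
    return out

def prefer_edge_pretraining(image, std_thresh=0.15, entropy_thresh=4.0):
    """Toy stand-in for the meta-learning selector: route an image to the
    edge-pretrained model only if its intensity std and entropy are high.
    Both thresholds are hypothetical, chosen here purely for illustration."""
    hist, _ = np.histogram(image, bins=256)
    p = hist[hist > 0] / hist.sum()
    entropy = -np.sum(p * np.log2(p))
    return bool(image.std() > std_thresh and entropy > entropy_thresh)
```

On a vertical step image, the maximum response lands on the boundary column; a constant image yields a zero edge map and is routed to the raw-data model.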

Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application

Nys Tjade Siegel, James H. Cole, Mohamad Habes, Stefan Haufe, Kerstin Ritter, Marc-André Schulz

arXiv preprint, Aug 4, 2025
Trustworthy interpretation of deep learning models is critical for neuroimaging applications, yet commonly used Explainable AI (XAI) methods lack rigorous validation, risking misinterpretation. We performed the first large-scale, systematic comparison of XAI methods on ~45,000 structural brain MRIs using a novel XAI validation framework. This framework establishes verifiable ground truth by constructing prediction tasks with known signal sources - from localized anatomical features to subject-specific clinical lesions - without artificially altering input images. Our analysis reveals systematic failures in two of the most widely used methods: GradCAM consistently failed to localize predictive features, while Layer-wise Relevance Propagation generated extensive, artifactual explanations that suggest incompatibility with neuroimaging data characteristics. Our results indicate that these failures stem from a domain mismatch, where methods with design principles tailored to natural images require substantial adaptation for neuroimaging data. In contrast, the simpler, gradient-based method SmoothGrad, which makes fewer assumptions about data structure, proved consistently accurate, suggesting its conceptual simplicity makes it more robust to this domain shift. These findings highlight the need for domain-specific adaptation and validation of XAI methods, suggest that interpretations from prior neuroimaging studies using standard XAI methodology warrant re-evaluation, and provide urgent guidance for practical application of XAI in neuroimaging.
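SmoothGrad, the method the study found most robust, simply averages saliency gradients over noise-perturbed copies of the input. A minimal sketch, with `grad_fn` standing in for autograd on a real network (the function below is generic; the paper's models and data are not reproduced here):

```python
import numpy as np

def smoothgrad(grad_fn, x, sigma=0.1, n_samples=500, rng=None):
    """SmoothGrad: average the saliency gradient over noisy copies of the input.

    grad_fn: maps an input array to the gradient of the model output w.r.t. it.
    sigma:   std of the Gaussian noise added to each copy.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    acc = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        acc += grad_fn(x + rng.normal(0.0, sigma, size=x.shape))
    return acc / n_samples
```

For a model with gradient `2x` (i.e. output `sum(x**2)`), the smoothed gradient converges to `2x` as samples accumulate, since the noise averages out.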

Conditional Diffusion Model with Anatomical-Dose Dual Constraints for End-to-End Multi-Tumor Dose Prediction

Hui Xie, Haiqin Hu, Lijuan Ding, Qing Li, Yue Sun, Tao Tan

arXiv preprint, Aug 4, 2025
Radiotherapy treatment planning often relies on time-consuming, trial-and-error adjustments that heavily depend on the expertise of specialists, while existing deep learning methods face limitations in generalization, prediction accuracy, and clinical applicability. To tackle these challenges, we propose ADDiff-Dose, an Anatomical-Dose Dual Constraints Conditional Diffusion Model for end-to-end multi-tumor dose prediction. The model employs LightweightVAE3D to compress high-dimensional CT data and integrates multimodal inputs, including target and organ-at-risk (OAR) masks and beam parameters, within a progressive noise addition and denoising framework. It incorporates conditional features via a multi-head attention mechanism and utilizes a composite loss function combining MSE, conditional terms, and KL divergence to ensure both dosimetric accuracy and compliance with clinical constraints. Evaluation on a large-scale public dataset (2,877 cases) and three external institutional cohorts (450 cases in total) demonstrates that ADDiff-Dose significantly outperforms traditional baselines, achieving an MAE of 0.101-0.154 (compared to 0.316 for UNet and 0.169 for GAN models), a Dice coefficient of 0.927 (a 6.8% improvement), and limiting spinal cord maximum dose error to within 0.1 Gy. The average plan generation time per case is reduced to 22 seconds. Ablation studies confirm that the structural encoder enhances compliance with clinical dose constraints by 28.5%. To our knowledge, this is the first study to introduce a conditional diffusion model framework for radiotherapy dose prediction, offering a generalizable and efficient solution for automated treatment planning across diverse tumor sites, with the potential to substantially reduce planning time and improve clinical workflow efficiency.
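A composite loss of the kind the abstract describes (MSE on the dose map, a conditional penalty, and a KL term against a standard-normal prior) can be sketched as follows. The weights and the form of `cond_penalty` (e.g. a clinical dose-constraint violation term) are illustrative assumptions, not values from the paper:

```python
import numpy as np

def composite_loss(pred_dose, target_dose, mu, logvar, cond_penalty,
                   w_cond=1.0, w_kl=1e-3):
    """Sketch of an MSE + conditional-term + KL composite loss.

    mu, logvar: parameters of the latent Gaussian posterior.
    cond_penalty: scalar penalty for violating conditional/clinical constraints.
    """
    mse = np.mean((pred_dose - target_dose) ** 2)
    # KL divergence of N(mu, exp(logvar)) from the standard normal prior.
    kl = -0.5 * np.mean(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return mse + w_cond * cond_penalty + w_kl * kl
```

With a perfect prediction, a prior-matched posterior, and no constraint violations, the loss is exactly zero; any latent mismatch pushes it positive via the KL term.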

Open-radiomics: a collection of standardized datasets and a technical protocol for reproducible radiomics machine learning pipelines.

Namdar K, Wagner MW, Ertl-Wagner BB, Khalvati F

PubMed paper, Aug 4, 2025
As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol to investigate the effects of radiomics feature extraction on the reproducibility of the results. We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase (MGMT) classification, and non-small cell lung cancer (NSCLC) survival analysis from The Cancer Imaging Archive (TCIA). We used the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset to demonstrate how our proposed technical protocol could be utilized in radiomics-based studies. The cohort includes 369 adult patients with brain tumors (76 LGG and 293 HGG). Using the PyRadiomics library for LGG vs. HGG classification, we created 288 radiomics datasets: the combinations of 4 MRI sequences, 3 bin widths, 6 image normalization methods, and 4 tumor subregions. We used Random Forest classifiers, and for each radiomics dataset we repeated the training-validation-test (60%/20%/20%) experiment with different data splits and model random states 100 times (28,800 test results) and calculated the Area Under the Receiver Operating Characteristic Curve (AUROC). Unlike bin width and image normalization, the tumor subregion and imaging sequence significantly affected the performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval (0.949, 0.952)). Although several settings and data splits (28 out of 28,800) yielded a test AUROC of 1, they were irreproducible. Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performances that are irreproducible.
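The metric computed over the protocol's 28,800 test runs is the AUROC. A minimal rank-based (Mann-Whitney) implementation, assuming no tied scores (the paper itself uses Random Forest classifiers and PyRadiomics features, which are not reproduced here):

```python
import numpy as np

def auroc(y_true, scores):
    """Rank-based AUROC via the Mann-Whitney U statistic; assumes untied scores."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    # Rank each score from 1 (lowest) to n (highest).
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    rank_sum = ranks[y_true == 1].sum()
    # U statistic normalized by the number of positive/negative pairs.
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Perfectly ordered scores yield 1.0, perfectly inverted ones 0.0, and partial orderings land in between, matching the usual probabilistic interpretation of AUROC.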

Can Machine Learning Predict Metastatic Sites in Pancreatic Ductal Adenocarcinoma? A Radiomic Analysis.

Spoto F, De Robertis R, Cardobi N, Garofano A, Messineo L, Lucin E, Milella M, D'Onofrio M

PubMed paper, Aug 4, 2025
Pancreatic ductal adenocarcinoma (PDAC) exhibits high metastatic potential, with distinct prognoses based on metastatic sites. Radiomics enables quantitative imaging analysis for predictive modeling. To evaluate the feasibility of radiomic models in predicting PDAC metastatic patterns, specifically distinguishing between hepatic and pulmonary metastases. This retrospective study included 115 PDAC patients with either liver (n = 94) or lung (n = 21) metastases. Radiomic features were extracted from pancreatic arterial and venous phase CT scans of primary tumors using PyRadiomics. Two radiologists independently segmented tumors for inter-reader reliability assessment. Features with ICC > 0.9 underwent LASSO regularization for feature selection. Class imbalance was addressed using SMOTE and class weighting. Model performance was evaluated using fivefold cross-validation and bootstrap resampling. The multivariate logistic regression model achieved an AUC-ROC of 0.831 (95% CI: 0.752-0.910). At the optimal threshold, sensitivity was 0.762 (95% CI: 0.659-0.865) and specificity was 0.787 (95% CI: 0.695-0.879). The negative predictive value for lung metastases was 0.810 (95% CI: 0.734-0.886). LargeDependenceEmphasis showed a trend toward significance (p = 0.0566) as a discriminative feature. Precision was 0.842, recall 0.762, and F1 score 0.800. Radiomic analysis of primary pancreatic tumors demonstrates potential for predicting hepatic versus pulmonary metastatic patterns. The high negative predictive value for lung metastases may support clinical decision-making. External validation is essential before clinical implementation. These findings from a single-center study require confirmation in larger, multicenter cohorts.
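The performance figures reported above (sensitivity, specificity, NPV, precision, F1) all derive from the confusion matrix. A small sketch of those definitions, with a check that the reported precision and recall are internally consistent with the reported F1 (the confusion-matrix counts used in the example below are made up for illustration, not from the study):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a binary classifier."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)             # negative predictive value
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "npv": npv, "f1": f1}

def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

With the abstract's precision of 0.842 and recall of 0.762, `f1_from_pr` gives approximately 0.800, consistent with the reported F1 score.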

Machine learning of whole-brain resting-state fMRI signatures for individualized grading of frontal gliomas.

Hu Y, Cao X, Chen H, Geng D, Lv K

PubMed paper, Aug 4, 2025
Accurate preoperative grading of gliomas is critical for therapeutic planning and prognostic evaluation. We developed a noninvasive machine learning model leveraging whole-brain resting-state functional magnetic resonance imaging (rs-fMRI) biomarkers to discriminate between high-grade (HGGs) and low-grade gliomas (LGGs) in the frontal lobe. This retrospective study included 138 patients (78 LGGs, 60 HGGs) with left frontal gliomas. A total of 7134 features were extracted from the mean amplitude of low-frequency fluctuation (mALFF), mean fractional ALFF, mean percentage amplitude of fluctuation (mPerAF), and mean regional homogeneity (mReHo) maps and the resting-state functional connectivity (RSFC) matrix. Twelve predictive features were selected through the Mann-Whitney U test, correlation analysis, and the least absolute shrinkage and selection operator (LASSO) method. Patients were stratified and randomly split into training and testing datasets at a 7:3 ratio. Logistic regression, random forest, support vector machine (SVM), and adaptive boosting algorithms were used to establish models. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. The 12 selected features included 7 RSFC features, 4 mPerAF features, and 1 mReHo feature. Among the models built on these features, the SVM achieved the best performance: accuracy was 0.957 and 0.727 in the training and testing datasets, respectively, and the AUCs were 0.972 and 0.799. Our whole-brain rs-fMRI radiomics approach provides an objective tool for preoperative glioma stratification. The biological interpretability of the selected features reflects distinct neuroplasticity patterns between LGGs and HGGs, advancing understanding of glioma-network interactions.
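One step in the feature-selection pipeline above is correlation analysis, i.e. discarding features that are redundant with one another. A simplified greedy sketch of that step (the ICC, Mann-Whitney, and LASSO stages are omitted, and the 0.9 threshold is an illustrative assumption, not the study's setting):

```python
import numpy as np

def drop_correlated(X, names, thresh=0.9):
    """Greedy redundancy filter: keep a feature, then drop any later feature
    whose absolute Pearson correlation with a kept one exceeds `thresh`.

    X:     (n_samples, n_features) matrix.
    names: feature names, one per column.
    """
    corr = np.abs(np.corrcoef(np.asarray(X, dtype=float), rowvar=False))
    kept, dropped = [], set()
    for i in range(corr.shape[0]):
        if i in dropped:
            continue
        kept.append(names[i])
        for j in range(i + 1, corr.shape[0]):
            if corr[i, j] > thresh:
                dropped.add(j)
    return kept
```

A column that is an exact multiple of another is perfectly correlated with it and gets dropped, while an independent column survives.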

Enhanced detection of ovarian cancer using AI-optimized 3D CNNs for PET/CT scan analysis.

Sadeghi MH, Sina S, Faghihi R, Alavi M, Giammarile F, Omidi H

PubMed paper, Aug 4, 2025
This study investigates how deep learning (DL) can enhance ovarian cancer diagnosis and staging using large imaging datasets. Specifically, we compare six conventional convolutional neural network (CNN) architectures (ResNet, DenseNet, GoogLeNet, U-Net, VGG, and AlexNet) with OCDA-Net, an enhanced model designed for [<sup>18</sup>F]FDG PET image analysis. OCDA-Net, which builds on the ResNet architecture, was compared against these baselines on randomly split training (80%), validation (10%), and test (10%) sets. Trained over 100 epochs, OCDA-Net achieved superior diagnostic classification with an accuracy of 92%, and a staging accuracy of 94%, supported by robust precision, recall, and F-measure metrics. Grad-CAM++ heatmaps confirmed that the network attends to hypermetabolic lesions, supporting clinical interpretability. Our findings show that OCDA-Net outperforms existing CNN models and has strong potential to transform ovarian cancer diagnosis and staging. The study suggests that implementing these DL models in clinical practice could ultimately improve patient prognoses. Future research should expand datasets, enhance model interpretability, and validate these models in clinical settings.

Digital Twin Technology In Radiology.

Aghamiri SS, Amin R, Isavand P, Vahdati S, Zeinoddini A, Kitamura FC, Moy L, Kline T

PubMed paper, Aug 4, 2025
A digital twin is a computational model that provides a virtual representation of a specific physical object, system, or process and predicts its behavior at future time points. These simulation models form computational profiles for new diagnosis and prevention models. The digital twin is a concept borrowed from engineering. However, the rapid evolution of this technology has extended its application across various industries. In recent years, digital twins in healthcare have gained significant traction due to their potential to revolutionize medicine and drug development. In the context of radiology, digital twin technology can be applied in various areas, including optimizing medical device design, improving system performance, facilitating personalized medicine, conducting virtual clinical trials, and educating radiology trainees. Also, radiologic image data is a critical source of patient-specific measures that play a role in generating advanced intelligent digital twins. Generating a practical digital twin faces several challenges, including data availability, computational techniques, validation frameworks, and uncertainty quantification, all of which require collaboration among engineers, healthcare providers, and stakeholders. This review focuses on recent trends in digital twin technology and its intersection with radiology by reviewing applications, technological advancements, and challenges that need to be addressed for successful implementation in the field.

Multimodal deep learning model for prognostic prediction in cervical cancer receiving definitive radiotherapy: a multi-center study.

Wang W, Yang G, Liu Y, Wei L, Xu X, Zhang C, Pan Z, Liang Y, Yang B, Qiu J, Zhang F, Hou X, Hu K, Liang X

PubMed paper, Aug 4, 2025
For patients with locally advanced cervical cancer (LACC), precise survival prediction models could guide personalized treatment. We developed and validated CerviPro, a deep learning-based multimodal prognostic model, to predict disease-free survival (DFS) in 1018 patients with LACC receiving definitive radiotherapy. The model integrates pre- and post-treatment CT imaging, handcrafted radiomic features, and clinical variables. CerviPro demonstrated robust predictive performance in the internal validation cohort (C-index 0.81) and in the external validation cohorts (C-indices of 0.70 and 0.66), significantly stratifying patients into distinct high- and low-risk DFS groups. Multimodal feature fusion consistently outperformed models based on single feature categories (clinical data, imaging, or radiomics alone), highlighting the synergistic value of integrating diverse data sources. By integrating multimodal data to predict DFS and recurrence risk, CerviPro provides a clinically valuable prognostic tool for LACC, offering the potential to guide personalized treatment strategies.
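The C-index used to evaluate CerviPro is Harrell's concordance index: over all comparable patient pairs (the one with the earlier event time must actually have had an event), it counts how often the model assigns that patient the higher risk. A minimal O(n²) sketch of the metric itself (the model and cohorts are, of course, not reproduced):

```python
def c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       observed follow-up times.
    events:      1 if the event (e.g. recurrence) occurred, 0 if censored.
    risk_scores: model outputs; higher = predicted higher risk.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable if i had an event before j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # ties get half credit
    return concordant / comparable
```

Risk scores that exactly reverse the event times give a C-index of 1.0; scores aligned with the times give 0.0; random scores hover around 0.5.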

Retrospective evaluation of interval breast cancer screening mammograms by radiologists and AI.

Subelack J, Morant R, Blum M, Gräwingholt A, Vogel J, Geissler A, Ehlig D

PubMed paper, Aug 4, 2025
To determine whether an AI system can identify breast cancer risk in interval breast cancer (IBC) screening mammograms. IBC screening mammograms from a Swiss screening program were retrospectively analyzed by radiologists and an AI system. Radiologists determined whether the IBC mammogram showed human-visible signs of breast cancer (potentially missed IBCs) or not (IBCs without retrospective abnormalities). The AI system provided a case score and a prognostic risk category per mammogram. 119 IBC cases (mean age 57.3 years, SD 5.4) were available with complete retrospective evaluations by radiologists and the AI system. 82 (68.9%) were classified as IBCs without retrospective abnormalities and 37 (31.1%) as potentially missed IBCs. 46.2% of all IBCs received a case score ≥ 25, 25.2% ≥ 50, and 13.4% ≥ 75. Of the 25.2% of IBCs with a case score ≥ 50 (vs. 13.4% in a population without breast cancer), 45.2% had not been discussed during a consensus conference, corresponding to 11.4% of all IBC cases. The potentially missed IBCs received significantly higher case scores and risk classifications than IBCs without retrospective abnormalities (mean case score: 54.1 vs. 23.1; high risk: 48.7% vs. 14.7%; p < 0.05). 13.4% of the IBCs without retrospective abnormalities received a case score ≥ 50, of which 62.5% had not been discussed during a consensus conference. An AI system can identify IBC screening mammograms with a higher risk for breast cancer, particularly among potentially missed IBCs but also in some IBCs without retrospective abnormalities where radiologists did not see anything, indicating its ability to improve mammography screening quality.
Question: AI presents a promising opportunity to enhance breast cancer screening in general, but evidence is missing regarding its ability to reduce interval breast cancers.
Findings: The AI system detected a high risk of breast cancer in most interval breast cancer screening mammograms in which radiologists retrospectively detected abnormalities.
Clinical relevance: Utilization of an AI system in mammography screening programs can identify breast cancer risk in many interval breast cancer screening mammograms and thus potentially reduce the number of interval breast cancers.