Latest Papers on Radiology AI. Tags: In Silico

Enhanced detection of ovarian cancer using AI-optimized 3D CNNs for PET/CT scan analysis.

Sadeghi MH, Sina S, Faghihi R, Alavi M, Giammarile F, Omidi H

•papers•Aug 4 2025

This study investigates how deep learning (DL) can enhance ovarian cancer diagnosis and staging using large imaging datasets. Specifically, we compare six conventional convolutional neural network (CNN) architectures (ResNet, DenseNet, GoogLeNet, U-Net, VGG, and AlexNet) with OCDA-Net, an enhanced model designed for [¹⁸F]FDG PET image analysis. OCDA-Net, which builds on the ResNet architecture, was compared against these models using datasets randomly split into training (80%), validation (10%), and test (10%) images. Trained over 100 epochs, OCDA-Net achieved superior diagnostic classification with an accuracy of 92%, and staging results of 94%, supported by robust precision, recall, and F-measure metrics. Grad-CAM++ heat-maps confirmed that the network attends to hyper-metabolic lesions, supporting clinical interpretability. Our findings show that OCDA-Net outperforms existing CNN models and has strong potential to transform ovarian cancer diagnosis and staging. The study suggests that implementing these DL models in clinical practice could ultimately improve patient prognoses. Future research should expand datasets, enhance model interpretability, and validate these models in clinical settings.
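As a rough illustration of the random 80/10/10 split described in the abstract, the sketch below shuffles item indices and cuts them into three parts. The function name `split_indices`, the seed, and the item count are hypothetical; the abstract does not say how the authors implemented or stratified their split.

```python
import random

def split_indices(n_items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test parts.

    Hypothetical helper: the fractions mirror the 80/10/10 split in the
    abstract, everything else is an assumption made for illustration.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

Splitting by image can leak information when several images come from the same patient, so a patient-level split may be preferable in practice.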

Mixed Modality Classification Abdominal Methodology In Silico Academic Lab Benchmark SOTA

Retrospective evaluation of interval breast cancer screening mammograms by radiologists and AI.

Subelack J, Morant R, Blum M, Gräwingholt A, Vogel J, Geissler A, Ehlig D

•papers•Aug 4 2025

To determine whether an AI system can identify breast cancer risk in interval breast cancer (IBC) screening mammograms. IBC screening mammograms from a Swiss screening program were retrospectively analyzed by radiologists and an AI system. Radiologists determined whether the IBC mammogram showed human-visible signs of breast cancer (potentially missed IBCs) or not (IBCs without retrospective abnormalities). The AI system provided a case score and a prognostic risk category per mammogram. 119 IBC cases (mean age 57.3 (5.4)) were available with complete retrospective evaluations by radiologists and the AI system. 82 (68.9%) were classified as IBCs without retrospective abnormalities and 37 (31.1%) as potentially missed IBCs. 46.2% of all IBCs received a case score ≥ 25, 25.2% ≥ 50, and 13.4% ≥ 75. Of the 25.2% of the IBCs with a case score ≥ 50 (vs. 13.4% of a no breast cancer population), 45.2% had not been discussed during a consensus conference, reflecting 11.4% of all IBC cases. The potentially missed IBCs received significantly higher case scores and risk classifications than IBCs without retrospective abnormalities (case score mean: 54.1 vs. 23.1; high risk: 48.7% vs. 14.7%; p < 0.05). 13.4% of the IBCs without retrospective abnormalities received a case score ≥ 50, of which 62.5% had not been discussed during a consensus conference. An AI system can identify IBC screening mammograms with a higher risk for breast cancer, particularly in potentially missed IBCs but also in some IBCs without retrospective abnormalities where radiologists did not see anything, indicating its ability to improve mammography screening quality. Question: AI presents a promising opportunity to enhance breast cancer screening in general, but evidence is missing regarding its ability to reduce interval breast cancers. Findings: The AI system detected a high risk of breast cancer in most interval breast cancer screening mammograms where radiologists retrospectively detected abnormalities.
Clinical relevance: Utilization of an AI system in mammography screening programs can identify breast cancer risk in many interval breast cancer screening mammograms and thus potentially reduce the number of interval breast cancers.

Mammography Classification Breast Retrospective Clinical In Silico

Combined nomogram for differentiating adrenal pheochromocytoma from large-diameter lipid-poor adenoma using multiphase CT radiomics and clinico-radiological features.

Shan Z, Zhang X, Zhang Y, Wang S, Wang J, Shi X, Li L, Li Z, Yang L, Liu H, Li W, Yang J, Yang L

•papers•Aug 4 2025

Adrenal incidentalomas (AIs) are predominantly adrenal adenomas (80%), with a smaller proportion (7%) being pheochromocytomas (PHEO). Adenomas are typically non-functional tumors managed through observation or medication, with some cases requiring surgical removal, which is generally safe. In contrast, PHEO secrete catecholamines, causing severe blood pressure fluctuations, making surgical resection the only treatment option. Without adequate preoperative preparation, perioperative mortality risk is significantly high. A specialized adrenal CT scanning protocol is recommended to differentiate between these tumor types. However, distinguishing patients with similar washout characteristics remains challenging, and concerns about efficiency, cost, and risk limit its feasibility. Recently, radiomics has demonstrated efficacy in identifying molecular-level differences in tumor cells, including adrenal tumors. This study develops a combined nomogram model, integrating key clinical-radiological and radiomic features from multiphase CT, to enhance accuracy in distinguishing pheochromocytoma from large-diameter lipid-poor adrenal adenoma (LP-AA). A retrospective analysis was conducted on 202 patients with pathologically confirmed adrenal PHEO and large-diameter LP-AA from three tertiary care centers. Key clinico-radiological and radiomics features were selected to construct models: a clinico-radiological model, a radiomics model, and a combined nomogram model for predicting these two tumor types. Model performance and robustness were evaluated using external validation, calibration curve analysis, machine learning techniques, and Delong's test. Additionally, the Hosmer-Lemeshow test, decision curve analysis, and five-fold cross-validation were employed to assess the clinical translational potential of the combined nomogram model. All models demonstrated high diagnostic performance, with AUC values exceeding 0.8 across all cohorts, confirming their reliability.
The combined nomogram model exhibited the highest diagnostic accuracy, with AUC values of 0.994, 0.979, and 0.945 for the training, validation, and external test cohorts, respectively. Notably, the unenhanced combined nomogram model was not significantly inferior to the three-phase combined nomogram model (p > 0.05 in the validation and test cohorts; p = 0.049 in the training cohort). The combined nomogram model reliably distinguishes between PHEO and LP-AA, shows strong clinical translational potential, and may reduce the need for contrast-enhanced CT scans.
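For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes the area under the ROC curve as the fraction of positive/negative case pairs that a model ranks correctly, counting ties as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Toy example: 1 = PHEO, 0 = LP-AA (this label coding is an assumption).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(round(roc_auc(labels, scores), 3))
```

The pairwise loop is quadratic in the number of cases, which is fine for cohorts of a few hundred patients; a rank-based formulation scales better for large datasets.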

CT Classification Abdominal Retrospective Clinical In Silico

CT-Based 3D Super-Resolution Radiomics for the Differential Diagnosis of Brucella vs. Tuberculous Spondylitis using Deep Learning.

Wang K, Qi L, Li J, Zhang M, Du H

•papers•Aug 4 2025

This study aims to improve the accuracy of distinguishing Tuberculous Spondylitis (TBS) from Brucella Spondylitis (BS) by developing radiomics models using Deep Learning and CT images enhanced with Super-Resolution (SR). A total of 94 patients diagnosed with BS or TBS were randomly divided into training (n=65) and validation (n=29) groups in a 7:3 ratio. In the training set, there were 40 BS and 25 TBS patients, with a mean age of 58.34 ± 12.53 years. In the validation set, there were 17 BS and 12 TBS patients, with a mean age of 58.48 ± 12.29 years. Standard CT images were enhanced using SR, improving spatial resolution and image quality. The lesion regions (ROIs) were manually segmented, and radiomics features were extracted. ResNet18 and ResNet34 were used for deep learning feature extraction and model training. Four multi-layer perceptron (MLP) models were developed: clinical, radiomics (Rad), deep learning (DL), and a combined model. Model performance was assessed using five-fold cross-validation, ROC, and decision curve analysis (DCA). Statistical significance was assessed, with key clinical and imaging features showing significant differences between TBS and BS (e.g., gender, p=0.0038; parrot beak appearance, p<0.001; dead bone, p<0.001; deformities of the spinal posterior process, p=0.0044; psoas abscess, p<0.001). The combined model outperformed others, achieving the highest AUC (0.952), with ResNet34 and SR-enhanced images further boosting performance. Sensitivity reached 0.909, and Specificity was 0.941. DCA confirmed clinical applicability. The integration of SR-enhanced CT imaging and deep learning radiomics appears to improve diagnostic differentiation between BS and TBS. The combined model, especially when using ResNet34 and GAN-based super-resolution, demonstrated better predictive performance. High-resolution imaging may facilitate better lesion delineation and more robust feature extraction. 
Nevertheless, further validation with larger, multicenter cohorts is needed to confirm generalizability and reduce potential bias from retrospective design and imaging heterogeneity. This study suggests that integrating Deep Learning Radiomics with Super-Resolution may improve the differentiation between TBS and BS compared to standard CT imaging. However, prospective multi-center studies are necessary to validate its clinical applicability.
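The sensitivity and specificity figures reported above follow from a confusion matrix. The sketch below shows that calculation on made-up labels; the helper name `sensitivity_specificity` and the choice of which class counts as positive are assumptions, since the abstract does not state which disease was treated as the positive class.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1   # positive case correctly detected
            else:
                fn += 1   # positive case missed
        else:
            if pred == positive:
                fp += 1   # negative case flagged as positive
            else:
                tn += 1   # negative case correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels only; these are not the study's predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
print(sensitivity_specificity(y_true, y_pred))
```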

CT Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab

Early prediction of proton therapy dose distributions and DVHs for hepatocellular carcinoma using contour-based CNN models from diagnostic CT and MRI.

Rachi T, Tochinai T

•papers•Aug 4 2025

Proton therapy is commonly used for treating hepatocellular carcinoma (HCC); however, its feasibility can be challenging to assess in large tumors or those adjacent to critical organs at risk (OARs), which are typically assessed only after planning computed tomography (CT) acquisition. This study aimed to predict proton dose distributions using diagnostic CT (dCT) and diagnostic MRI (dMRI) with a convolutional neural network (CNN), enabling early treatment feasibility assessments. Dose distributions and dose-volume histograms (DVHs) were calculated for 118 patients with HCC using intensity-modulated proton therapy (IMPT) and passive proton therapy. A CPU-based CNN model was used to predict DVHs and 3D dose distributions from diagnostic images. Prediction accuracy was evaluated using mean absolute error (MAE), mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and gamma passing rate with a 3 mm/3% criterion. The predicted DVHs and dose distributions showed high agreement with actual values. MAE remained below 3.0%, with passive techniques achieving 1.2-1.8%. MSE was below 0.004 in all cases. PSNR ranged from 24 to 28 dB, and SSIM exceeded 0.94 in most conditions. Gamma passing rates averaged 82-83% for IMPT and 92-93% for passive techniques. The model achieved comparable accuracy when using dMRI and dCT. This study demonstrates that early dose distribution prediction from diagnostic imaging is feasible and accurate using a lightweight CNN model. Despite anatomical variability between diagnostic and planning images, this approach provides timely insights into treatment feasibility, potentially supporting insurance pre-authorization, reducing unnecessary imaging, and optimizing clinical workflows for HCC proton therapy.
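Three of the error metrics named in the abstract (MAE, MSE, and PSNR) can be sketched in a few lines. The example assumes dose values normalized to the range 0 to 1, so the PSNR peak value is 1.0; that normalization, the function name `dose_errors`, and the toy values are assumptions. SSIM and the gamma passing rate are omitted because they need spatial information.

```python
import math

def dose_errors(predicted, actual, max_value=1.0):
    """Return (MAE, MSE, PSNR in dB) for two equally long dose sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    # PSNR is infinite when the prediction is exact.
    psnr = math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)
    return mae, mse, psnr

# Toy flattened dose values, normalized to [0, 1] (assumed, not study data).
predicted = [0.0, 0.5, 1.0, 0.8]
actual = [0.0, 0.6, 0.9, 0.8]
mae, mse, psnr = dose_errors(predicted, actual)
print(f"MAE={mae:.3f} MSE={mse:.4f} PSNR={psnr:.1f} dB")
```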

Mixed Modality Registration Abdominal Retrospective Clinical In Silico Academic Lab

Machine learning of whole-brain resting-state fMRI signatures for individualized grading of frontal gliomas.

Hu Y, Cao X, Chen H, Geng D, Lv K

•papers•Aug 4 2025

Accurate preoperative grading of gliomas is critical for therapeutic planning and prognostic evaluation. We developed a noninvasive machine learning model leveraging whole-brain resting-state functional magnetic resonance imaging (rs-fMRI) biomarkers to discriminate high-grade gliomas (HGGs) from low-grade gliomas (LGGs) in the frontal lobe. This retrospective study included 138 patients (78 LGGs, 60 HGGs) with left frontal gliomas. A total of 7134 features were extracted from the mean amplitude of low-frequency fluctuation (mALFF), mean fractional ALFF, mean percentage amplitude of fluctuation (mPerAF), mean regional homogeneity (mReHo) maps and resting-state functional connectivity (RSFC) matrix. Twelve predictive features were selected through the Mann-Whitney U test, correlation analysis and the least absolute shrinkage and selection operator method. The patients were stratified and randomized into the training and testing datasets with a 7:3 ratio. Logistic regression, random forest, support vector machine (SVM) and adaptive boosting algorithms were used to establish models. The model performance was evaluated using area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity. The selected 12 features included 7 RSFC features, 4 mPerAF features, and 1 mReHo feature. Based on these features, the model established using the SVM had optimal performance. The accuracy in the training and testing datasets was 0.957 and 0.727, respectively. The area under the receiver operating characteristic curves was 0.972 and 0.799, respectively. Our whole-brain rs-fMRI radiomics approach provides an objective tool for preoperative glioma stratification. The biological interpretability of selected features reflects distinct neuroplasticity patterns between LGGs and HGGs, advancing understanding of glioma-network interactions.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Explainable AI Methods for Neuroimaging: Systematic Failures of Common Tools, the Need for Domain-Specific Validation, and a Proposal for Safe Application

Nys Tjade Siegel, James H. Cole, Mohamad Habes, Stefan Haufe, Kerstin Ritter, Marc-André Schulz

•preprint•Aug 4 2025

Trustworthy interpretation of deep learning models is critical for neuroimaging applications, yet commonly used Explainable AI (XAI) methods lack rigorous validation, risking misinterpretation. We performed the first large-scale, systematic comparison of XAI methods on ~45,000 structural brain MRIs using a novel XAI validation framework. This framework establishes verifiable ground truth by constructing prediction tasks with known signal sources - from localized anatomical features to subject-specific clinical lesions - without artificially altering input images. Our analysis reveals systematic failures in two of the most widely used methods: GradCAM consistently failed to localize predictive features, while Layer-wise Relevance Propagation generated extensive, artifactual explanations that suggest incompatibility with neuroimaging data characteristics. Our results indicate that these failures stem from a domain mismatch, where methods with design principles tailored to natural images require substantial adaptation for neuroimaging data. In contrast, the simpler, gradient-based method SmoothGrad, which makes fewer assumptions about data structure, proved consistently accurate, suggesting its conceptual simplicity makes it more robust to this domain shift. These findings highlight the need for domain-specific adaptation and validation of XAI methods, suggest that interpretations from prior neuroimaging studies using standard XAI methodology warrant re-evaluation, and provide urgent guidance for practical application of XAI in neuroimaging.
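To make the SmoothGrad idea concrete, here is a minimal sketch of its core step: average the model's input gradient over several noisy copies of the input. The toy quadratic "model", its weights, and the names `smoothgrad` and `toy_gradient` are assumptions for illustration; the preprint's networks and validation framework are not reproduced here.

```python
import random

def smoothgrad(grad_fn, x, noise_std=0.1, n_samples=50, seed=0):
    """Average grad_fn over n_samples Gaussian-perturbed copies of x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, noise_std) for xi in x]
        grad = grad_fn(noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]

# Toy model f(x) = sum(w_i * x_i**2), so the gradient is 2 * w_i * x_i.
WEIGHTS = [1.0, 0.0, 3.0]  # hypothetical weights; feature 1 is irrelevant

def toy_gradient(x):
    return [2.0 * w * xi for w, xi in zip(WEIGHTS, x)]

saliency = smoothgrad(toy_gradient, [1.0, 2.0, 3.0])
print([round(s, 2) for s in saliency])
```

In this toy case the irrelevant feature receives zero attribution, while the others stay close to the noise-free gradient. With a real network, `grad_fn` would come from automatic differentiation rather than a closed-form expression.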

MRI Classification Neurological Methodology In Silico Ethics Reproducibility

A dual self-attentive transformer U-Net model for precise pancreatic segmentation and fat fraction estimation.

Shanmugam A, Radhabai PR, Kvn K, Imoize AL

•papers•Aug 4 2025

Accurately segmenting the pancreas from abdominal computed tomography (CT) images is crucial for detecting and managing pancreatic diseases, such as diabetes and tumors. Type 2 diabetes and metabolic syndrome are associated with pancreatic fat accumulation. Calculating the fat fraction aids in the investigation of β-cell malfunction and insulin resistance. The most widely used pancreas segmentation technique is a U-shaped network based on deep convolutional neural networks (DCNNs). Such networks struggle to capture long-range dependencies in an image because they rely on local receptive fields. To address this problem, this research proposes a novel dual Self-attentive Transformer Unet (DSTUnet) model for accurate pancreatic segmentation. This model incorporates dual self-attention Swin transformers on both the encoder and decoder sides to facilitate global context extraction and refine candidate regions. After segmenting the pancreas using a DSTUnet, a histogram analysis is used to estimate the fat fraction. The suggested method demonstrated excellent performance on the standard dataset, achieving a DSC of 93.7% and an HD of 2.7 mm. The average volume of the pancreas was 92.42, and its fat volume fraction (FVF) was 13.37%.
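The Dice similarity coefficient (DSC) quoted above measures the overlap between a predicted and a reference mask. A minimal sketch on flattened binary masks follows; the function name `dice_coefficient` and the toy masks are assumptions, and the Hausdorff distance and fat fraction steps are not shown.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A∩B| / (|A| + |B|) for flat sequences of 0/1 values."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy 1-D masks standing in for flattened 3-D segmentations.
predicted = [1, 1, 1, 0, 0, 0]
reference = [0, 1, 1, 1, 0, 0]
print(round(dice_coefficient(predicted, reference), 3))
```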

CT Segmentation Abdominal Methodology In Silico

S-RRG-Bench: Structured Radiology Report Generation with Fine-Grained Evaluation Framework

Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lingqiao Liu, Lei Wang, Luping Zhou

•preprint•Aug 4 2025

Radiology report generation (RRG) for diagnostic images, such as chest X-rays, plays a pivotal role in both clinical practice and AI. Traditional free-text reports suffer from redundancy and inconsistent language, complicating the extraction of critical clinical details. Structured radiology report generation (S-RRG) offers a promising solution by organizing information into standardized, concise formats. However, existing approaches often rely on classification or visual question answering (VQA) pipelines that require predefined label sets and produce only fragmented outputs. Template-based approaches, which generate reports by replacing keywords within fixed sentence patterns, further compromise expressiveness and often omit clinically important details. In this work, we present a novel approach to S-RRG that includes dataset construction, model training, and the introduction of a new evaluation framework. We first create a robust chest X-ray dataset (MIMIC-STRUC) that includes disease names, severity levels, probabilities, and anatomical locations, ensuring that the dataset is both clinically relevant and well-structured. We train an LLM-based model to generate standardized, high-quality reports. To assess the generated reports, we propose a specialized evaluation metric (S-Score) that not only measures disease prediction accuracy but also evaluates the precision of disease-specific details, thus offering a clinically meaningful metric for report quality that focuses on elements critical to clinical decision-making and demonstrates a stronger alignment with human assessments. Our approach highlights the effectiveness of structured reports and the importance of a tailored evaluation metric for S-RRG, providing a more clinically relevant measure of report quality.

X-Ray LLM Radiology Report Chest Methodology In Silico Open Dataset Benchmark SOTA GenAI

A Dual Radiomic and Dosiomic Filtering Technique for Locoregional Radiation Pneumonitis Prediction in Breast Cancer Patients

Zhenyu Yang, Qian Chen, Rihui Zhang, Manju Liu, Fengqiu Guo, Minjie Yang, Min Tang, Lina Zhou, Chunhao Wang, Minbin Chen, Fang-Fang Yin

•preprint•Aug 4 2025

Purpose: Radiation pneumonitis (RP) is a serious complication of intensity-modulated radiation therapy (IMRT) for breast cancer patients, underscoring the need for precise and explainable predictive models. This study presents an Explainable Dual-Omics Filtering (EDOF) model that integrates spatially localized dosiomic and radiomic features for voxel-level RP prediction. Methods: A retrospective cohort of 72 breast cancer patients treated with IMRT was analyzed, including 28 who developed RP. The EDOF model consists of two components: (1) dosiomic filtering, which extracts local dose intensity and spatial distribution features from planning dose maps, and (2) radiomic filtering, which captures texture-based features from pre-treatment CT scans. These features are jointly analyzed using the Explainable Boosting Machine (EBM), a transparent machine learning model that enables feature-specific risk evaluation. Model performance was assessed using five-fold cross-validation, reporting area under the curve (AUC), sensitivity, and specificity. Feature importance was quantified by mean absolute scores, and Partial Dependence Plots (PDPs) were used to visualize nonlinear relationships between RP risk and dual-omic features. Results: The EDOF model achieved strong predictive performance (AUC = 0.95 ± 0.01; sensitivity = 0.81 ± 0.05). The most influential features included dosiomic Intensity Mean, dosiomic Intensity Mean Absolute Deviation, and radiomic SRLGLE. PDPs revealed that RP risk increases beyond 5 Gy and rises sharply between 10-30 Gy, consistent with clinical dose thresholds. SRLGLE also captured structural heterogeneity linked to RP in specific lung regions. Conclusion: The EDOF framework enables spatially resolved, explainable RP prediction and may support personalized radiation planning to mitigate pulmonary toxicity.
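Five-fold cross-validation, as used in the Methods above, can be sketched as a generator of train/test index sets over the 72 patients. Splitting at patient level without stratification is an assumption here, as are the name `k_fold_indices` and the seed; the abstract does not describe how the folds were built.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 72 matches the cohort size in the abstract; the split itself is illustrative.
for fold_number, (train, test) in enumerate(k_fold_indices(72), start=1):
    print(fold_number, len(train), len(test))
```

Each patient appears in exactly one test fold, so averaging the per-fold metrics yields the kind of mean and spread reported in the Results.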

CT Classification Chest Retrospective Clinical In Silico Academic Lab Ethics
