Latest Papers on Radiology AI. Tags: In Silico

Machine Learning and MRI-Based Whole-Organ Magnetic Resonance Imaging Score (WORMS): A Novel Approach to Enhancing Genicular Artery Embolization Outcomes in Knee Osteoarthritis.

Dablan A, Özgül H, Arslan MF, Türksayar O, Cingöz M, Mutlu IN, Erdim C, Guzelbey T, Kılıckesmez O

•papers•Aug 4 2025

To evaluate the feasibility of machine learning (ML) models using preprocedural MRI-based Whole-Organ Magnetic Resonance Imaging Score (WORMS) and clinical parameters to predict treatment response after genicular artery embolization (GAE) in patients with knee osteoarthritis (OA). This retrospective study included 66 patients (72 knees) who underwent GAE between December 2022 and June 2024. Preprocedural assessments included WORMS and Kellgren-Lawrence grading. Clinical response was defined as a ≥ 50% reduction in Visual Analog Scale (VAS) score. Feature selection was performed using recursive feature elimination and correlation analysis. Multiple ML algorithms (Random Forest, Support Vector Machine, Logistic Regression) were trained using stratified fivefold cross-validation. Conventional statistical analyses assessed group differences and correlations. Of 72 knees, 33 (45.8%) achieved a clinically significant response. Responders showed significantly lower WORMS values for cartilage, bone marrow, and total joint damage (p < 0.05). The Random Forest model demonstrated the best performance, with an accuracy of 81.8%, AUC-ROC of 86.2%, sensitivity of 90%, and specificity of 75%. Key predictive features included total WORMS, ligament score, and baseline VAS. Bone marrow score showed the strongest correlation with VAS reduction (r = -0.430, p < 0.001). ML models integrating WORMS and clinical data suggest that greater cartilage loss, bone marrow edema, joint damage, and higher baseline VAS scores may help to identify patients less likely to respond to GAE for knee OA.
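The response criterion in this abstract (a reduction of at least 50% in VAS score) can be written as a small helper. This is a minimal sketch, not the authors' code: the function names `is_responder` and `response_rate` are hypothetical, and the abstract does not say at which follow-up time point VAS was measured.

```python
def is_responder(vas_before: float, vas_after: float, threshold: float = 0.5) -> bool:
    """Return True when VAS fell by at least `threshold` (a fraction of baseline)."""
    if vas_before <= 0:
        raise ValueError("baseline VAS must be positive")
    reduction = (vas_before - vas_after) / vas_before
    return reduction >= threshold


def response_rate(pairs) -> float:
    """Fraction of (vas_before, vas_after) pairs that meet the responder criterion."""
    responders = sum(is_responder(before, after) for before, after in pairs)
    return responders / len(pairs)


# The abstract reports 33 responders out of 72 knees.
reported_rate_percent = round(33 / 72 * 100, 1)
```

With the reported counts, `reported_rate_percent` reproduces the 45.8% response rate stated above.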

MRI Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab

The Use of Artificial Intelligence to Improve Detection of Acute Incidental Pulmonary Emboli.

Kuzo RS, Levin DL, Bratt AK, Walkoff LA, Suman G, Houghton DE

•papers•Aug 4 2025

Incidental pulmonary emboli (IPE) are frequently overlooked by radiologists. Artificial intelligence (AI) algorithms have been developed to aid detection of pulmonary emboli. To measure diagnostic performance of AI compared with prospective interpretation by radiologists. A commercially available AI algorithm was used to retrospectively review 14,453 contrast-enhanced outpatient CT CAP exams in 9171 patients where PE was not clinically suspected. Natural language processing (NLP) searches of reports identified IPE detected prospectively. Thoracic radiologists reviewed all cases read as positive by AI or NLP to confirm IPE and assess the most proximal level of clot and overall clot burden. 1,400 cases read as negative by both the initial radiologist and AI were re-reviewed to assess for additional IPE. Radiologists prospectively detected 218 IPE and AI detected an additional 36 unreported cases. AI missed 30 cases of IPE detected by the radiologist and had 94 false positives. For 36 IPE missed by the radiologist, median clot burden was 1 and 19 were solitary segmental or subsegmental. For 30 IPE missed by AI, one case had large central emboli and the others were small with 23 solitary subsegmental emboli. Radiologist re-review of 1,400 exams interpreted as negative found 8 additional cases of IPE. Compared with radiologists, AI had similar sensitivity but reduced positive predictive value. Our experience indicates that the AI tool is not ready to be used autonomously without human oversight, but a human observer plus AI is better than either alone for detection of incidental pulmonary emboli.
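The counts in this abstract are enough for a rough check of the "similar sensitivity but reduced positive predictive value" statement. The sketch below assumes the reference standard is the union of confirmed IPE found by either reader (218 + 36) and ignores the 8 extra cases found in the re-reviewed sample of negatives, so the paper's own figures may differ. The function name is hypothetical.

```python
def ipe_metrics(radiologist_detected=218, ai_only=36, ai_missed=30, ai_false_positives=94):
    """Estimate sensitivity and PPV from the counts quoted in the abstract.

    Assumption: confirmed IPE = cases found by the radiologist plus
    the additional cases found only by AI.
    """
    confirmed_ipe = radiologist_detected + ai_only      # 254 confirmed cases
    ai_true_positives = confirmed_ipe - ai_missed       # 224 found by AI
    return {
        "ai_sensitivity": ai_true_positives / confirmed_ipe,
        "radiologist_sensitivity": radiologist_detected / confirmed_ipe,
        "ai_ppv": ai_true_positives / (ai_true_positives + ai_false_positives),
    }


metrics = ipe_metrics()
```

Under these assumptions the AI sensitivity is about 0.88, the radiologist sensitivity about 0.86, and the AI positive predictive value about 0.70.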

CT Detection Chest Retrospective Clinical In Silico Academic Lab

Do Edges Matter? Investigating Edge-Enhanced Pre-Training for Medical Image Segmentation

Paul Zaha, Lars Böcking, Simeon Allmendinger, Leopold Müller, Niklas Kühl

•preprint•Aug 4 2025

Medical image segmentation is crucial for disease diagnosis and treatment planning, yet developing robust segmentation models often requires substantial computational resources and large datasets. Existing research shows that pre-trained and finetuned foundation models can boost segmentation performance. However, questions remain about how particular image preprocessing steps may influence segmentation performance across different medical imaging modalities. In particular, edges (abrupt transitions in pixel intensity) are widely acknowledged as vital cues for object boundaries but have not been systematically examined in the pre-training of foundation models. We address this gap by investigating to what extent pre-training with data processed using computationally efficient edge kernels, such as Kirsch, can improve cross-modality segmentation capabilities of a foundation model. Two versions of a foundation model are first trained on either raw or edge-enhanced data across multiple medical imaging modalities, then finetuned on selected raw subsets tailored to specific medical modalities. After systematic investigation using the medical domains Dermoscopy, Fundus, Mammography, Microscopy, OCT, US, and XRay, we discover both increased and reduced segmentation performance across modalities using edge-focused pre-training, indicating the need for a selective application of this approach. To guide such selective applications, we propose a meta-learning strategy. It uses standard deviation and image entropy of the raw image to choose between a model pre-trained on edge-enhanced or on raw data for optimal performance. Our experiments show that integrating this meta-learning layer yields an overall segmentation performance improvement across diverse medical imaging tasks by 16.42% compared to models pre-trained on edge-enhanced data only and 19.30% compared to models pre-trained on raw data only.
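To make the two ingredients named in this abstract concrete, here is a minimal pure-Python sketch of the Kirsch compass operator and of the two image statistics (standard deviation and entropy) that the meta-learning step is said to use. It is an illustration under assumptions, not the authors' pipeline: images are small 2D lists of integer intensities, border pixels are left at zero, and the rule that maps the two statistics to a model choice is not given in the abstract, so it is not implemented.

```python
import math
from collections import Counter

# Clockwise positions of the eight border cells of a 3x3 kernel.
BORDER = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# Kirsch weights: three 5s followed by five -3s, rotated to give 8 directions.
BASE_WEIGHTS = [5, 5, 5, -3, -3, -3, -3, -3]


def kirsch_kernels():
    """Build the eight 3x3 Kirsch compass kernels by rotating the border weights."""
    kernels = []
    for shift in range(8):
        kernel = [[0] * 3 for _ in range(3)]
        for i, (r, c) in enumerate(BORDER):
            kernel[r][c] = BASE_WEIGHTS[(i - shift) % 8]
        kernels.append(kernel)
    return kernels


def kirsch_edges(image):
    """Per-pixel maximum response over the 8 kernels; border pixels stay 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    kernels = kirsch_kernels()
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses = []
            for k in kernels:
                responses.append(sum(
                    k[dy][dx] * image[y + dy - 1][x + dx - 1]
                    for dy in range(3) for dx in range(3)
                ))
            out[y][x] = max(responses)
    return out


def std_and_entropy(image):
    """Population standard deviation and Shannon entropy (bits) of pixel values."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())
    return std, entropy
```

Because each kernel's weights sum to zero, a flat region yields a zero response, while an intensity step produces a large one.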

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA

S-RRG-Bench: Structured Radiology Report Generation with Fine-Grained Evaluation Framework

Yingshu Li, Yunyi Liu, Zhanyu Wang, Xinyu Liang, Lingqiao Liu, Lei Wang, Luping Zhou

•preprint•Aug 4 2025

Radiology report generation (RRG) for diagnostic images, such as chest X-rays, plays a pivotal role in both clinical practice and AI. Traditional free-text reports suffer from redundancy and inconsistent language, complicating the extraction of critical clinical details. Structured radiology report generation (S-RRG) offers a promising solution by organizing information into standardized, concise formats. However, existing approaches often rely on classification or visual question answering (VQA) pipelines that require predefined label sets and produce only fragmented outputs. Template-based approaches, which generate reports by replacing keywords within fixed sentence patterns, further compromise expressiveness and often omit clinically important details. In this work, we present a novel approach to S-RRG that includes dataset construction, model training, and the introduction of a new evaluation framework. We first create a robust chest X-ray dataset (MIMIC-STRUC) that includes disease names, severity levels, probabilities, and anatomical locations, ensuring that the dataset is both clinically relevant and well-structured. We train an LLM-based model to generate standardized, high-quality reports. To assess the generated reports, we propose a specialized evaluation metric (S-Score) that not only measures disease prediction accuracy but also evaluates the precision of disease-specific details, thus offering a clinically meaningful metric for report quality that focuses on elements critical to clinical decision-making and demonstrates a stronger alignment with human assessments. Our approach highlights the effectiveness of structured reports and the importance of a tailored evaluation metric for S-RRG, providing a more clinically relevant measure of report quality.

X-Ray LLM Radiology Report Chest Methodology In Silico Open Dataset Benchmark SOTA GenAI

Development and Validation of an Explainable MRI-Based Habitat Radiomics Model for Predicting p53-Abnormal Endometrial Cancer: A Multicentre Feasibility Study.

Jin W, Zhang H, Ning Y, Chen X, Zhang G, Li H, Zhang H

•papers•Aug 4 2025

We developed an MRI-based habitat radiomics model (HRM) to predict p53-abnormal (p53abn) molecular subtypes of endometrial cancer (EC). Patients with pathologically confirmed EC were retrospectively enrolled from three hospitals and categorized into a training cohort (n = 270), test cohort 1 (n = 70), and test cohort 2 (n = 154). The tumour was divided into habitat sub-regions using diffusion-weighted imaging (DWI) and contrast-enhanced (CE) images with the K-means algorithm. Radiomics features were extracted from T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), DWI, and CE images. Three machine learning classifiers (logistic regression, support vector machines, and random forests) were applied to develop predictive models for p53abn EC. Model performance was validated using receiver operating characteristic (ROC) curves, and the model with the best predictive performance was selected as the HRM. A whole-region radiomics model (WRM) was also constructed, and a clinical model (CM) with five clinical features was developed. The SHApley Additive ExPlanations (SHAP) method was used to explain the outputs of the models. DeLong's test evaluated and compared the performance across the cohorts. A total of 1920 habitat radiomics features were considered. Eight features were selected for the HRM, ten for the WRM, and three clinical features for the CM. The HRM achieved the highest AUC: 0.855 (training), 0.769 (test1), and 0.766 (test2). The AUCs of the WRM were 0.707 (training), 0.703 (test1), and 0.738 (test2). The AUCs of the CM were 0.709 (training), 0.641 (test1), and 0.665 (test2). The MRI-based HRM successfully predicted p53abn EC. The results indicate that habitat combined with machine learning, radiomics, and SHAP can effectively predict p53abn EC, providing clinicians with intuitive insights and interpretability regarding the impact of risk factors in the model.

MRI Classification Abdominal Retrospective Clinical In Silico Benchmark SOTA GenAI

A Novel Dual-Output Deep Learning Model Based on InceptionV3 for Radiographic Bone Age and Gender Assessment.

Rayed B, Amasya H, Sezdi M

•papers•Aug 4 2025

Hand-wrist radiographs are used in bone age prediction. Computer-assisted clinical decision support systems offer solutions to the limitations of the radiographic bone age assessment methods. In this study, a multi-output prediction model was designed to predict bone age and gender using digital hand-wrist radiographs. The InceptionV3 architecture was used as the backbone, and the model was trained and tested using the open-access dataset of the 2017 RSNA Pediatric Bone Age Challenge. A total of 14,048 samples were divided into training, validation, and testing subsets in a ratio of 7:2:1, and additional specialized convolutional neural network layers, such as a Squeeze-and-Excitation block, were implemented for robust feature management. The proposed model achieved a mean squared error of approximately 25 and a mean absolute error of 3.1 for predicting bone age. In gender classification, an accuracy of 95% and an area under the curve of 97% were achieved. The intra-class correlation coefficient for the continuous bone age predictions was found to be 0.997, while the Cohen's κ coefficient for the gender predictions was found to be 0.898 (p < 0.001). The proposed model aims to increase model efficiency by identifying common and discrete features. Based on the results, the proposed algorithm is promising; however, the mid-high-end hardware requirement may be a limitation for its use on local machines in the clinic. Future studies may consider increasing the dataset and simplifying the algorithms.
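For readers less familiar with the agreement statistic reported for the gender predictions, Cohen's κ compares observed agreement with the agreement expected by chance. The sketch below computes it from a confusion matrix. The example matrix is made up for illustration and is not data from the study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: truth, columns: prediction)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 matrix: 90 of 100 predictions agree with the reference.
example_kappa = cohens_kappa([[45, 5], [5, 45]])
```

In this made-up example the observed agreement is 0.9 and the chance agreement is 0.5, which gives κ of about 0.8.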

X-Ray Classification Musculoskeletal Methodology In Silico Academic Lab

Open-radiomics: a collection of standardized datasets and a technical protocol for reproducible radiomics machine learning pipelines.

Namdar K, Wagner MW, Ertl-Wagner BB, Khalvati F

•papers•Aug 4 2025

As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges, namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol to investigate the effects of radiomics feature extraction on the reproducibility of the results. We curated large-scale radiomics datasets based on three open-source datasets: BraTS 2020 for high-grade glioma (HGG) versus low-grade glioma (LGG) classification and survival analysis, BraTS 2023 for O6-methylguanine-DNA methyltransferase (MGMT) classification, and non-small cell lung cancer (NSCLC) survival analysis from the Cancer Imaging Archive (TCIA). We used the BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset to demonstrate how our proposed technical protocol could be utilized in radiomics-based studies. The cohort includes 369 adult patients with brain tumors (76 LGG and 293 HGG). Using the PyRadiomics library for LGG vs. HGG classification, we created 288 radiomics datasets: the combinations of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions. We used Random Forest classifiers, and for each radiomics dataset, we repeated the training-validation-test (60%/20%/20%) experiment with different data splits and model random states 100 times (28,800 test results) and calculated the Area Under the Receiver Operating Characteristic Curve (AUROC). Unlike binWidth and image normalization, the tumor subregion and imaging sequence significantly affected performance of the models. The T1 contrast-enhanced sequence and the union of the necrotic and non-enhancing tumor core subregions resulted in the highest AUROCs (average test AUROC 0.951, 95% confidence interval of (0.949, 0.952)). Although several settings and data splits (28 out of 28,800) yielded a test AUROC of 1, they were irreproducible. Our experiments demonstrate that sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficially perfect performance that is irreproducible.
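The dataset count in this abstract follows from a full factorial grid: 4 sequences × 3 binWidths × 6 normalizations × 4 subregions = 288 settings, each repeated 100 times. A short sketch makes the bookkeeping explicit. The sequence names are the four standard BraTS sequences; the binWidth values, normalization names, and subregion labels are placeholders, since the abstract does not list them.

```python
from itertools import product

sequences = ["T1", "T1CE", "T2", "FLAIR"]               # the four BraTS MRI sequences
bin_widths = ["bw_a", "bw_b", "bw_c"]                   # placeholder labels (3 binWidths)
normalizations = [f"norm_{i}" for i in range(1, 7)]     # placeholder labels (6 methods)
subregions = [f"region_{i}" for i in range(1, 5)]       # placeholder labels (4 subregions)

# One radiomics dataset per combination of settings.
settings = list(product(sequences, bin_widths, normalizations, subregions))

repeats_per_setting = 100
total_test_results = len(settings) * repeats_per_setting
```

Here `len(settings)` is 288 and `total_test_results` is 28,800, matching the figures quoted in the abstract.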

MRI Classification Neurological Dataset Release In Silico Academic Lab Open Dataset Reproducibility

Can Machine Learning Predict Metastatic Sites in Pancreatic Ductal Adenocarcinoma? A Radiomic Analysis.

Spoto F, De Robertis R, Cardobi N, Garofano A, Messineo L, Lucin E, Milella M, D'Onofrio M

•papers•Aug 4 2025

Pancreatic ductal adenocarcinoma (PDAC) exhibits high metastatic potential, with distinct prognoses based on metastatic sites. Radiomics enables quantitative imaging analysis for predictive modeling. To evaluate the feasibility of radiomic models in predicting PDAC metastatic patterns, specifically distinguishing between hepatic and pulmonary metastases. This retrospective study included 115 PDAC patients with either liver (n = 94) or lung (n = 21) metastases. Radiomic features were extracted from pancreatic arterial and venous phase CT scans of primary tumors using PyRadiomics. Two radiologists independently segmented tumors for inter-reader reliability assessment. Features with ICC > 0.9 underwent LASSO regularization for feature selection. Class imbalance was addressed using SMOTE and class weighting. Model performance was evaluated using fivefold cross-validation and bootstrap resampling. The multivariate logistic regression model achieved an AUC-ROC of 0.831 (95% CI: 0.752-0.910). At the optimal threshold, sensitivity was 0.762 (95% CI: 0.659-0.865) and specificity was 0.787 (95% CI: 0.695-0.879). The negative predictive value for lung metastases was 0.810 (95% CI: 0.734-0.886). LargeDependenceEmphasis showed a trend toward significance (p = 0.0566) as a discriminative feature. Precision was 0.842, recall 0.762, and F1 score 0.800. Radiomic analysis of primary pancreatic tumors demonstrates potential for predicting hepatic versus pulmonary metastatic patterns. The high negative predictive value for lung metastases may support clinical decision-making. External validation is essential before clinical implementation. These findings from a single-center study require confirmation in larger, multicenter cohorts.
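The F1 score quoted in this abstract can be checked from the reported precision and recall, since F1 is their harmonic mean. The helper below is a generic sketch with a hypothetical name, not code from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract: precision 0.842, recall 0.762.
reported_f1 = round(f1_score(0.842, 0.762), 3)
```

The result rounds to 0.800, consistent with the reported F1 score.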

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab

Machine learning of whole-brain resting-state fMRI signatures for individualized grading of frontal gliomas.

Hu Y, Cao X, Chen H, Geng D, Lv K

•papers•Aug 4 2025

Accurate preoperative grading of gliomas is critical for therapeutic planning and prognostic evaluation. We developed a noninvasive machine learning model leveraging whole-brain resting-state functional magnetic resonance imaging (rs-fMRI) biomarkers to discriminate high-grade gliomas (HGGs) from low-grade gliomas (LGGs) in the frontal lobe. This retrospective study included 138 patients (78 LGGs, 60 HGGs) with left frontal gliomas. A total of 7134 features were extracted from the mean amplitude of low-frequency fluctuation (mALFF), mean fractional ALFF, mean percentage amplitude of fluctuation (mPerAF), and mean regional homogeneity (mReHo) maps and the resting-state functional connectivity (RSFC) matrix. Twelve predictive features were selected through the Mann-Whitney U test, correlation analysis, and the least absolute shrinkage and selection operator method. The patients were stratified and randomized into training and testing datasets with a 7:3 ratio. Logistic regression, random forest, support vector machine (SVM), and adaptive boosting algorithms were used to establish models. Model performance was evaluated using the area under the receiver operating characteristic curve, accuracy, sensitivity, and specificity. The selected 12 features included 7 RSFC features, 4 mPerAF features, and 1 mReHo feature. Based on these features, the model established using the SVM had the best performance. The accuracy in the training and testing datasets was 0.957 and 0.727, respectively. The areas under the receiver operating characteristic curves were 0.972 and 0.799, respectively. Our whole-brain rs-fMRI radiomics approach provides an objective tool for preoperative glioma stratification. The biological interpretability of selected features reflects distinct neuroplasticity patterns between LGGs and HGGs, advancing understanding of glioma-network interactions.
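A stratified 7:3 split keeps the LGG/HGG ratio roughly equal in both subsets. The sketch below shows one way to do this with the standard library, under assumptions: the function name, the seed, and the per-class rounding are illustrative choices, and the abstract does not report the exact subset sizes the authors obtained.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                          # randomize within each class
        cut = round(len(indices) * train_fraction)    # per-class training size
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Class counts from the abstract: 78 LGGs and 60 HGGs.
labels = ["LGG"] * 78 + ["HGG"] * 60
train_idx, test_idx = stratified_split(labels)
```

With this rounding rule, the split gives 97 training and 41 testing patients (55 LGG and 42 HGG in training).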

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Enhanced detection of ovarian cancer using AI-optimized 3D CNNs for PET/CT scan analysis.

Sadeghi MH, Sina S, Faghihi R, Alavi M, Giammarile F, Omidi H

•papers•Aug 4 2025

This study investigates how deep learning (DL) can enhance ovarian cancer diagnosis and staging using large imaging datasets. Specifically, we compare six conventional convolutional neural network (CNN) architectures (ResNet, DenseNet, GoogLeNet, U-Net, VGG, and AlexNet) with OCDA-Net, an enhanced model designed for [18F]FDG PET image analysis. The OCDA-Net, an advancement on the ResNet architecture, was thoroughly compared using randomly split datasets of training (80%), validation (10%), and test (10%) images. Trained over 100 epochs, OCDA-Net achieved superior diagnostic classification with an accuracy of 92%, and staging results of 94%, supported by robust precision, recall, and F-measure metrics. Grad-CAM++ heat-maps confirmed that the network attends to hyper-metabolic lesions, supporting clinical interpretability. Our findings show that OCDA-Net outperforms existing CNN models and has strong potential to transform ovarian cancer diagnosis and staging. The study suggests that implementing these DL models in clinical practice could ultimately improve patient prognoses. Future research should expand datasets, enhance model interpretability, and validate these models in clinical settings.

Mixed Modality Classification Abdominal Methodology In Silico Academic Lab Benchmark SOTA
