
Deep Learning for Automated Measures of SUV and Molecular Tumor Volume in [<sup>68</sup>Ga]PSMA-11 or [<sup>18</sup>F]DCFPyL, [<sup>18</sup>F]FDG, and [<sup>177</sup>Lu]Lu-PSMA-617 Imaging with Global Threshold Regional Consensus Network.

Jackson P, Buteau JP, McIntosh L, Sun Y, Kashyap R, Casanueva S, Ravi Kumar AS, Sandhu S, Azad AA, Alipour R, Saghebi J, Kong G, Jewell K, Eifer M, Bollampally N, Hofman MS

PubMed · Sep 18 2025
Metastatic castration-resistant prostate cancer has a high rate of mortality with a limited number of effective treatments after hormone therapy. Radiopharmaceutical therapy with [<sup>177</sup>Lu]Lu-prostate-specific membrane antigen-617 (LuPSMA) is one treatment option; however, response varies and is partly predicted by PSMA expression and metabolic activity, assessed on [<sup>68</sup>Ga]PSMA-11 or [<sup>18</sup>F]DCFPyL and [<sup>18</sup>F]FDG PET, respectively. Automated methods to measure these on PET imaging have previously yielded modest accuracy. Refining computational workflows and standardizing approaches may improve patient selection and prognostication for LuPSMA therapy. <b>Methods:</b> PET/CT and quantitative SPECT/CT images from an institutional cohort of patients staged for LuPSMA therapy were annotated for total disease burden. In total, 676 [<sup>68</sup>Ga]PSMA-11 or [<sup>18</sup>F]DCFPyL PET, 390 [<sup>18</sup>F]FDG PET, and 477 LuPSMA SPECT images were used to develop the automated workflow, which was tested on 56 cases with externally referred PET/CT staging. A segmentation framework, the Global Threshold Regional Consensus Network, was developed based on nnU-Net, with processing refinements to improve boundary definition and overall label accuracy. <b>Results:</b> Using the model to contour disease extent, the mean volumetric Dice similarity coefficient was 0.94 for [<sup>68</sup>Ga]PSMA-11 or [<sup>18</sup>F]DCFPyL PET, 0.84 for [<sup>18</sup>F]FDG PET, and 0.97 for LuPSMA SPECT. On external test cases, Dice accuracy was 0.95 and 0.84 on PSMA and FDG PET, respectively. The refined models yielded consistent improvements over nnU-Net, with an increase of 3%-5% in Dice accuracy and 10%-17% in surface agreement.
Quantitative biomarkers were compared with a human-defined ground truth using the Pearson coefficient, with scores for [<sup>68</sup>Ga]PSMA-11 or [<sup>18</sup>F]DCFPyL, [<sup>18</sup>F]FDG, and LuPSMA, respectively, of 0.98, 0.94, and 0.99 for disease volume; 0.98, 0.88, and 0.99 for SUV<sub>mean</sub>; 0.96, 0.91, and 0.99 for SUV<sub>max</sub>; and 0.97, 0.96, and 0.99 for volume intensity product. <b>Conclusion:</b> Delineation of disease extent and tracer avidity can be performed with a high degree of accuracy using automated deep learning methods. By incorporating threshold-based postprocessing, the tools can closely match the output of manual workflows. Pretrained models and scripts to adapt to institutional data are provided for open use.
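The volumetric Dice similarity coefficients reported above compare a predicted contour against a reference mask. For readers unfamiliar with the metric, a minimal sketch follows (the function name and the empty-mask convention are illustrative choices, not taken from the paper):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Volumetric Dice similarity between two binary masks: 2|A∩B| / (|A|+|B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * float(intersection) / float(denom)
```

A Dice of 0.94, as reported for PSMA PET, means the overlap volume is nearly twice divided by the sum of the two mask volumes, i.e. the contours agree almost voxel for voxel.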

Automating classification of treatment responses to combined targeted therapy and immunotherapy in HCC.

Quan B, Dai M, Zhang P, Chen S, Cai J, Shao Y, Xu P, Li P, Yu L

PubMed · Sep 17 2025
Tyrosine kinase inhibitors (TKIs) combined with immunotherapy regimens are now widely used for treating advanced hepatocellular carcinoma (HCC), but their clinical efficacy is limited to a subset of patients. Considering that the vast majority of advanced HCC patients lose the opportunity for liver resection and thus cannot provide tumor tissue samples, we leveraged the clinical and image data to construct a multimodal convolutional neural network (CNN)-Transformer model for predicting and analyzing tumor response to TKI-immunotherapy. An automatic liver tumor segmentation system, based on a two-stage 3D U-Net framework, delineates lesions by first segmenting the liver parenchyma and then precisely localizing the tumor. This approach effectively addresses the variability in clinical data and significantly reduces bias introduced by manual intervention. Thus, we developed a clinical model using only pre-treatment clinical information, a CNN model using only pre-treatment magnetic resonance imaging data, and an advanced multimodal CNN-Transformer model that fused imaging and clinical parameters using a training cohort (n = 181) and then validated them using an independent cohort (n = 30). In the validation cohort, the area under the curve (95% confidence interval) values were 0.720 (0.710-0.731), 0.695 (0.683-0.707), and 0.785 (0.760-0.810), respectively, indicating that the multimodal model significantly outperformed the single-modality baseline models across validations. Finally, single-cell sequencing with the surgical tumor specimens reveals tumor ecosystem diversity associated with treatment response, providing a preliminary biological validation for the prediction model. In summary, this multimodal model effectively integrates imaging and clinical features of HCC patients, has a superior performance in predicting tumor response to TKI-immunotherapy, and provides a reliable tool for optimizing personalized treatment strategies.
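The AUC values above are reported with 95% confidence intervals. One common, simple way to obtain such intervals for a validation cohort is a patient-level percentile bootstrap; a sketch under that assumption (not necessarily the authors' procedure, and the function and parameter names are illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile-bootstrap CI for the ROC AUC,
    resampling patients with replacement."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, n)
        if len(np.unique(y_true[idx])) < 2:
            continue  # a resample needs both classes for AUC to be defined
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, y_score), lo, hi
```

With only 30 validation patients, as in this study, such intervals are typically wide, which makes the reported non-overlap between the multimodal and single-modality models more meaningful.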

Habitat-aware radiomics and adaptive 2.5D deep learning predict treatment response and long-term survival in ESCC patients undergoing neoadjuvant chemoimmunotherapy.

Gao X, Yang L, She T, Wang F, Ding H, Lu Y, Xu Y, Wang Y, Li P, Duan X, Leng X

PubMed · Sep 17 2025
Current radiomic approaches inadequately resolve spatial intratumoral heterogeneity (ITH) in esophageal squamous cell carcinoma (ESCC), limiting neoadjuvant chemoimmunotherapy (NACI) response prediction. We propose an interpretable multimodal framework to: (1) quantitatively map intra-/peritumoral heterogeneity via voxel-wise habitat radiomics; (2) model cross-sectional tumor biology using 2.5D deep learning; and (3) establish mechanism-driven biomarkers via SHAP interpretability to identify resistance-linked subregions. This dual-center retrospective study analyzed 269 treatment-naïve ESCC patients with baseline PET/CT (training: n = 144; validation: n = 62; test: n = 63). Habitat radiomics delineated tumor subregions via K-means clustering (Calinski-Harabasz-optimized) on PET/CT, extracting 1,834 radiomic features per modality. A multi-stage pipeline (univariate filtering, mRMR, LASSO regression) selected 32 discriminative features. The 2.5D model aggregated ± 4 peri-tumoral slices, fusing PET/CT via MixUp channels using a fine-tuned ResNet50 (ImageNet-pretrained), with multi-instance learning (MIL) translating slice-level features to patient-level predictions. Habitat features, MIL signatures, and clinical variables were integrated via five-classifier ensemble (ExtraTrees/SVM/RandomForest) and Crossformer architecture (SMOTE-balanced). Validation included AUC, sensitivity, specificity, calibration curves, decision curve analysis (DCA), survival metrics (C-index, Kaplan-Meier), and interpretability (SHAP, Grad-CAM). Habitat radiomics achieved superior validation AUC (0.865, 95% CI: 0.778-0.953), outperforming conventional radiomics (ΔAUC + 3.6%, P < 0.01) and clinical models (ΔAUC + 6.4%, P < 0.001). SHAP identified the invasive front (H2) as dominant predictor (40% of top features), with wavelet_LHH_firstorder_Entropy showing highest impact (SHAP = + 0.42). The 2.5D MIL model demonstrated strong generalizability (validation AUC: 0.861). 
The combined model achieved state-of-the-art test performance (AUC = 0.824, sensitivity = 0.875) with superior calibration (Hosmer-Lemeshow P > 0.800), effective survival stratification (test C-index: 0.809), and 23-41% net benefit improvement in DCA. Integrating habitat radiomics and 2.5D deep learning enables interpretable dual diagnostic-prognostic stratification in ESCC, advancing precision oncology by decoding spatial heterogeneity.
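The habitat delineation described above, K-means on voxel-wise features with the number of clusters chosen by the Calinski-Harabasz criterion, can be sketched as follows. This is a simplified illustration on generic feature vectors; the paper's exact PET/CT feature construction and search range for k are not reproduced:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

def habitat_labels(features: np.ndarray, k_range=range(2, 6), seed=0):
    """Cluster voxel-wise feature vectors into habitats, selecting the number
    of clusters that maximizes the Calinski-Harabasz criterion (higher is better)."""
    best = (-np.inf, None, None)
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features)
        score = calinski_harabasz_score(features, labels)
        if score > best[0]:
            best = (score, k, labels)
    _, best_k, best_labels = best
    return best_k, best_labels
```

Each resulting label map defines a subregion (such as the invasive front highlighted by SHAP), from which radiomic features are then extracted per habitat rather than over the whole tumor.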

Accuracy of Foundation AI Models for Hepatic Macrovesicular Steatosis Quantification in Frozen Sections

Koga, S., Guda, A., Wang, Y., Sahni, A., Wu, J., Rosen, A., Nield, J., Nandish, N., Patel, K., Goldman, H., Rajapakse, C., Walle, S., Kristen, S., Tondon, R., Alipour, Z.

medRxiv preprint · Sep 17 2025
<b>Introduction:</b> Accurate intraoperative assessment of macrovesicular steatosis in donor liver biopsies is critical for transplantation decisions but is often limited by inter-observer variability and freezing artifacts that can obscure histological details. Artificial intelligence (AI) offers a potential solution for standardized and reproducible evaluation. We aimed to evaluate the diagnostic performance of two self-supervised learning (SSL)-based foundation models, Prov-GigaPath and UNI, for classifying macrovesicular steatosis in frozen liver biopsy sections, compared with assessments by surgical pathologists. <b>Methods:</b> We retrospectively analyzed 131 frozen liver biopsy specimens from 68 donors collected between November 2022 and September 2024. Slides were digitized into whole-slide images, tiled into patches, and used to extract embeddings with Prov-GigaPath and UNI; slide-level classifiers were then trained and tested. Intraoperative diagnoses by on-call surgical pathologists were compared with ground truth determined from independent reviews of permanent sections by two liver pathologists. Accuracy was evaluated for both five-category classification and a clinically significant binary threshold (<30% vs. ≥30%). <b>Results:</b> For binary classification, Prov-GigaPath achieved 96.4% accuracy, UNI 85.7%, and surgical pathologists 84.0% (P = .22). In five-category classification, accuracies were lower: Prov-GigaPath 57.1%, UNI 50.0%, and pathologists 58.7% (P = .70). Misclassification primarily occurred in intermediate categories (5%-<30% steatosis). <b>Conclusions:</b> SSL-based foundation models performed comparably to surgical pathologists in classifying macrovesicular steatosis at the clinically relevant <30% vs. ≥30% threshold. These findings support the potential role of AI in standardizing intraoperative evaluation of donor liver biopsies; however, the small sample size limits generalizability and requires validation in larger, balanced cohorts.

Non-iterative and uncertainty-aware MRI-based liver fat estimation using an unsupervised deep learning method.

Meneses JP, Tejos C, Makalic E, Uribe S

PubMed · Sep 17 2025
Liver proton density fat fraction (PDFF), the ratio between fat-only and overall proton densities, is an extensively validated biomarker associated with several diseases. In recent years, numerous deep learning-based methods for estimating PDFF have been proposed to optimize acquisition and post-processing times without sacrificing accuracy, compared to conventional methods. However, the lack of interpretability and the often poor generalizability of these DL-based models undermine the adoption of such techniques in clinical practice. In this work, we propose an Artificial Intelligence-based Decomposition of water and fat with Echo Asymmetry and Least-squares (AI-DEAL) method, designed to estimate both proton density fat fraction (PDFF) and the associated uncertainty maps. Once trained, AI-DEAL performs a one-shot MRI water-fat separation by first calculating the nonlinear confounder variables, R<sub>2</sub><sup>∗</sup> and off-resonance field. It then employs a weighted least squares approach to compute water-only and fat-only signals, along with their corresponding covariance matrix, which are subsequently used to derive the PDFF and its associated uncertainty. We validated our method using in vivo liver CSE-MRI, a fat-water phantom, and a numerical phantom. AI-DEAL demonstrated PDFF biases of 0.25% and -0.12% at two liver ROIs, outperforming state-of-the-art deep learning-based techniques. Although trained using in vivo data, our method exhibited PDFF biases of -3.43% in the fat-water phantom and -0.22% in the numerical phantom with no added noise. The latter bias remained approximately constant when noise was introduced. Furthermore, the estimated uncertainties showed good agreement with the observed errors and the variations within each ROI, highlighting their potential value for assessing the reliability of the resulting PDFF maps.
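Once the nonlinear confounders (off-resonance field and R<sub>2</sub><sup>∗</sup>) are fixed, as the abstract describes, the remaining water/fat separation is a linear least-squares problem. The sketch below shows that final linear step for a single voxel using ordinary least squares and a single-peak fat model; AI-DEAL itself uses weighted least squares with a multi-peak fat spectrum, and all parameter values here are illustrative:

```python
import numpy as np

def separate_water_fat(signal, echo_times, fat_freq_hz, b0_hz, r2star):
    """Linear least-squares water/fat separation for one voxel, assuming the
    nonlinear confounders (off-resonance b0_hz and decay r2star) are known.
    PDFF is the fat fraction of the recovered proton densities."""
    t = np.asarray(echo_times)
    phasor = np.exp((2j * np.pi * b0_hz - r2star) * t)  # shared decay and off-resonance
    design = np.stack([phasor, phasor * np.exp(2j * np.pi * fat_freq_hz * t)], axis=1)
    (water, fat), *_ = np.linalg.lstsq(design, np.asarray(signal), rcond=None)
    pdff = np.abs(fat) / (np.abs(water) + np.abs(fat))
    return water, fat, pdff
```

In the weighted variant, the least-squares covariance matrix of (water, fat) propagates directly into the PDFF uncertainty maps the paper validates.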

Multimodal deep learning integration for predicting renal function outcomes in living donor kidney transplantation: a retrospective cohort study.

Kim JM, Jung H, Kwon HE, Ko Y, Jung JH, Shin S, Kim YH, Kim YH, Jun TJ, Kwon H

PubMed · Sep 17 2025
Accurately predicting post-transplant renal function is essential for optimizing donor-recipient matching and improving long-term outcomes in kidney transplantation (KT). Traditional models using only structured clinical data often fail to account for complex biological and anatomical factors. This study aimed to develop and validate a multimodal deep learning model that integrates computed tomography (CT) imaging, radiology report text, and structured clinical variables to predict 1-year estimated glomerular filtration rate (eGFR) in living donor kidney transplantation (LDKT) recipients. A retrospective cohort of 1,937 LDKT recipients was selected from 3,772 KT cases. Exclusions included deceased donor KT, immunologic high-risk recipients (n = 304), missing CT imaging, early graft complications, and anatomical abnormalities. eGFR at 1 year post-transplant was classified into four categories: >90, 75-90, 60-75, and 45-60 mL/min/1.73 m<sup>2</sup>. Radiology reports were embedded using BioBERT, while CT videos were encoded using a CLIP-based visual extractor. These were fused with structured clinical features and input into ensemble classifiers including XGBoost. Model performance was evaluated using cross-validation and SHapley Additive exPlanations (SHAP) analysis. The full multimodal model achieved a macro F1 score of 0.675, a micro F1 score of 0.704, and a weighted F1 score of 0.698, substantially outperforming the clinical-only model (macro F1 = 0.292). CT imaging contributed more than text data (clinical + CT macro F1 = 0.651; clinical + text = 0.486). The model showed the highest accuracy in the >90 (F1 = 0.7773) and 60-75 (F1 = 0.7303) categories. SHAP analysis identified donor age, BMI, and donor sex as key predictors. Dimensionality reduction confirmed internal feature validity. Multimodal deep learning integrating clinical, imaging, and textual data enhances prediction of post-transplant renal function.
This framework offers a robust and interpretable approach for individualized risk stratification in LDKT, supporting precision medicine in transplantation.

Augmenting conventional criteria: a CT-based deep learning radiomics nomogram for early recurrence risk stratification in hepatocellular carcinoma after liver transplantation.

Wu Z, Liu D, Ouyang S, Hu J, Ding J, Guo Q, Gao J, Luo J, Ren K

PubMed · Sep 17 2025
We developed a deep learning radiomics nomogram (DLRN) using CT scans to improve clinical decision-making and risk stratification for early recurrence of hepatocellular carcinoma (HCC) after transplantation, which typically has a poor prognosis. In this two-center study, 245 HCC patients who had contrast-enhanced CT before liver transplantation were split into a training set (n = 184) and a validation set (n = 61). We extracted radiomics and deep learning features from tumor and peritumoral areas on preoperative CT images. The DLRN was created by combining these features with significant clinical variables using multivariate logistic regression. Its performance was validated against four traditional risk criteria to assess its additional value. The DLRN model showed strong predictive accuracy for early HCC recurrence post-transplant, with AUCs of 0.884 and 0.829 in the training and validation groups. A high DLRN score was associated with a 16.370-fold increase in recurrence risk (95% CI: 7.100-31.690; p < 0.001). Combining the DLRN with the Metro-Ticket 2.0 criteria yielded the best prediction (AUC: training/validation: 0.936/0.863). The CT-based DLRN offers a non-invasive method for predicting early recurrence following liver transplantation in patients with HCC. Furthermore, it provides substantial additional predictive value when combined with traditional prognostic scoring systems. AI-driven predictive models utilizing preoperative CT imaging enable accurate identification of early HCC recurrence risk following liver transplantation, facilitating risk-stratified surveillance protocols and optimized post-transplant management. A CT-based DLRN for predicting early HCC recurrence post-transplant was developed. The DLRN predicted recurrence with high accuracy (AUC: 0.829) and a 16.370-fold increased recurrence risk. Combining the DLRN with the Metro-Ticket 2.0 criteria achieved optimal prediction (AUC: 0.863).
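The nomogram construction described above, combining selected imaging features with clinical variables via multivariate logistic regression, reduces to a standard fitting pipeline. A hedged sketch on hypothetical feature arrays (the function name, feature groups, and preprocessing are assumptions; the authors' feature selection is not reproduced):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_dlrn(radiomic_feats, deep_feats, clinical, labels):
    """Concatenate radiomics, deep-learning, and clinical features per patient
    and fit a multivariate logistic model; its linear predictor plays the role
    of the nomogram score."""
    X = np.hstack([radiomic_feats, deep_feats, clinical])
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X, labels)
    return model
```

In a nomogram presentation, each fitted coefficient becomes a point scale, and the summed points map to a predicted recurrence probability.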

AI-powered insights in pediatric nephrology: current applications and future opportunities.

Nada A, Ahmed Y, Hu J, Weidemann D, Gorman GH, Lecea EG, Sandokji IA, Cha S, Shin S, Bani-Hani S, Mannemuddhu SS, Ruebner RL, Kakajiwala A, Raina R, George R, Elchaki R, Moritz ML

PubMed · Sep 16 2025
Artificial intelligence (AI) is rapidly emerging as a transformative force in pediatric nephrology, enabling improvements in diagnostic accuracy, therapeutic precision, and operational workflows. By integrating diverse datasets-including patient histories, genomics, imaging, and longitudinal clinical records-AI-driven tools can detect subtle kidney anomalies, predict acute kidney injury, and forecast disease progression. Deep learning models, for instance, have demonstrated the potential to enhance ultrasound interpretations, refine kidney biopsy assessments, and streamline pathology evaluations. Coupled with robust decision support systems, these innovations also optimize medication dosing and dialysis regimens, ultimately improving patient outcomes. AI-powered chatbots hold promise for improving patient engagement and adherence, while AI-assisted documentation solutions offer relief from administrative burdens, mitigating physician burnout. However, ethical and practical challenges remain. Healthcare professionals must receive adequate training to harness AI's capabilities, ensuring that such technologies bolster rather than erode the vital doctor-patient relationship. Safeguarding data privacy, minimizing algorithmic bias, and establishing standardized regulatory frameworks are critical for safe deployment. Beyond clinical care, AI can accelerate pediatric nephrology research by identifying biomarkers, enabling more precise patient recruitment, and uncovering novel therapeutic targets. As these tools evolve, interdisciplinary collaborations and ongoing oversight will be key to integrating AI responsibly. Harnessing AI's vast potential could revolutionize pediatric nephrology, championing a future of individualized, proactive, and empathetic care for children with kidney diseases. 
Through strategic collaboration and transparent development, these advanced technologies promise to minimize disparities, foster innovation, and sustain compassionate patient-centered care, shaping a new horizon in pediatric nephrology research and practice.

Concurrent AI assistance with LI-RADS classification for contrast enhanced MRI of focal hepatic nodules: a multi-reader, multi-case study.

Qin X, Huang L, Wei Y, Li H, Wu Y, Zhong J, Jian M, Zhang J, Zheng Z, Xu Y, Yan C

PubMed · Sep 16 2025
The Liver Imaging Reporting and Data System (LI-RADS) assessment is subject to inter-reader variability. The present study aimed to evaluate the impact of an artificial intelligence (AI) system on the accuracy and inter-reader agreement of LI-RADS classification based on contrast-enhanced magnetic resonance imaging among radiologists with varying experience levels. This single-center, multi-reader, multi-case retrospective study included 120 patients with 200 focal liver lesions who underwent abdominal contrast-enhanced magnetic resonance imaging examinations between June 2023 and May 2024. Five radiologists with different experience levels independently assessed LI-RADS classification and imaging features with and without AI assistance. The reference standard was established by consensus between two expert radiologists. Accuracy was used to measure the performance of AI systems and radiologists. Kappa or intraclass correlation coefficient was utilized to estimate inter-reader agreement. The LI-RADS categories were as follows: 33.5% of LR-3 (67/200), 29.0% of LR-4 (58/200), 33.5% of LR-5 (67/200), and 4.0% of LR-M (8/200) cases. The AI system significantly improved the overall accuracy of LI-RADS classification from 69.9 to 80.1% (p < 0.001), with the most notable improvement among junior radiologists from 65.7 to 79.7% (p < 0.001). Inter-reader agreement for LI-RADS classification was significantly higher with AI assistance compared to that without (weighted Cohen's kappa, 0.655 vs. 0.812, p < 0.001). The AI system also enhanced the accuracy and inter-reader agreement for imaging features, including non-rim arterial phase hyperenhancement, non-peripheral washout, and restricted diffusion. Additionally, inter-reader agreement for lesion size measurements improved, with intraclass correlation coefficient changing from 0.857 to 0.951 (p < 0.001). 
The AI system significantly increases accuracy and inter-reader agreement of LI-RADS 3/4/5/M classification, particularly benefiting junior radiologists.
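The inter-reader agreement statistic used above, weighted Cohen's kappa, is available off the shelf. A sketch on hypothetical ratings (the ordinal encoding, including placing LR-M at the top of the scale, and the linear weighting are illustrative simplifications; the paper does not specify these details here):

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical LI-RADS categories from two readers for the same lesions,
# encoded ordinally for illustration: LR-3=0, LR-4=1, LR-5=2, LR-M=3.
reader_a = [0, 1, 2, 2, 1, 0, 3, 2]
reader_b = [0, 1, 2, 1, 1, 0, 3, 2]

# weights="linear" penalizes disagreements in proportion to how many
# categories apart the two readers are, unlike unweighted kappa.
kappa = cohen_kappa_score(reader_a, reader_b, weights="linear")
```

Because the weighting discounts near-miss disagreements, a rise from 0.655 to 0.812 reflects fewer multi-category discrepancies between readers, not merely more exact matches.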

Machine and deep learning for MRI-based quantification of liver iron overload: a systematic review and meta-analysis.

Elhaie M, Koozari A, Alshammari QT

PubMed · Sep 16 2025
Liver iron overload, associated with conditions such as hereditary hemochromatosis and β‑thalassemia major, requires accurate quantification of liver iron concentration (LIC) to guide timely interventions and prevent complications. Magnetic resonance imaging (MRI) is the gold standard for noninvasive LIC assessment, but challenges in protocol variability and diagnostic consistency persist. Machine learning (ML) and deep learning (DL) offer potential to enhance MRI-based LIC quantification, yet their efficacy remains underexplored. This systematic review and meta-analysis evaluates the diagnostic accuracy, algorithmic performance, and clinical applicability of ML and DL techniques for MRI-based LIC quantification in liver iron overload, adhering to PRISMA guidelines. A comprehensive search across PubMed, Embase, Scopus, Web of Science, Cochrane Library, and IEEE Xplore identified studies applying ML/DL to MRI-based LIC quantification. Eligible studies were assessed for diagnostic accuracy (sensitivity, specificity, AUC), LIC quantification precision (correlation, mean absolute error), and clinical applicability (automation, processing time). Methodological quality was evaluated using the QUADAS‑2 tool, with qualitative synthesis and meta-analysis where feasible. Eight studies were included, employing algorithms such as convolutional neural networks (CNNs), radiomics, and fuzzy C‑mean clustering on T2*-weighted and multiparametric MRI. Pooled diagnostic accuracy from three studies showed a sensitivity of 0.79 (95% CI: 0.66-0.88) and specificity of 0.77 (95% CI: 0.64-0.86), with an AUC of 0.84. The DL methods demonstrated high precision (e.g., Pearson's r = 0.999) and automation, reducing processing times to as low as 0.1 s/slice. Limitations included heterogeneity, limited generalizability, and small external validation sets. Both ML and DL enhance MRI-based LIC quantification, offering high accuracy and efficiency. 
Standardized protocols and multicenter validation are needed to ensure clinical scalability and equitable access.
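Pooled sensitivity and specificity figures of the kind reported above are commonly obtained by inverse-variance weighting on the logit scale. The sketch below is a simplified fixed-effect version; diagnostic meta-analyses such as this one typically use bivariate random-effects models instead, so treat it only as an illustration of the weighting idea:

```python
import math

def pool_logit(proportions, ns):
    """Fixed-effect inverse-variance pooling of proportions (e.g. per-study
    sensitivities) on the logit scale, back-transformed at the end."""
    wsum, xsum = 0.0, 0.0
    for p, n in zip(proportions, ns):
        logit = math.log(p / (1 - p))
        var = 1.0 / (n * p * (1 - p))  # delta-method variance of the logit
        weight = 1.0 / var
        wsum += weight
        xsum += weight * logit
    pooled_logit = xsum / wsum
    return 1.0 / (1.0 + math.exp(-pooled_logit))
```

Studies with more patients and proportions nearer 0.5 get larger weights, so a pooled sensitivity of 0.79 is pulled toward the larger, more informative studies rather than being a plain average.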