Latest Papers on Radiology AI. Tags: Abdominal

A multi-stage training and deep supervision based segmentation approach for 3D abdominal multi-organ segmentation.

Wu P, An P, Zhao Z, Guo R, Ma X, Qu Y, Xu Y, Yu H

•papers•Jul 17 2025

Accurate X-ray computed tomography (CT) image segmentation of the abdominal organs is fundamental for diagnosing abdominal diseases, planning cancer treatment, and formulating radiotherapy strategies. However, the existing deep learning-based models for three-dimensional (3D) CT image abdominal multi-organ segmentation face challenges, including complex organ distribution, scarcity of labeled data, and diversity of organ structures, leading to difficulties in model training and convergence and low segmentation accuracy. To address these issues, a novel segmentation approach based on multi-stage training and a deep supervision model is proposed. It primarily integrates multi-stage training, a pseudo-labeling technique, and a newly developed deep supervision model with an attention mechanism (DLAU-Net), specifically designed for 3D abdominal multi-organ segmentation. The DLAU-Net enhances segmentation performance and model adaptability through an improved network architecture. The multi-stage training strategy accelerates model convergence and enhances generalizability, effectively addressing the diversity of abdominal organ structures. The introduction of pseudo-labeling training alleviates the bottleneck of labeled data scarcity and further improves the model's generalization performance and training efficiency. Experiments were conducted on a large dataset provided by the FLARE 2023 Challenge. Comprehensive ablation studies and comparative experiments were conducted to validate the effectiveness of the proposed method. Our method achieves an average organ accuracy (AVG) of 90.5% and a Dice Similarity Coefficient (DSC) of 89.05% and exhibits exceptional performance in terms of training speed and handling data diversity, particularly in the segmentation tasks of critical abdominal organs such as the liver, spleen, and kidneys, significantly outperforming existing comparative methods.
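For readers unfamiliar with the Dice Similarity Coefficient reported above, the following is a minimal sketch of how a per-organ and mean DSC can be computed from label maps. It is not the authors' evaluation code: the function names, the flattened label lists, and the toy labels (1 and 2) are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def dice_per_label(pred: Sequence[int], truth: Sequence[int],
                   labels: Sequence[int]) -> Dict[int, float]:
    """Return Dice = 2|P ∩ T| / (|P| + |T|) for each label.

    `pred` and `truth` are flattened voxel label arrays of equal length
    (hypothetical inputs; a real pipeline would flatten 3D volumes first).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of voxels")
    scores: Dict[int, float] = {}
    for label in labels:
        pred_count = sum(1 for v in pred if v == label)
        truth_count = sum(1 for v in truth if v == label)
        overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
        denom = pred_count + truth_count
        # Convention assumed here: a label absent from both masks scores 1.0.
        scores[label] = 1.0 if denom == 0 else 2.0 * overlap / denom
    return scores


def mean_dice(scores: Dict[int, float]) -> float:
    """Average the per-label Dice scores into a single summary value."""
    return sum(scores.values()) / len(scores)


# Toy example: 8 voxels, background = 0, two hypothetical organ labels 1 and 2.
pred = [0, 1, 1, 2, 2, 2, 0, 1]
truth = [0, 1, 1, 1, 2, 2, 0, 0]
scores = dice_per_label(pred, truth, labels=[1, 2])
average = mean_dice(scores)
print(scores, average)
```

Averaging per-organ scores, as done here, weights small and large organs equally; whether the reported DSC uses this or a voxel-pooled average is not stated in the abstract.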

CT Segmentation Abdominal Methodology In Silico

Deep learning models for deriving optimised measures of fat and muscle mass from MRI.

Thomas B, Ali MA, Ali FMH, Chung A, Joshi M, Maiguma-Wilson S, Reiff G, Said H, Zalmay P, Berks M, Blackledge MD, O'Connor JPB

•papers•Jul 17 2025

Fat and muscle mass are potential biomarkers of wellbeing and disease in oncology, but clinical measurement methods vary considerably. Here we evaluate the accuracy, precision and ability to track change for multiple deep learning (DL) models that quantify fat and muscle mass from abdominal MRI. Specifically, subcutaneous fat (SF), intra-abdominal fat (VF), external muscle (EM) and psoas muscle (PM) were evaluated using 15 convolutional neural network (CNN)-based and 4 transformer-based deep learning model architectures. There was negligible difference in the accuracy of human observers and all deep learning models in delineating SF or EM. Both of these tissues had excellent repeatability of their delineation. VF was measured most accurately by the human observers, then by CNN-based models, which outperformed transformer-based models. In contrast, PM delineation accuracy and repeatability were poor for all assessments. Repeatability limits of agreement determined when changes measured in individual patients were due to real change rather than test-retest variation. In summary, DL model accuracy and precision of delineating fat and muscle volumes vary between CNN-based and transformer-based models, between different tissues and in some cases with gender. These factors should be considered when investigators deploy deep learning methods to estimate biomarkers of fat and muscle mass.

MRI Segmentation Abdominal Methodology In Silico Academic Lab

Opportunistic computed tomography (CT) assessment of osteoporosis in patients undergoing transcatheter aortic valve replacement (TAVR).

Paukovitsch M, Fechner T, Felbel D, Moerike J, Rottbauer W, Klömpken S, Brunner H, Kloth C, Beer M, Sekuboyina A, Buckert D, Kirschke JS, Sollmann N

•papers•Jul 17 2025

CT-based opportunistic screening using artificial intelligence finds a high prevalence (43%) of osteoporosis in CT scans obtained for planning of transcatheter aortic valve replacement. Thus, opportunistic screening may be a cost-effective way to assess osteoporosis in high-risk populations. Osteoporosis is an underdiagnosed condition associated with fractures and frailty, but may be detected in routine computed tomography (CT) scans. Volumetric bone mineral density (vBMD) was measured in clinical routine thoraco-abdominal CT scans of 207 patients for planning of transcatheter aortic valve replacement (TAVR) using an artificial intelligence (AI)-based algorithm. 43% of patients had osteoporosis (vBMD < 80 mg/cm³ L1-L3) and were older (83.0 {interquartile range [IQR]: 78.0-85.5} vs. 79.0 {IQR: 71.8-84.0} years, p < 0.001), more often female (55.1 vs. 28.8%, p < 0.001), and had a higher Society of Thoracic Surgeons score for mortality (3.0 {IQR: 1.8-4.6} vs. 2.1 {IQR: 1.4-3.2}%, p < 0.001). In addition to lumbar vBMD (58.2 ± 14.7 vs. 106 ± 21.4 mg/cm³, p < 0.001), thoracic vBMD (79.5 ± 17.9 vs. 127.4 ± 26.0 mg/cm³, p < 0.001) was also significantly reduced in these patients and showed high diagnostic accuracy for osteoporosis assessment (area under curve: 0.96, p < 0.001). Osteoporotic patients were significantly more often at risk for falls (40.4 vs. 22.9%, p = 0.007) and required help in activities of daily life (ADL) more frequently (48.3 vs. 33.1%, p = 0.026), while direct-to-home discharges were fewer (88.8 vs. 96.6%, p = 0.026). In-hospital bleeding complications (3.4 vs. 5.1%), stroke (1.1 vs. 2.5%), and death (1.1 vs. 0.8%) were equally low, while in-hospital device success was equally high (94.4 vs. 94.9%, p > 0.05 for all comparisons). However, one-year probability of survival was significantly lower (84.0 vs. 98.2%, log-rank p < 0.01). Applying an AI-based algorithm to TAVR planning CT scans can reveal a high rate of osteoporosis (43% of patients).
Osteoporosis may represent a marker related to frailty and worsened outcome in TAVR patients.

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Predicting ADC map quality from T2-weighted MRI: A deep learning approach for early quality assessment to assist point-of-care.

Brender JR, Ota M, Nguyen N, Ford JW, Kishimoto S, Harmon SA, Wood BJ, Pinto PA, Krishna MC, Choyke PL, Turkbey B

•papers•Jul 17 2025

Poor quality prostate MRI images compromise diagnostic accuracy, with diffusion-weighted imaging and the resulting apparent diffusion coefficient (ADC) maps being particularly vulnerable. These maps are critical for prostate cancer diagnosis, yet current methods relying on standardizing technical parameters fail to consistently ensure image quality. We propose a novel deep learning approach to predict low-quality ADC maps using T2-weighted (T2W) images, enabling real-time corrective interventions during imaging. A multi-site dataset of T2W images and ADC maps from 486 patients, spanning 62 external clinics and in-house imaging, was retrospectively analyzed. A neural network was trained to classify ADC map quality as "diagnostic" or "non-diagnostic" based solely on T2W images. Rectal cross-sectional area measurements were evaluated as an interpretable metric for susceptibility-induced distortions. Analysis revealed limited correlation between individual acquisition parameters and image quality, with horizontal phase encoding significant for T2 imaging (p < 0.001, AUC = 0.6735) and vertical resolution for ADC maps (p = 0.006, AUC = 0.6348). By contrast, the neural network achieved robust performance for ADC map quality prediction from T2 images, with 83 % sensitivity and 90 % negative predictive value in multicenter validation, comparable to single-site models using ADC maps directly. Remarkably, it generalized well to unseen in-house data (94 ± 2 % accuracy). Rectal cross-sectional area correlated with ADC quality (AUC = 0.65), offering a simple, interpretable metric. The probability of low quality, uninterpretable ADC maps can be inferred early in the imaging process by a neural network approach, allowing corrective action to be employed.

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab

Large Language Model-Based Entity Extraction Reliably Classifies Pancreatic Cysts and Reveals Predictors of Malignancy: A Cross-Sectional and Retrospective Cohort Study

Papale, A. J., Flattau, R., Vithlani, N., Mahajan, D., Ziemba, Y., Zavadsky, T., Carvino, A., King, D., Nadella, S.

•preprint•Jul 17 2025

Pancreatic cystic lesions (PCLs) are often discovered incidentally on imaging and may progress to pancreatic ductal adenocarcinoma (PDAC). PCLs have a high incidence in the general population, and adherence to screening guidelines can be variable. With the advent of technologies that enable automated text classification, we sought to evaluate various natural language processing (NLP) tools including large language models (LLMs) for identifying and classifying PCLs from radiology reports. We correlated our classification of PCLs to clinical features to identify risk factors for a positive PDAC biopsy. We contrasted a previously described NLP classifier to LLMs for prospective identification of PCLs in radiology. We evaluated various LLMs for PCL classification into low-risk or high-risk categories based on published guidelines. We compared prompt-based PCL classification to specific entity-guided PCL classification. To this end, we developed tools to deidentify radiology and track patients longitudinally based on their radiology reports. Additionally, we used our newly developed tools to evaluate a retrospective database of patients who underwent pancreas biopsy to determine associated factors including those in their radiology reports and clinical features using multivariable logistic regression modelling. Of 14,574 prospective radiology reports, 665 (4.6%) described a pancreatic cyst, including 175 (1.2%) high-risk lesions. Our Entity-Extraction Large Language Model tool achieved recall 0.992 (95% confidence interval [CI], 0.985-0.998), precision 0.988 (0.979-0.996), and F1-score 0.990 (0.985-0.995) for detecting cysts; F1-scores were 0.993 (0.987-0.998) for low-risk and 0.977 (0.952-0.995) for high-risk classification. Among 4,285 biopsy patients, 330 had pancreatic cysts documented ≥6 months before biopsy.
In the final multivariable model (AUC = 0.877), independent predictors of adenocarcinoma were change in duct caliber with upstream atrophy (adjusted odds ratio [AOR], 4.94; 95% CI, 1.30-18.79), mural nodules (AOR, 11.02; 1.81-67.26), older age (AOR, 1.10; 1.05-1.16), lower body mass index (AOR, 0.86; 0.76-0.96), and total bilirubin (AOR, 1.81; 1.18-2.77). Automated NLP-based analysis of radiology reports using LLM-driven entity extraction can accurately identify and risk-stratify PCLs and, when retrospectively applied, reveal factors predicting malignant progression. Widespread implementation may improve surveillance and enable earlier intervention.
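As a reading aid for the detection metrics quoted in this abstract, here is a small sketch of how precision, recall, and F1 relate. The function names and the confusion-matrix counts are hypothetical; only the precision (0.988) and recall (0.992) values are taken from the abstract.

```python
from typing import Tuple


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Derive all three metrics from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from_precision_recall(precision, recall)


# Consistency check using the precision and recall reported in the abstract.
reported_f1 = round(f1_from_precision_recall(0.988, 0.992), 3)
print(reported_f1)

# Hypothetical counts, for illustration only (not from the study).
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=1)
print(p, r, f1)
```

Rounded to three decimals, the harmonic mean of the reported precision and recall agrees with the reported cyst-detection F1-score of 0.990.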

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab GenAI

Evolving techniques in the endoscopic evaluation and management of pancreas cystic lesions.

Maloof T, Karaisz F, Abdelbaki A, Perumal KD, Krishna SG

•papers•Jul 17 2025

Accurate diagnosis of pancreatic cystic lesions (PCLs) is essential to guide appropriate management and reduce unnecessary surgeries. Despite multiple guidelines in PCL management, a substantial proportion of patients still undergo major resections for benign cysts, and a majority of resected intraductal papillary mucinous neoplasms (IPMNs) show only low-grade dysplasia, leading to significant clinical, financial, and psychological burdens. This review highlights emerging endoscopic approaches that enhance diagnostic accuracy and support organ-sparing, minimally invasive management of PCLs. Recent studies suggest that endoscopic ultrasound (EUS) and its accessory techniques, such as contrast-enhanced EUS and needle-based confocal laser endomicroscopy, as well as next-generation sequencing analysis of cyst fluid, not only accurately characterize PCLs but are also well tolerated and cost-effective. Additionally, emerging therapeutics such as EUS-guided radiofrequency ablation (RFA) and EUS-chemoablation are promising as minimally invasive treatments for high-risk mucinous PCLs in patients who are not candidates for surgery. Accurate diagnosis of PCLs remains challenging, leading to many patients undergoing unnecessary surgery. Emerging endoscopic imaging biomarkers, artificial intelligence analysis, and molecular biomarkers enhance diagnostic precision. Additionally, novel endoscopic ablative therapies offer safe, minimally invasive, organ-sparing treatment options, thereby reducing the healthcare resource burdens associated with overtreatment.

Ultrasound Classification Abdominal Review Concept GenAI

Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction

Zhennan Xiao, Katharine Brudkiewicz, Zhen Yuan, Rosalind Aughwane, Magdalena Sokolska, Joanna Chappell, Trevor Gaunt, Anna L. David, Andrew P. King, Andrew Melbourne

•preprint•Jul 17 2025

Fetal lung maturity is a critical indicator for predicting neonatal outcomes and the need for post-natal intervention, especially for pregnancies affected by fetal growth restriction. Intra-voxel incoherent motion analysis has shown promising results for non-invasive assessment of fetal lung development, but its reliance on manual segmentation is time-consuming, thus limiting its clinical applicability. In this work, we present an automated lung maturity evaluation pipeline for diffusion-weighted magnetic resonance images that consists of a deep learning-based fetal lung segmentation model and a model-fitting lung maturity assessment. A 3D nnU-Net model was trained on manually segmented images selected from the baseline frames of 4D diffusion-weighted MRI scans. The segmentation model demonstrated robust performance, yielding a mean Dice coefficient of 82.14%. Next, voxel-wise model fitting was performed based on both the nnU-Net-predicted and manual lung segmentations to quantify IVIM parameters reflecting tissue microstructure and perfusion. The results suggested no differences between the two. Our work shows that a fully automated pipeline is possible for supporting fetal lung maturity assessment and clinical decision-making.

MRI Segmentation Abdominal Methodology In Silico Academic Lab Reproducibility

The application of super-resolution ultrasound radiomics models in predicting the failure of conservative treatment for ectopic pregnancy.

Zhang M, Sheng J

•papers•Jul 17 2025

Conservative treatment remains a viable option for selected patients with ectopic pregnancy (EP), but failure may lead to rupture and serious complications. Currently, serum β-hCG is the main predictor for treatment outcomes, yet its accuracy is limited. This study aimed to develop and validate a predictive model that integrates radiomic features derived from super-resolution (SR) ultrasound images with clinical biomarkers to improve risk stratification. A total of 228 patients with EP receiving conservative treatment were retrospectively included, with 169 classified as treatment success and 59 as failure. SR images were generated using a deep learning-based generative adversarial network (GAN). Radiomic features were extracted from both normal-resolution (NR) and SR ultrasound images. Features with intraclass correlation coefficient (ICC) ≥ 0.75 were retained after intra- and inter-observer evaluation. Feature selection involved statistical testing and Least Absolute Shrinkage and Selection Operator (LASSO) regression. Random forest algorithms were used to construct NR and SR models. A clinical model based on serum β-hCG was also developed. The Clin-SR model was constructed by fusing SR radiomics with β-hCG values. Model performance was evaluated using area under the curve (AUC), calibration, and decision curve analysis (DCA). An independent temporal validation cohort (n = 40; 20 failures, 20 successes) was used to validate the nomogram derived from the Clin-SR model. The SR model significantly outperformed the NR model in the test cohort (AUC: 0.791 ± 0.015 vs. 0.629 ± 0.083). In a representative iteration, the Clin-SR fusion model achieved an AUC of 0.870 ± 0.015, with good calibration and net clinical benefit, suggesting reliable performance in predicting conservative treatment failure. In the independent validation cohort, the nomogram demonstrated good generalizability with an AUC of 0.808 and consistent calibration across risk thresholds.
Key contributing radiomic features included Gray Level Variance and Voxel Volume, reflecting lesion heterogeneity and size. The Clin-SR model, which integrates deep learning-enhanced SR ultrasound radiomics with serum β-hCG, offers a robust and non-invasive tool for predicting conservative treatment failure in ectopic pregnancy. This multimodal approach enhances early risk stratification and supports personalized clinical decision-making, potentially reducing overtreatment and emergency interventions.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab Breakthrough

An AI method to predict pregnancy loss by extracting biological indicators from embryo ultrasound recordings in early pregnancy.

Liu L, Zang Y, Zheng H, Li S, Song Y, Feng X, Zhang X, Li Y, Cao L, Zhou G, Dong T, Huang Q, Pan T, Deng J, Cheng D

•papers•Jul 17 2025

B-ultrasound results are widely used in early pregnancy loss (EPL) prediction, but there are inevitable intra-observer and inter-observer errors in B-ultrasound results, especially in early pregnancy, which lead to inconsistent assessment of embryonic status and thus affect the judgment of EPL. To address this, we need a rapid and accurate model to predict pregnancy loss in the first trimester. This study aimed to construct an artificial intelligence model to automatically extract biometric parameters from ultrasound videos of early embryos and predict pregnancy loss. This can effectively eliminate the measurement error of B-ultrasound results, accurately predict EPL, and provide decision support for doctors with relatively little clinical experience. A total of 630 ultrasound videos from women with early singleton pregnancies of gestational age between 6 and 10 weeks were used for training. A two-stage artificial intelligence model was established. First, biometric parameters, including gestational sac area (GSA), yolk sac diameter (YSD), crown-rump length (CRL), and fetal heart rate (FHR), were extracted from ultrasound videos by A3F-net, a modified U-Net-based deep neural network that we designed. Then an ensemble learning model predicted pregnancy loss risk based on these features. Dice, IoU, and precision were used to evaluate the measurement results, and sensitivity, AUC, and other metrics were used to evaluate the prediction results. The fetal heart rate was compared with that measured by doctors, and the accuracy of results was compared with other AI models. In the biometric feature measurement stage, the precision of A3F-net for GSA, YSD, and CRL was 98.64%, 96.94%, and 92.83%, respectively, higher than that of the two other models compared. Bland-Altman analysis did not show systematic deviations between doctors and AI. The mean relative error between doctors and the AI model was 0.060 ± 0.057 (mean ± standard deviation).
In the EPL prediction stage, the ensemble learning models demonstrated excellent performance, with CatBoost being the best-performing model, achieving a precision of 98.0% and an AUC of 0.969 (95% CI: 0.962-0.975). In this study, a hybrid AI model to predict EPL was established. First, a deep neural network automatically measured the biometric parameters from ultrasound video to ensure the consistency and accuracy of the measurements, then a machine learning model predicted EPL risk to support doctors in making decisions. The use of our established AI model in EPL prediction has the potential to assist physicians in making more accurate and timely clinical decisions.
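The Bland-Altman analysis mentioned in this abstract compares two measurement methods by looking at their paired differences. The sketch below shows the standard calculation (bias and 95% limits of agreement) together with a mean relative error. It is an illustration only: the function names and the paired heart-rate values are invented and are not data from the study, and the relative-error definition (absolute difference divided by the reference value) is an assumption.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(method_a: Sequence[float],
                 method_b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean paired difference; the 95% limits of agreement are
    bias ± 1.96 × the sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods need the same number of measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def mean_relative_error(reference: Sequence[float],
                        estimate: Sequence[float]) -> float:
    """Average of |estimate - reference| / |reference| (assumed definition)."""
    return mean(abs(e - r) / abs(r) for r, e in zip(reference, estimate))


# Invented fetal heart rate readings in beats per minute, for illustration.
doctor = [150.0, 160.0, 155.0, 170.0]
model = [148.0, 162.0, 156.0, 168.0]

bias, lower, upper = bland_altman(doctor, model)
mre = mean_relative_error(doctor, model)
print(bias, lower, upper, mre)
```

A bias close to zero with limits of agreement that straddle zero is what "no systematic deviation" typically means in this kind of analysis.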

Ultrasound Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

Multimodal Large Language Model With Knowledge Retrieval Using Flowchart Embedding for Forming Follow-Up Recommendations for Pancreatic Cystic Lesions.

Zhu Z, Liu J, Hong CW, Houshmand S, Wang K, Yang Y

•papers•Jul 16 2025

BACKGROUND. The American College of Radiology (ACR) Incidental Findings Committee (IFC) algorithm provides guidance for pancreatic cystic lesion (PCL) management. Its implementation using plain-text large language model (LLM) solutions is challenging given that key components include multimodal data (e.g., figures and tables). OBJECTIVE. The purpose of the study is to evaluate a multimodal LLM approach incorporating knowledge retrieval using flowchart embedding for forming follow-up recommendations for PCL management. METHODS. This retrospective study included patients who underwent abdominal CT or MRI from September 1, 2023, to September 1, 2024, and whose report mentioned a PCL. The reports' Findings sections were inputted to a multimodal LLM (GPT-4o). For task 1 (198 patients: mean age, 69.0 ± 13.0 [SD] years; 110 women, 88 men), the LLM assessed PCL features (presence of PCL, PCL size and location, presence of main pancreatic duct communication, presence of worrisome features or high-risk stigmata) and formed a follow-up recommendation using three knowledge retrieval methods (default knowledge, plain-text retrieval-augmented generation [RAG] from the ACR IFC algorithm PDF document, and flowchart embedding using the LLM's image-to-text conversion for in-context integration of the document's flowcharts and tables). For task 2 (85 patients: mean initial age, 69.2 ± 10.8 years; 48 women, 37 men), an additional relevant prior report was inputted; the LLM assessed for interval PCL change and provided an adjusted follow-up schedule accounting for prior imaging using flowchart embedding. Three radiologists assessed LLM accuracy in task 1 for PCL findings in consensus and follow-up recommendations independently; one radiologist assessed accuracy in task 2. RESULTS. For task 1, the LLM with flowchart embedding had accuracy for PCL features of 98.0-99.0%. 
The accuracy of the LLM follow-up recommendations based on default knowledge, plain-text RAG, and flowchart embedding for radiologist 1 was 42.4%, 23.7%, and 89.9% (p < .001), respectively; radiologist 2 was 39.9%, 24.2%, and 91.9% (p < .001); and radiologist 3 was 40.9%, 25.3%, and 91.9% (p < .001). For task 2, the LLM using flowchart embedding showed an accuracy for interval PCL change of 96.5% and for adjusted follow-up schedules of 81.2%. CONCLUSION. Multimodal flowchart embedding aided the LLM's automated provision of follow-up recommendations adherent to a clinical guidance document. CLINICAL IMPACT. The framework could be extended to other incidental findings through the use of other clinical guidance documents as the model input.

Mixed Modality LLM Radiology Report Abdominal Retrospective Clinical In Silico Academic Lab GenAI
