Latest Papers on Radiology AI.

Leveraging Artificial Intelligence to Transform Thoracic Radiology for Lung Nodules and Lung Cancer: Applications, Challenges, and Future Directions.

Lee G, Cho HH, Jeong DY, Kim JH, Oh YJ, Park SG, Lee HY

•papers•Nov 18 2025

This review traces the historical path of artificial intelligence (AI) methods that have been applied to medical image interpretation. Early AI approaches, which were based on clinical expertise and domain-specific medical knowledge, established the basis for data-driven methods, initiating the radiomics era and leading to the widespread use of deep learning in medical imaging. More recently, transformer architectures, originally developed for natural language processing, have been adapted for medical image analysis. In the first section, we explore the literature on the use of AI, specifically addressing lung nodules and lung cancer. AI has been effective in detecting lung nodules, evaluating their characteristics, and predicting cancer risk, while also addressing technical issues like kernel conversion. In lung cancer, AI has been applied to various clinical needs, including prognosis evaluation, mutation identification, treatment response analysis, operability prediction, treatment-related pneumonitis, and clinical information extraction. In the following section, we explore foundation models, multimodal AI, and a multiomic approach in the field of lung nodules and lung cancer. Finally, as AI models continue to evolve, so too must the approaches for evaluating their real-world utility; thus, we outline relevant methods for evaluating the performance and application of AI in thoracic radiology.

CT Detection Chest Review In Silico Academic Lab GenAI Ethics

The role of interdisciplinary collaboration and artificial intelligence in radiology residency education.

Wang J, Wang L, Yang Z, Zou Q, Liu Y

•papers•Nov 18 2025

Background: Modern medical education demands refined methods, especially in radiology, where accuracy, speed, and clinical decision-making are critical. Purpose: To evaluate the impact of artificial intelligence (AI)-assisted and interdisciplinary educational interventions on residents' theoretical knowledge, confidence in professional skills, and practical clinical abilities. Assessments were conducted at Kirkpatrick Level 2 (Learning) for knowledge. Level 3 (Behavior) and Level 4 (Results) were not assessed in this study due to logistical constraints. Material and Methods: The study was conducted between January and June 2024 at three medical centers in Shenzhen, China. A total of 240 residents were randomly assigned to three groups of 80 each: group 1 received standard training; group 2 participated in interdisciplinary seminars; and group 3 engaged in AI-assisted learning activities. The study included three stages: baseline assessment, core educational intervention, and final evaluation. Statistical analyses included Shapiro-Wilk and Kolmogorov-Smirnov tests for normality, followed by ANOVA and Tukey's post hoc tests for group comparisons. Results: Residents in groups 2 and 3 demonstrated significant improvements across all measured domains. Group 3 (AI-assisted training) showed the greatest gains, with theoretical knowledge increasing by 21.5%, confidence in professional skills by 39.4%, and clinical skill performance by 27.1%. All between-group differences were statistically significant (P < 0.01). Conclusion: The findings underscore the benefit of combining technology-driven exercises with collaborative, multispecialty learning to strengthen clinical competence. Future research should examine how such AI-based interventions influence long-term performance and how they can be adapted to different training environments.
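
The group comparison named in the methods (one-way ANOVA across three training arms) can be illustrated with a minimal sketch. The function name `one_way_anova_f` and the toy scores are hypothetical and are not data from the study; the sketch computes only the F statistic and leaves out the p-value and Tukey's post hoc step.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score lists."""
    k = len(groups)                                   # number of groups
    n_total = sum(len(g) for g in groups)             # total observations
    grand_mean = mean([x for g in groups for x in g])

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy example (hypothetical scores for three small groups).
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat = one_way_anova_f(toy_groups)
```

A large F indicates that the spread between group means is large relative to the spread within groups; in practice a statistics package would also supply the p-value and the pairwise post hoc comparisons.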

Report Generation Prospective Clinical Pilot Academic Lab GenAI

Lymphoma classification with multi-parametric texture analysis of DWI and PET imaging in Hodgkin and non-Hodgkin lymphoma: a pilot study.

Malagi AV, Baidya Kayal E, Kandasamy D, Pushpam D, Khare K, Sharma R, Kumar R, Bakhshi S, Mehndiratta A

•papers•Nov 18 2025

Texture analysis of quantitative IVIM-DKI parameters was investigated for characterizing malignant and benign lymph nodes and distinguishing lymphoma subtypes, specifically Hodgkin lymphoma (HL) and non-Hodgkin lymphoma (NHL). A prospective cohort of twenty-one patients (n = 21) diagnosed with biopsy-proven lymphoma (HL: 13 and NHL: 8) was analyzed. All patients underwent conventional MRI including DWI with 9 b-values (0-2000 s/mm²) at 1.5 T and whole-body FDG-PET/CT. IVIM-DKI parameters were estimated using the IVIM-DKI model with a total-variation (TV) spatial-regularization method (IDTV). Apparent diffusion coefficient (ADC) and standard uptake value (SUV) maps were also calculated. A total of 31 3D texture features (global and second-order) were extracted from the imaging parameter maps within the volumes of interest of malignant and benign lymph nodes. A linear machine learning classifier was developed to distinguish malignant from benign lymph nodes and HL from NHL using the texture features, and the area under the receiver operating characteristic curve (AUROC) was evaluated to assess diagnostic accuracy. Texture parameters of the neighborhood gray-tone difference matrix (NGTDM) in all IVIM-DKI parameters along with ADC demonstrated excellent diagnostic accuracy, showing the highest AUROC of 0.99 (individual highest AUROC by ADC: 0.99; AUROC by all: 0.95-0.99) for distinguishing between malignant and benign lymph nodes. Gray-level co-occurrence matrix (GLCM) and gray-level run-length matrix (GLRLM) features in ADC, diffusion coefficient (D), perfusion coefficient (D*), and perfusion fraction (f) displayed the best AUROC of 0.98 (individual highest AUROC by D: 0.96; AUROC by all: 0.85-0.96) for distinguishing HL from NHL. Texture analysis of IVIM-DKI parameters showed promising diagnostic performance in characterizing HL and NHL. Quantitative IVIM-DKI analysis with TV may have a wide range of applicability for the clinical evaluation of lymphomas.
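
For readers unfamiliar with the AUROC figures quoted above, the following is a minimal sketch of how the metric can be computed from classifier scores. It uses the pairwise-ranking definition (the probability that a positive case scores higher than a negative case, with ties counted as half). The function name `auroc` and the toy scores are hypothetical and unrelated to the study's data.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Toy classifier outputs (hypothetical): malignant vs. benign nodes.
malignant = [0.9, 0.8, 0.4]
benign = [0.5, 0.3, 0.2]
area = auroc(malignant, benign)   # 8 of 9 pairs are ranked correctly
```

An AUROC of 1.0 means every positive case outranks every negative case, while 0.5 corresponds to chance-level ranking.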

Mixed Modality Classification Whole Body Prospective Clinical Pilot

Enhanced Fetal Plane Classification in Ultrasound Imaging via Prototypical Networks and Few-Shot Learning.

Abdillahi MA, Baker MR, Doğru A, Buyrukoğlu S

•papers•Nov 18 2025

Standard fetal-plane ultrasound images are fundamental to prenatal diagnosis, but their automated classification is difficult because labelled data are scarce and class distributions are imbalanced. We present a data-efficient framework that combines a prototypical network with a VGG19 feature extractor (FSL-VGG19) for few-shot learning, and we compare it to four classical convolutional neural networks (CNNs): MobileNetV2, ResNet50, VGG16, and VGG19 on three publicly available fetal-ultrasound datasets: Maternal-Fetal, FPSU 23, and Africa. Five-fold cross-validation was employed in model selection. FSL-VGG19 attained accuracies of 96.88%, 97.80%, and 94.38% on the Maternal-Fetal, FPSU 23, and Africa datasets, respectively, outperforming all the classical CNN baselines by 1.1-24.4 percentage points. The performance rankings were statistically significant (p < 0.05) according to the non-parametric Friedman test, and the Nemenyi post-hoc test confirmed that the superiority of FSL-VGG19 over the baselines was statistically significant. The sensitivity analysis showed a significant positive relationship between K-shot size and accuracy, with a 30.9% difference between one-shot and ten-shot learning on the Maternal-Fetal dataset. Our framework achieved results competitive with recent state-of-the-art approaches while using an order of magnitude fewer labelled images. The proposed few-shot architecture mitigates annotation bottlenecks and class imbalance and provides robust and generalizable fetal-plane classification, which is especially applicable to resource-poor clinical environments.
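
The core idea of a prototypical network can be sketched in a few lines: each class is represented by the mean of its support-set embeddings, and a query is assigned to the nearest prototype. This is a minimal illustration, not the paper's implementation. The 2-D vectors stand in for VGG19 embeddings, the class labels are hypothetical, and squared Euclidean distance is an assumption (the abstract does not state the metric).

```python
def build_prototypes(support):
    """Average the support embeddings of each class into one prototype.

    support: dict mapping class label -> list of embedding vectors.
    """
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [
            sum(v[i] for v in vectors) / len(vectors) for i in range(dim)
        ]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest to the query embedding."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(protos, key=lambda label: sq_dist(query, protos[label]))


# Two-shot toy episode with hypothetical 2-D embeddings.
support_set = {
    "head": [[0.0, 0.0], [0.2, 0.0]],
    "abdomen": [[1.0, 1.0], [1.0, 0.8]],
}
prototypes = build_prototypes(support_set)
predicted = classify([0.2, 0.1], prototypes)
```

Because a prototype is just an average, adding more support examples per class (a larger K) tends to stabilize it, which is consistent with the K-shot sensitivity reported in the abstract.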

Ultrasound Classification Abdominal Methodology In Silico Academic Lab Benchmark SOTA

Transformer-based ultrasound radiomics for novel multi-task automated segmentation and classification of ovarian tumors.

Mutee AF, Al-Hussainy AF, Ibrahim NA, Roopashree R, Chanania K, Karavadi B, Sharma V, Sinha A, Khamidov O, Sameer HN, Salih RM, Adil M, Farhood B

•papers•Nov 18 2025

To develop and evaluate a comprehensive AI-driven pipeline for automated segmentation and multi-class classification of ovarian tumors in ultrasound images using advanced transformer-based models and radiomic analysis. This multi-center, retrospective study included ultrasound data from 1,907 patients at 4 clinical centers, with 1,364 patients used for model training and validation and 543 for external testing. Five transformer-based segmentation models were used to outline tumour regions, namely SegFormer, Swin-UNet, DPT, nnU-Net and HRFormer. Dice Similarity Coefficient (DSC), Hausdorff Distance (HD) and Relative Absolute Volume Difference (RAVD) were used to measure segmentation accuracy. Radiomic features (n = 215) were extracted with the SERA platform, following IBSI guidelines. Mutual Information (MI), LASSO and Recursive Feature Elimination (RFE) were used as feature selection methods. Five classification models were considered: TabTransformer, TabNet, MLP, XGBoost, and KAN. Three clinically significant classification tasks were covered: tumour type, histological subtype, and FIGO staging. The models were evaluated by five-fold cross-validation and external testing. SHAP analysis provided interpretability. SegFormer was the most accurate segmentation model (mean DSC > 91%), which is higher than other U-Net-based models in earlier research. For classification, the TabTransformer with MI-selected features had the highest test accuracies in all tasks: tumour type (92.7%), histological subtype (92.1%), and FIGO stage (92.0%). External test accuracy was over 90% in all tasks. SHAP analysis identified the radiomic features driving classification decisions, which supports clinical transparency. Performance did not differ across centers or imaging conditions.
The proposed pipeline is accurate, generalizable, and interpretable for automated ultrasound-based ovarian cancer diagnostics. It provides a framework for clinical implementation and for future extension to multimodal data to better support decisions.
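
Two of the segmentation metrics named above, DSC and RAVD, have simple definitions that can be sketched on flattened binary masks. This is a minimal illustration under stated assumptions: the masks are toy 0/1 lists, the function names are hypothetical, and an empty-versus-empty comparison is treated as perfect agreement by convention.

```python
def dice(pred, ref):
    """Dice Similarity Coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    overlap = sum(p * r for p, r in zip(pred, ref))
    total = sum(pred) + sum(ref)
    return 2 * overlap / total if total else 1.0


def ravd(pred, ref):
    """Relative Absolute Volume Difference: |V_pred - V_ref| / V_ref."""
    return abs(sum(pred) - sum(ref)) / sum(ref)


# Toy flattened masks (1 = tumour pixel, 0 = background).
predicted_mask = [1, 1, 1, 1, 0, 0]
reference_mask = [0, 1, 1, 1, 0, 0]

dsc_value = dice(predicted_mask, reference_mask)    # overlap 3, sizes 4 and 3
ravd_value = ravd(predicted_mask, reference_mask)   # one extra pixel of three
```

DSC rewards spatial overlap, whereas RAVD only compares total volumes, so a prediction can have a low RAVD and still be misplaced; reporting both, together with a boundary metric such as HD, covers these complementary failure modes.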

Ultrasound Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Abdominal multi-organ segmentation on 3D negative-contrast CT cholangiopancreatography: a comparative study of deep learning methods.

Li B, Zhang J, Fu H, Zhang L, Chen FM, Zhao W, Zou JZ, Zou H, Zhang ZY

•papers•Nov 18 2025

This study aims to automate segmentation of the biliary and pancreatic systems on 3D negative-contrast CT cholangiopancreatography (3D-nCTCP) to improve preoperative planning and diagnosis. We retrospectively collected dual-phase enhanced CT data from 111 patients with malignant low biliary obstruction. Portal phase data were processed and annotated by an expert radiologist. The dataset, comprising 25,700 images, was split into 91 patients for training/validation and 20 patients for testing. Four state-of-the-art segmentation models, namely TransUNet 2D, nnU-Net 2D, Swin-UNETR 2D, and Swin-UNETR 3D, were implemented and quantitatively compared. Model performance was evaluated using the Dice Similarity Coefficient (DSC) and the Average Symmetric Surface Distance (ASSD), with the inter-observer variability (IOV) serving as the clinical benchmark. Across all models, segmentation performance exhibited high accuracy for the liver, with notably lower accuracy for smaller, more complex organs (duodenum, pancreas, biliary system). The Swin-UNETR 3D model demonstrated superior overall segmentation performance, particularly for challenging organs, with competitive stability. Swin-UNETR 3D achieved a mean DSC of 96.12% ± 1.09% and ASSD of 4.60 mm ± 8.25 mm for the liver, mean DSC of 75.31% ± 11.31% and ASSD of 4.42 mm ± 5.84 mm for the duodenum, mean DSC of 81.00% ± 6.33% and ASSD of 2.07 mm ± 1.70 mm for the pancreas, and mean DSC of 88.64% ± 6.74% and ASSD of 5.47 mm ± 12.13 mm for the biliary system. The 3D volumetric approach in Swin-UNETR 3D outperformed its 2D counterparts (TransUNet 2D, nnU-Net 2D, Swin-UNETR 2D) in most metrics, particularly for the duodenum, where it achieved the highest mean DSC and improved boundary localization compared to 2D models. 
The comparative analysis demonstrates the superiority of 3D volumetric models (Swin-UNETR 3D) over 2D models for accurate and stable abdominal multi-organ segmentation on 3D-nCTCP; automated segmentation reduces manual annotation time and may aid broader clinical adoption.

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

Feasibility of deep learning-accelerated HASTE-FS for pancreatic cystic lesion surveillance: comparison with conventional HASTE and MRCP.

Le L, Ginocchio LA, Kim S, Chandarana H, Lovett JT, Huang C

•papers•Nov 18 2025

Pancreatic cystic lesions (PCL) commonly undergo surveillance using MRI with MR cholangiopancreatography (MRCP). Our objective was to compare the performance of a single-shot fat-saturated T2-weighted technique with deep-learning reconstruction (DL HASTE-FS) to a conventional T2-weighted Half-Fourier Acquisition Single-shot Turbo spin-Echo (HASTE) sequence and to MRCP for the purpose of PCL detection, characterization, and surveillance. In this retrospective study, 91 consecutive patients underwent 3T abdominal MRI with MRCP protocol including DL HASTE-FS and conventional HASTE between 8/2/2023 and 10/3/2023. Three abdominal radiologists rated overall and lesion-specific image quality on a 5-point Likert scale, including pancreatic margin and duct sharpness, and PCL conspicuity. A subset of 70 preselected index PCLs was evaluated for cyst features, confidence of diagnosing side-branch IPMN (SB-IPMN), and suitability of DL HASTE-FS in replacing MRCP for PCL surveillance. DL HASTE-FS received higher scores for pancreatic duct border sharpness (4.1 vs. 3.9; p = .004), pancreatic duct visibility compared to MRCP (2.0 vs. 1.9; p = .04), cyst conspicuity (4.4 vs. 3.9; p < .001), and sharpness of cyst wall and internal septations (4.3 vs. 3.7; p < .001) compared to conventional HASTE. In contrast, conventional HASTE received higher scores for pancreatic margin sharpness (4.2 vs. 3.8; p < .001) and peripancreatic vessel clarity (4.2 vs. 3.4; p < .001). For the 70 preselected index PCLs, readers visualized more PCLs and had higher confidence in diagnosing SB-IPMN on DL HASTE-FS than on conventional HASTE (3.6 vs. 3.4; p < .001). Finally, DL HASTE-FS was deemed a suitable replacement for MRCP in more cases than conventional HASTE (83% vs. 48%; p < .001). DL HASTE-FS outperforms conventional HASTE for PCL detection and characterization, and is a suitable alternative to 3D MRCP in the context of PCL surveillance, potentially reducing exam time and cost.

MRI Reconstruction Abdominal Retrospective Clinical In Silico Academic Lab

Deep-Learning Virtual Superior Mesenteric Artery Modeling for Risk Stratification in Pancreas Surgery.

Mellado S, Vega EA, Yamane K, Salirrosas O, Chirban AM, Panettieri E, Moskal J, Kawano F, Alshammary S, Hatano E, Conrad C, Ogiso S

•papers•Nov 18 2025

Understanding the patient-specific anatomy of the superior mesenteric artery (SMA) and its branches is of critical importance when performing pancreatic surgery. This study assesses deep-learning-based virtual SMA modelling for three-dimensional (3D) visualization of the SMA's course and branching patterns. This model is then used to correlate anatomical features with intra-/postoperative outcomes. Preoperative computed tomography (CT) scans of 124 patients undergoing pancreatic resection for pancreatic malignancy at St. Elizabeth's Medical Center and Kyoto University were analyzed for course, branching, caliber, and aortic angle using deep learning modeling software. Following anatomic modelling, the SMA was divided into regions on the basis of its relationship to the pancreas: SMA1 (above pancreas), SMA2 (intrapancreatic), and SMA3 (below pancreas). Univariate and multivariate logistic and linear regression were used to compare anatomical measurements to perioperative outcomes. Differences in anatomic measurements were observed between the two populations. The mean caliber of SMA1, SMA2, and SMA3 was 7.05, 6.20, and 5.69 mm, respectively. A mean of 2.21 branches were observed in SMA2, and 4.52 in SMA3. Furthermore, fewer branches in SMA2 were associated with both postoperative pancreatic fistula (POPF) and Clavien-Dindo complication grade ≥ III. Finally, when stratified by minimally invasive approach, a greater distance between the superior border of the pancreas and the SMA was associated with POPF. This study shows that deep-learning-based virtual three-dimensional reconstruction of the SMA enables accurate assessment of the anatomical relationship between the pancreas and SMA. Specific anatomical features were found to be associated with intra- and postoperative outcomes. Therefore, SMA modeling not only contributes to improved preoperative planning and intraoperative navigation, but also to outcome prognostication.

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

Total-body [18F]FDG-PET/CT imaging of healthy volunteers with minimal effective dose.

Ferrara D, Gutschmayer S, Chalampalakis Z, Geist BK, Özer Ö, Pires M, Rausch I, Langsteger W, Beyer T

•papers•Nov 18 2025

High-sensitivity, total-body (TB) positron emission tomography (PET) and computed tomography (CT) imaging systems enable substantial reduction of injected radioactivity without compromising image quality. Synthetic CT-like attenuation maps can be generated from PET data via deep learning (DL) to further minimise subject radiation exposure. We explored combining TB-PET with DL-derived attenuation maps to minimise effective dose in healthy subjects undergoing TB-PET/CT imaging with [18F]Fluorodeoxyglucose ([18F]FDG). 47 healthy Caucasians (25 F/22 M, BMI: 24 ± 3 kg/m²) underwent TB-PET/CT imaging. After 6-hour fasting, subjects received low-dose CT (1 mSv) and (109 ± 7) MBq [18F]FDG, followed by a 62-minute dynamic PET acquisition (supine, arms down). PET data from 57 to 62 min were down-sampled to simulate reduced activities (50%, 25%, 10%, 5%). Effective doses (ED) were estimated for each activity level. Synthetic CTs (ED = 0 mSv) were generated from PET raw data (at all activity levels) and used to reconstruct attenuation-corrected PETs, which were compared to the original images. Organ-level segmentation enabled quantification of Standardized Uptake Values normalised to body weight (SUVbw) and coefficients of variation (CV). Across the cohort, organ-based SUVbw differences remained < 10% versus reference PET for simulated activities down to 10%. At 25% activity (~ 25 MBq, ED~ 0.45 mSv), PET quantification remained robust, though CV increased in skeletal muscle and fat. At 5% activity, SUVbw deviations exceeded 10% in several organs. Total-body [18F]FDG-PET/CT enables reliable organ-level quantification (%-differences < 10%) at injected activities as low as ~ 25 MBq. Such low-dose protocols may support the creation of reference datasets of healthy controls while minimising radiation exposure to subjects and staff.
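
The quantities compared in this abstract (SUVbw, percentage difference against a reference, and coefficient of variation) follow standard definitions, sketched below. The function names and numeric inputs are hypothetical. The sketch assumes that the activity concentration is already decay-corrected, that tissue density is 1 g/mL, and that CV uses the population standard deviation; none of these details are stated in the abstract.

```python
from statistics import mean, pstdev


def suv_bw(conc_kbq_per_ml, injected_mbq, weight_kg):
    """Body-weight SUV: tissue concentration / (injected activity / body weight)."""
    injected_kbq = injected_mbq * 1000.0
    weight_g = weight_kg * 1000.0
    return conc_kbq_per_ml / (injected_kbq / weight_g)


def percent_difference(value, reference):
    """Signed percentage difference of a value relative to a reference."""
    return 100.0 * (value - reference) / reference


def coefficient_of_variation(values):
    """CV in percent: population standard deviation divided by the mean."""
    return 100.0 * pstdev(values) / mean(values)


# Hypothetical organ measurement: 3.0 kBq/mL, 100 MBq injected, 70 kg subject.
reference_suv = suv_bw(3.0, 100.0, 70.0)
low_dose_suv = 2.0                      # hypothetical low-activity result
deviation = percent_difference(low_dose_suv, reference_suv)
within_tolerance = abs(deviation) < 10.0
```

The 10% threshold in the last line mirrors the tolerance used in the abstract when comparing reduced-activity reconstructions with the reference PET.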

Mixed Modality Image Synthesis Whole Body Retrospective Clinical In Silico Academic Lab

Analysis of deep learning-based segmentation of lymph nodes on full-dose and reduced-dose body CT.

Bloom LH, Mathai TS, Liu B, Khoury B, Patel N, Wei O, Balamuralikrishna PTS, Hou B, Solomon J, Pucar D, Samei E, Jones EC, Summers RM

•papers•Nov 18 2025

The performance of fully automated deep learning-based models for the detection and segmentation of lymph nodes (LNs) on full- and simulated reduced-dose CT was validated. A total of 15,341 LNs were annotated in 151 patient CTs (age 52 ± 14 years, 87 males) from the public TCIA NIH CT Lymph Nodes dataset. Two 3D nnU-Net models were trained on 90 CT scans: (1) only full dose CTs (NoAugmentation), and (2) both full- and reduced-dose CTs (Augmentation). Dose reduction from 75% to 5% of the full-dose was simulated using a noise-addition tool. Performance was validated on the remaining 61 CTs and an external TCIA Mediastinal LNQ dataset (120 CTs, 64 females). On 61 full-dose CTs, the Augmentation model detected all LNs with 67.3% precision and 84.6% sensitivity. For all LNs and large nodes (short axis diameter ≥ 8 mm), Dice Similarity Coefficient (DSC) was 0.83 ± 0.07 and 0.80 ± 0.14, while Hausdorff Distance (HD) error was 1.47 ± 0.91 mm and 3.2 ± 2.28 mm, respectively. Performance decreased with dose reduction (p < 0.01), reaching 73.8% detection sensitivity and 0.75 DSC at 5% dose. Statistically significant differences between Augmentation vs. NoAugmentation models were seen for all nodes (p < 0.001) and small nodes (p < 0.05) at 10% and 5% doses. On the external LNQ dataset, the Augmentation model attained a DSC of 0.76 ± 0.12 and HD of 4.7 ± 3.23 (p < 0.01) for all LNs. Degraded image quality impacted nodal delineation on reduced-dose CT. Performance improved when a model trained on both full- and reduced-dose CTs was used.
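
The detection figures above (precision and sensitivity) derive from counts of true positives, false positives, and false negatives. A minimal sketch follows; the function names and counts are hypothetical and were not taken from the study.

```python
def precision(tp, fp):
    """Fraction of predicted lymph nodes that are real: TP / (TP + FP)."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Fraction of real lymph nodes that were found: TP / (TP + FN)."""
    return tp / (tp + fn)


# Hypothetical counts for one test set.
true_pos, false_pos, false_neg = 84, 40, 16

det_precision = precision(true_pos, false_pos)      # 84 of 124 detections
det_sensitivity = sensitivity(true_pos, false_neg)  # 84 of 100 nodes
```

A model with high sensitivity but moderate precision, as reported here, finds most nodes at the cost of extra false-positive detections that a reader would need to dismiss.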

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA
