Latest Papers on Radiology AI.

Synthetic data trained open-source language models are feasible alternatives to proprietary models for radiology reporting.

Pandita A, Keniston A, Madhuripan N

•papers•Jul 23 2025

The study assessed the feasibility of using synthetic data to fine-tune various open-source LLMs for free-text-to-structured-data conversion in radiology, comparing their performance with GPT models. A training set of 3000 synthetic thyroid nodule dictations was generated to train six open-source models (Starcoderbase-1B, Starcoderbase-3B, Mistral-7B, Llama-3-8B, Llama-2-13B, and Yi-34B). The ACR TI-RADS template was the target model output. Model performance was tested on 50 thyroid nodule dictations from the MIMIC-III patient dataset and compared against the 0-shot, 1-shot, and 5-shot performance of GPT-3.5 and GPT-4. GPT-4 5-shot and Yi-34B showed the highest performance, with no statistically significant difference between the two. Various open models outperformed GPT models with statistical significance. Overall, models trained with synthetic data showed performance comparable to GPT models in structured text conversion in our study. Given their privacy-preserving advantages, open LLMs can be used as a viable alternative to proprietary GPT models.

Ultrasound LLM Radiology Report Abdominal Methodology In Silico Academic Lab GenAI Open Code
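
The 0-shot, 1-shot, and 5-shot comparison above refers to how many worked examples are placed in the prompt before the new dictation. A minimal sketch of that idea follows; the function name, instruction wording, and example pairs are hypothetical placeholders for illustration and are not taken from the paper.

```python
def build_prompt(dictation, examples, k):
    """Assemble a k-shot prompt: instruction, k worked examples, then the new dictation."""
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of available examples")
    # Hypothetical instruction text; the study's actual prompt is not given in the abstract.
    parts = ["Convert the thyroid nodule dictation into the structured template."]
    for source, target in examples[:k]:
        parts.append(f"Dictation: {source}\nTemplate: {target}")
    # The final block is left open so the model completes the template.
    parts.append(f"Dictation: {dictation}\nTemplate:")
    return "\n\n".join(parts)


# Invented placeholder pairs, for illustration only.
examples = [
    ("Solid hypoechoic nodule with smooth margins.",
     "Composition: solid; Echogenicity: hypoechoic; Margin: smooth"),
    ("Cystic anechoic nodule, wider than tall.",
     "Composition: cystic; Echogenicity: anechoic; Shape: wider-than-tall"),
]

zero_shot = build_prompt("Mixed cystic and solid nodule.", examples, k=0)
one_shot = build_prompt("Mixed cystic and solid nodule.", examples, k=1)
```

With k=0 the prompt contains only the instruction and the new dictation; each additional shot adds one dictation/template pair ahead of it.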

CT-based intratumoral and peritumoral radiomics to predict the treatment response to hepatic arterial infusion chemotherapy plus lenvatinib and PD-1 in high-risk hepatocellular carcinoma cases: a multi-center study.

Liu Z, Li X, Huang Y, Chang X, Zhang H, Wu X, Diao Y, He F, Sun J, Feng B, Liang H

•papers•Jul 23 2025

Noninvasive and precise tools for treatment response estimation in patients with high-risk hepatocellular carcinoma (HCC) who could benefit from hepatic arterial infusion chemotherapy (HAIC) plus lenvatinib and humanized programmed death receptor-1 inhibitors (PD-1) (HAIC-LEN-PD1) are lacking. This study aimed to evaluate the predictive potential of intratumoral and peritumoral radiomics for preoperative treatment response assessment to HAIC-LEN-PD1 in high-risk HCC cases. A total of 630 high-risk HCC cases administered HAIC-LEN-PD1 at three institutions were retrospectively identified and assigned to training, validation and external test sets. A total of 1834 radiomic features were obtained from the intratumoral and peritumoral regions, and radiomics models were established using five classifiers. Based on the optimal model, a nomogram was developed and evaluated using areas under the curves (AUCs), calibration curves and decision curve analysis (DCA). Overall survival (OS) and progression-free survival (PFS) were assessed by Kaplan-Meier curves. The Intratumoral + Peritumoral 10 mm (Intra + Peri10) radiomics models were superior to the intratumoral and peritumoral models, with AUCs of 0.919 (95%CI 0.889-0.949) in the training set, 0.874 (95%CI 0.812-0.936) in the validation set and 0.893 (95%CI 0.839-0.948) in the external test sets. The nomogram had good calibration ability and clinical value, with AUCs of 0.936 (95%CI 0.907-0.965) in the training set, 0.878 (95%CI 0.916-0.940) in the validation set and 0.902 (95%CI 0.848-0.957) in the external test sets. The Kaplan-Meier analysis showed that high-score patients had significantly shorter OS and PFS than low-score patients (median OS: 11.7 vs. 29.6 months, the whole set, p < 0.001; median PFS: 6.0 vs. 12.0 months, the whole set, p < 0.001). The Intra + Peri10 model can effectively predict the treatment response of high-risk HCC cases administered HAIC-LEN-PD1. The nomogram could provide an effective tool for evaluating treatment response and for risk stratification.

CT Classification Abdominal Retrospective Clinical In Silico

Hierarchical Diffusion Framework for Pseudo-Healthy Brain MRI Inpainting with Enhanced 3D Consistency

Dou Hoon Kwark, Shirui Luo, Xiyue Zhu, Yudu Li, Zhi-Pei Liang, Volodymyr Kindratenko

•preprint•Jul 23 2025

Pseudo-healthy image inpainting is an essential preprocessing step for analyzing pathological brain MRI scans. Most current inpainting methods favor slice-wise 2D models for their high in-plane fidelity, but their independence across slices produces discontinuities in the volume. Fully 3D models alleviate this issue, but their high model capacity demands extensive training data for reliable, high-fidelity synthesis -- often impractical in medical settings. We address these limitations with a hierarchical diffusion framework by replacing direct 3D modeling with two perpendicular coarse-to-fine 2D stages. An axial diffusion model first yields a coarse, globally consistent inpainting; a coronal diffusion model then refines anatomical details. By combining perpendicular spatial views with adaptive resampling, our method balances data efficiency and volumetric consistency. Our experiments show our approach outperforms state-of-the-art baselines in both realism and volumetric consistency, making it a promising solution for pseudo-healthy image inpainting. Code is available at https://github.com/dou0000/3dMRI-Consistent-Inpaint.

MRI Image Synthesis Neurological Methodology In Silico Academic Lab Open Code

Benchmarking of Deep Learning Methods for Generic MRI Multi-Organ Abdominal Segmentation

Deepa Krishnaswamy, Cosmin Ciausu, Steve Pieper, Ron Kikinis, Benjamin Billot, Andrey Fedorov

•preprint•Jul 23 2025

Recent advances in deep learning have led to robust automated tools for segmentation of abdominal computed tomography (CT). Meanwhile, segmentation of magnetic resonance imaging (MRI) is substantially more challenging due to the inherent signal variability and the increased effort required for annotating training datasets. Hence, existing approaches are trained on limited sets of MRI sequences, which might limit their generalizability. To characterize the landscape of MRI abdominal segmentation tools, we present here a comprehensive benchmarking of the three state-of-the-art and open-source models: MRSegmentator, MRISegmentator-Abdomen, and TotalSegmentator MRI. Since these models are trained using labor-intensive manual annotation cycles, we also introduce and evaluate ABDSynth, a SynthSeg-based model purely trained on widely available CT segmentations (no real images). More generally, we assess accuracy and generalizability by leveraging three public datasets (not seen by any of the evaluated methods during their training), which span all major manufacturers, five MRI sequences, as well as a variety of subject conditions, voxel resolutions, and fields-of-view. Our results reveal that MRSegmentator achieves the best performance and is most generalizable. In contrast, ABDSynth yields slightly less accurate results, but its relaxed requirements in training data make it an alternative when the annotation budget is limited. The evaluation code and datasets are given for future benchmarking at https://github.com/deepakri201/AbdoBench, along with inference code and weights for ABDSynth.

MRI Segmentation Abdominal Methodology In Silico Academic Lab Benchmark SOTA Open Dataset Open Code

Artificial Intelligence for Detecting Pulmonary Embolisms via CT: A Workflow-oriented Implementation.

Abed S, Hergan K, Dörrenberg J, Brandstetter L, Lauschmann M

•papers•Jul 23 2025

Detecting Pulmonary Embolism (PE) is critical for effective patient care, and Artificial Intelligence (AI) has shown promise in supporting radiologists in this task. Integrating AI into radiology workflows requires not only evaluation of its diagnostic accuracy but also assessment of its acceptance among clinical staff. This study aims to evaluate the performance of an AI algorithm in detecting pulmonary embolisms (PEs) on contrast-enhanced computed tomography pulmonary angiograms (CTPAs) and to assess the level of acceptance of the algorithm among radiology department staff. This retrospective study analyzed anonymized computed tomography pulmonary angiography (CTPA) data from a university clinic. Surveys were conducted at three and nine months after the implementation of a commercially available AI algorithm designed to flag CTPA scans with suspected PE. A thoracic radiologist and a cardiac radiologist served as the reference standard for evaluating the performance of the algorithm. The AI analyzed 59 CTPA cases during the initial evaluation and 46 cases in the follow-up assessment. In the first evaluation, the AI algorithm demonstrated a sensitivity of 84.6% and a specificity of 94.3%. By the second evaluation, its performance had improved, achieving a sensitivity of 90.9% and a specificity of 96.7%. Radiologists' acceptance of the AI tool increased over time. Nevertheless, despite this growing acceptance, many radiologists expressed a preference for hiring an additional physician over adopting the AI solution if the costs were comparable. Our study demonstrated high sensitivity and specificity of the AI algorithm, with improved performance over time and a reduced rate of unanalyzed scans. These improvements likely reflect both algorithmic refinement and better data integration. Departmental feedback indicated growing user confidence and trust in the tool. However, many radiologists continued to prefer the addition of a resident over reliance on the algorithm. 
Overall, the AI showed promise as a supportive "second-look" tool in emergency radiology settings. The AI algorithm demonstrated diagnostic performance comparable to that reported in similar studies for detecting PE on CTPA, with both sensitivity and specificity showing improvement over time. Radiologists' acceptance of the algorithm increased throughout the study period, underscoring its potential as a complementary tool to physician expertise in clinical practice.

CT Detection Chest Retrospective Clinical Clinical Pilot Academic Lab
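
Sensitivity and specificity, as reported above, are ratios taken from a confusion matrix against the reference standard. The sketch below shows the standard definitions; the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying confusion matrix.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    sensitivity = tp / (tp + fn)  # share of reference-positive scans the algorithm flagged
    specificity = tn / (tn + fp)  # share of reference-negative scans the algorithm cleared
    return sensitivity, specificity


# Hypothetical counts, NOT the study's data.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=18, fp=2)
```

Scans that the algorithm fails to analyze fall outside these four counts, so the rate of unanalyzed scans has to be tracked separately.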

Deep Learning-Based Prediction of Microvascular Invasion and Survival Outcomes in Hepatocellular Carcinoma Using Dual-phase CT Imaging of Tumors and Lesser Omental Adipose: A Multicenter Study.

Miao S, Sun M, Li X, Wang M, Jiang Y, Liu Z, Wang Q, Ding X, Wang R

•papers•Jul 23 2025

Accurate preoperative prediction of microvascular invasion (MVI) in hepatocellular carcinoma (HCC) remains challenging. Current imaging biomarkers show limited predictive performance. This study aimed to develop a deep learning model based on preoperative multiphase CT images of tumors and lesser omental adipose tissue (LOAT) for predicting MVI status and to analyze associated survival outcomes. This retrospective study included pathologically confirmed HCC patients from two medical centers between 2016 and 2023. A dual-branch feature fusion model based on ResNet18 was constructed, which extracted fused features from dual-phase CT images of both tumors and LOAT. The model's performance was evaluated on both internal and external test sets. Logistic regression was used to identify independent predictors of MVI. Based on MVI status, patients in the training, internal test, and external test cohorts were stratified into high- and low-risk groups, and overall survival differences were analyzed. The model incorporating LOAT features outperformed the tumor-only model, achieving an AUC of 0.889 (95% CI: [0.882, 0.962], P=0.004) in the internal test set and 0.826 (95% CI: [0.793, 0.872], P=0.006) in the external test set. Both results surpassed the independent diagnoses of three radiologists (average AUC=0.772). Multivariate logistic regression confirmed that maximum tumor diameter and LOAT area were independent predictors of MVI. Further Cox regression analysis showed that MVI-positive patients had significantly increased mortality risks in both the internal test set (Hazard Ratio [HR]=2.246, 95% CI: [1.088, 4.637], P=0.029) and the external test set (HR=3.797, 95% CI: [1.262, 11.422], P=0.018). This study is the first to use a deep learning framework integrating LOAT and tumor imaging features, improving preoperative MVI risk stratification accuracy. The independent prognostic value of LOAT has been validated in multicenter cohorts, highlighting its potential to guide personalized surgical planning.

CT Classification Abdominal Retrospective Clinical In Silico Academic Lab

To Compare the Application Value of Different Deep Learning Models Based on CT in Predicting Visceral Pleural Invasion of Non-small Cell Lung Cancer: A Retrospective, Multicenter Study.

Zhu X, Yang Y, Yan C, Xie Z, Shi H, Ji H, He L, Yang T, Wang J

•papers•Jul 23 2025

Visceral pleural invasion (VPI) indicates poor prognosis in non-small cell lung cancer (NSCLC), and upgrades the T classification of NSCLC from T1 to T2 when present. This study aimed to develop and validate deep learning models for the accurate prediction of VPI in patients with NSCLC, and to compare the performance of two-dimensional (2D), three-dimensional (3D), and hybrid 3D models. This retrospective study included consecutive patients with pathologically confirmed lung tumors between June 2017 and September 2022. The clinical data and preoperative imaging features of these patients were investigated and their relationships with VPI were statistically compared. Elastic fiber staining analysis results were the gold standard for diagnosis of VPI. The data of non-VPI and VPI patients were randomly divided into training and validation cohorts at ratios of 8:2 and 6:4, respectively. The EfficientNet-B0_2D model and the Double-head Res2Net/_F6/_F24 models were constructed, optimized and verified using two convolutional neural network architectures, EfficientNet-B0 and Res2Net, respectively, by extracting the features of original CT images and combining specific clinical-CT features. The receiver operating characteristic curve, the area under the curve (AUC), and the confusion matrix were used to assess the diagnostic efficiency of the models. The DeLong test was used to compare performance between models. A total of 1931 patients with NSCLC were finally evaluated. By univariate analysis, 20 clinical-CT features were identified as risk predictors of VPI. Comparison of the diagnostic efficacy among the EfficientNet-B0_2D, Double-head Res2Net, Res2Net_F6, and Res2Net_F24 combined models revealed that the Double-head Res2Net_F6 model had the highest AUC of 0.941 among all models, followed by Double-head Res2Net (AUC=0.879), Double-head Res2Net_F24 (AUC=0.876), and EfficientNet-B0_2D (AUC=0.785). The three 3D-based models showed comparable predictive performance in the validation cohort and all outperformed the 2D model (EfficientNet-B0_2D, all P<0.05). It is feasible to predict VPI in NSCLC with predictive models based on deep learning, and the Double-head Res2Net_F6 model fused with six clinical-CT features showed the greatest diagnostic efficacy.

CT Classification Chest Retrospective Clinical In Silico Academic Lab

CAP-Net: Carotid Artery Plaque Segmentation System Based on Computed Tomography Angiography.

Luo X, Hu B, Zhou S, Wu Q, Geng C, Zhao L, Li Y, Di R, Pu J, Geng D, Yang L

•papers•Jul 23 2025

Diagnosis of carotid plaques from head and neck CT angiography (CTA) scans is typically time-consuming and labor-intensive, leading to limited studies and unsatisfactory results in this area. The objective of this study is to develop a deep-learning-based model for detection and segmentation of carotid plaques using CTA images. CTA images from 1061 patients (765 male; 296 female) with 4048 carotid plaques were included and split into a 75% training-validation set and a 25% independent test set. We built a workflow involving three modified deep learning networks (a plain U-Net for coarse artery segmentation, an Attention U-Net for fine artery segmentation, and a dual-channel-input ConvNeXt-based U-Net architecture for plaque segmentation), followed by post-processing to refine predictions and eliminate false positives. The models were trained on the training-validation set using five-fold cross-validation and further evaluated on the independent test set using comprehensive metrics for segmentation and plaque detection. The proposed workflow was evaluated in the independent test set (261 patients with 902 carotid plaques) and achieved a mean Dice similarity coefficient (DSC) of 0.91±0.04 in artery segmentation, and 0.75±0.14/0.67±0.15 in plaque segmentation per artery/patient. The model detected 95.5% (861/902) of plaques, including 96.6% (423/438), 95.3% (307/322), and 92.3% (131/142) of calcified, mixed, and soft plaques, with less than one (0.63±0.93) false positive plaque per patient on average. This study developed CAP-Net, an automatic deep-learning-based detection and segmentation system for carotid plaques using CTA, which yielded promising results in identifying and delineating plaques.

CT Segmentation Vascular Retrospective Clinical In Silico Academic Lab
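
For readers unfamiliar with the metrics above, the following sketch shows the standard Dice similarity coefficient on binary masks and recomputes the detection percentages from the counts reported in the abstract. The function names and the empty-mask convention are assumptions made for illustration; this is not the authors' evaluation code.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def detection_rate(detected, total):
    """Percentage of plaques detected, rounded to one decimal place."""
    return round(100.0 * detected / total, 1)


# (detected, total) per plaque type, as reported in the abstract.
reported = {"calcified": (423, 438), "mixed": (307, 322), "soft": (131, 142)}
rates = {name: detection_rate(d, t) for name, (d, t) in reported.items()}
overall = detection_rate(
    sum(d for d, _ in reported.values()),
    sum(t for _, t in reported.values()),
)
```

The per-type counts sum to 861 detected out of 902 plaques, so the overall rate and the three subtype rates in the abstract are mutually consistent.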

VGS-ATD: Robust Distributed Learning for Multi-Label Medical Image Classification Under Heterogeneous and Imbalanced Conditions

Zehui Zhao, Laith Alzubaidi, Haider A. Alwzwazy, Jinglan Zhang, Yuantong Gu

•preprint•Jul 23 2025

In recent years, advanced deep learning architectures have shown strong performance in medical imaging tasks. However, the traditional centralized learning paradigm poses serious privacy risks as all data is collected and trained on a single server. To mitigate this challenge, decentralized approaches such as federated learning and swarm learning have emerged, allowing model training on local nodes while sharing only model weights. While these methods enhance privacy, they struggle with heterogeneous and imbalanced data and suffer from inefficiencies due to frequent communication and the aggregation of weights. More critically, the dynamic and complex nature of clinical environments demands scalable AI systems capable of continuously learning from diverse modalities and multiple labels. Yet, both centralized and decentralized models are prone to catastrophic forgetting during system expansion, often requiring full model retraining to incorporate new data. To address these limitations, we propose VGS-ATD, a novel distributed learning framework. In experiments spanning 30 datasets and 80 independent labels across distributed nodes, VGS-ATD achieved an overall accuracy of 92.7%, outperforming centralized learning (84.9%) and swarm learning (72.99%), while federated learning failed under these conditions due to its high computational resource requirements. VGS-ATD also demonstrated strong scalability, with only a 1% drop in accuracy on existing nodes after expansion, compared to a 20% drop in centralized learning, highlighting its resilience to catastrophic forgetting. Additionally, it reduced computational costs by up to 50% relative to both centralized and swarm learning, confirming its superior efficiency and scalability.

Mixed Modality Classification Methodology In Silico Academic Lab Benchmark SOTA
