
A 3D deep learning model based on MRI for predicting lymphovascular invasion in rectal cancer.

Wang T, Chen C, Liu C, Li S, Wang P, Yin D, Liu Y

pubmed logopapersMay 20 2025
The assessment of lymphovascular invasion (LVI) is crucial in the management of rectal cancer; however, accurately evaluating LVI preoperatively with imaging remains challenging. Recent advances in radiomics have created opportunities for developing more accurate diagnostic tools. This study aimed to develop and validate a deep learning model for predicting LVI in rectal cancer patients using preoperative MR imaging. The included cases were randomly divided into a training cohort (n = 233) and a validation cohort (n = 101) at a ratio of 7:3. Based on the pathological reports, patients were classified into positive and negative groups according to LVI status. From the preoperative axial T2WI images, regions of interest (ROIs) were defined as the tumor itself and as the tumor edges extended outward by 5, 10, 15, and 20 pixels. 2D and 3D deep learning features were extracted using the DenseNet121 architecture, and ten deep learning models were constructed, five in each group: GTV (the tumor itself), GPTV5 (the tumor plus a 5-pixel margin), GPTV10, GPTV15, and GPTV20. To assess model performance, we computed the area under the curve (AUC) and used the DeLong test to compare models and identify the optimal one for predicting LVI in rectal cancer. Among the 2D models, 2D GPTV10 performed best, with an AUC of 0.891 (95% confidence interval [CI] 0.850-0.933) in the training cohort and 0.841 (95% CI 0.767-0.915) in the validation cohort; the differences in AUC between this model and the other 2D models were not statistically significant (DeLong test, p > 0.05). Among the 3D models, 3D GPTV10 had the highest AUC, with 0.961 (95% CI 0.940-0.982) in the training cohort and 0.928 (95% CI 0.881-0.976) in the validation cohort.
The DeLong test demonstrated that the 3D GPTV10 model outperformed the other 3D models as well as the 2D GPTV10 model (p < 0.05). The study developed a deep learning model, 3D GPTV10, that uses preoperative MRI data to accurately predict the presence of LVI in rectal cancer patients. By training on a region of interest comprising the tumor and a surrounding 10-pixel margin, the model achieved superior performance compared with the other deep learning models. These findings can help clinicians formulate personalized treatment plans for rectal cancer patients.
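
The margin-expansion idea behind the GPTV ROIs can be sketched with a plain numpy dilation. The paper does not specify its dilation kernel, so the 4-connected version below is an illustrative assumption:

```python
import numpy as np

def dilate(mask, k):
    """Dilate a binary mask by k pixels (4-connected neighborhood),
    a stand-in for expanding the tumor ROI outward as in the GPTV
    definitions (GPTV5 = tumor + 5-pixel margin, etc.)."""
    m = mask.astype(bool)
    for _ in range(k):
        grown = m.copy()
        grown[1:, :] |= m[:-1, :]   # spread down
        grown[:-1, :] |= m[1:, :]   # spread up
        grown[:, 1:] |= m[:, :-1]   # spread right
        grown[:, :-1] |= m[:, 1:]   # spread left
        m = grown
    return m

# Toy 7x7 mask with a single "tumor" pixel at the center.
mask = np.zeros((7, 7), dtype=bool)
mask[3, 3] = True
gptv2 = dilate(mask, 2)  # tumor plus a 2-pixel margin
```

With a 4-connected kernel, two dilations of a single pixel produce a diamond of 13 pixels; a full GPTV10 ROI would be `dilate(tumor_mask, 10)` restricted to the image bounds.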

CONSIGN: Conformal Segmentation Informed by Spatial Groupings via Decomposition

Bruno Viti, Elias Karabelas, Martin Holler

arxiv logopreprintMay 20 2025
Most machine learning-based image segmentation models produce pixel-wise confidence scores - typically derived from softmax outputs - that represent the model's predicted probability for each class label at every pixel. While this information can be particularly valuable in high-stakes domains such as medical imaging, these (uncalibrated) scores are heuristic in nature and do not constitute rigorous quantitative uncertainty estimates. Conformal prediction (CP) provides a principled framework for transforming heuristic confidence scores into statistically valid uncertainty estimates. However, applying CP directly to image segmentation ignores the spatial correlations between pixels, a fundamental characteristic of image data. This can result in overly conservative and less interpretable uncertainty estimates. To address this, we propose CONSIGN (Conformal Segmentation Informed by Spatial Groupings via Decomposition), a CP-based method that incorporates spatial correlations to improve uncertainty quantification in image segmentation. Our method generates meaningful prediction sets that come with user-specified, high-probability error guarantees. It is compatible with any pre-trained segmentation model capable of generating multiple sample outputs - such as those using dropout, Bayesian modeling, or ensembles. We evaluate CONSIGN against a standard pixel-wise CP approach across three medical imaging datasets and two COCO dataset subsets, using three different pre-trained segmentation models. Results demonstrate that accounting for spatial structure significantly improves performance across multiple metrics and enhances the quality of uncertainty estimates.
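
As a rough illustration of the pixel-wise split-conformal baseline that CONSIGN improves on, the sketch below calibrates a score threshold and builds per-pixel label sets. All sizes and distributions are toy assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calibration data: n_cal pixels, C classes, softmax-like scores.
n_cal, C, alpha = 500, 3, 0.1
probs = rng.dirichlet(np.ones(C) * 5, size=n_cal)   # heuristic confidence scores
labels = rng.integers(0, C, size=n_cal)             # true pixel labels

# Nonconformity score: 1 - probability assigned to the true class.
scores = 1.0 - probs[np.arange(n_cal), labels]

# Finite-sample-corrected conformal quantile.
scores_sorted = np.sort(scores)
idx = int(np.ceil((n_cal + 1) * (1 - alpha))) - 1
q = scores_sorted[min(idx, n_cal - 1)]

# Pixel-wise prediction set for a new pixel: every label whose score is <= q.
new_probs = rng.dirichlet(np.ones(C) * 5)
pred_set = np.where(1.0 - new_probs <= q)[0]
```

This per-pixel treatment is exactly what ignores spatial correlation: each pixel's set is computed independently, which is why sets can become overly conservative on structured images.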

XDementNET: An Explainable Attention Based Deep Convolutional Network to Detect Alzheimer Progression from MRI data

Soyabul Islam Lincoln, Mirza Mohd Shahriar Maswood

arxiv logopreprintMay 20 2025
Alzheimer's disease, a common neurodegenerative disease, requires precise diagnosis and efficient treatment, particularly in light of escalating healthcare expenses and the expanding use of artificial intelligence in medical diagnostics. Many recent studies show that combining brain Magnetic Resonance Imaging (MRI) with deep neural networks achieves promising results for diagnosing AD. This paper introduces a novel deep convolutional architecture that incorporates multi-residual blocks, specialized spatial attention blocks, grouped query attention, and multi-head attention. The study assessed the model's performance on four publicly accessible datasets and concentrated on binary and multiclass classification problems across various categories. The paper also considers the explainability of AD progression, comparing state-of-the-art methods, namely Gradient Class Activation Mapping (GradCAM), Score-CAM, Faster Score-CAM, and XGRADCAM. Our methodology consistently outperforms current approaches, achieving 99.66% accuracy in 4-class classification, 99.63% in 3-class classification, and 100% in binary classification on Kaggle datasets. For the Open Access Series of Imaging Studies (OASIS) datasets, the accuracies are 99.92%, 99.90%, and 99.95%, respectively. The Alzheimer's Disease Neuroimaging Initiative-1 (ADNI-1) dataset was used for experiments in three planes (axial, sagittal, and coronal) and in a combination of all planes. The study achieved accuracies of 99.08% for the axial plane, 99.85% for sagittal, 99.5% for coronal, and 99.17% for all planes, and 97.79% and 8.60%, respectively, for ADNI-2. The network's excellent accuracy in categorizing AD stages demonstrates its ability to retrieve important information from MRI images.
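
Grouped query attention, one of the building blocks listed above, shares each key/value head among several query heads to cut memory and compute. A minimal numpy sketch (the shapes and weights are illustrative, not XDementNET's):

```python
import numpy as np

def grouped_query_attention(x, Wq, Wk, Wv, n_q_heads, n_kv_heads):
    """Grouped query attention: n_q_heads query heads share n_kv_heads
    key/value heads (n_q_heads must be a multiple of n_kv_heads)."""
    T, d = x.shape
    hd = d // n_q_heads                        # per-head dimension
    q = (x @ Wq).reshape(T, n_q_heads, hd)
    k = (x @ Wk).reshape(T, n_kv_heads, hd)
    v = (x @ Wv).reshape(T, n_kv_heads, hd)
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group                        # kv head used by query head h
        att = q[:, h] @ k[:, kv].T / np.sqrt(hd)
        att = np.exp(att - att.max(axis=-1, keepdims=True))
        att /= att.sum(axis=-1, keepdims=True)  # row-wise softmax
        out[:, h] = att @ v[:, kv]
    return out.reshape(T, d)

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 8))                # 4 tokens, model dim 8
Wq = rng.standard_normal((8, 8)) * 0.1         # 4 query heads of dim 2
Wk = rng.standard_normal((8, 4)) * 0.1         # 2 kv heads of dim 2
Wv = rng.standard_normal((8, 4)) * 0.1
y = grouped_query_attention(x, Wq, Wk, Wv, n_q_heads=4, n_kv_heads=2)
```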

Automated Fetal Biometry Assessment with Deep Ensembles using Sparse-Sampling of 2D Intrapartum Ultrasound Images

Jayroop Ramesh, Valentin Bacher, Mark C. Eid, Hoda Kalabizadeh, Christian Rupprecht, Ana IL Namburete, Pak-Hei Yeung, Madeleine K. Wyburd, Nicola K. Dinsdale

arxiv logopreprintMay 20 2025
The International Society of Ultrasound in Obstetrics and Gynecology (ISUOG) advocates intrapartum ultrasound (US) imaging to monitor labour progression through changes in fetal head position. Two reliable ultrasound-derived parameters used to predict outcomes of instrumental vaginal delivery are the angle of progression (AoP) and head-symphysis distance (HSD). In this work, as part of the Intrapartum Ultrasound Grand Challenge (IUGC) 2024, we propose an automated fetal biometry measurement pipeline to reduce intra- and inter-observer variability and improve measurement reliability. Our pipeline consists of three key tasks: (i) classification of standard planes (SP) from US videos, (ii) segmentation of the fetal head and pubic symphysis from the detected SPs, and (iii) computation of the AoP and HSD from the segmented regions. We perform sparse sampling to mitigate class imbalance and reduce spurious correlations in task (i), and utilize ensemble-based deep learning methods for tasks (i) and (ii) to enhance generalizability under different US acquisition settings. Finally, to promote robustness in task (iii) with respect to the structural fidelity of measurements, we retain the largest connected components and apply ellipse fitting to the segmentations. Our solution achieved ACC: 0.9452, F1: 0.9225, AUC: 0.983, MCC: 0.8361, DSC: 0.918, HD: 19.73, ASD: 5.71, $\Delta_{AoP}$: 8.90 and $\Delta_{HSD}$: 14.35 across an unseen hold-out set of 4 patients and 224 US frames. The results from the proposed automated pipeline can improve the understanding of labour arrest causes and guide the development of clinical risk stratification tools for efficient and effective prenatal care.
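
The "retain the largest connected component" post-processing step can be sketched with a simple BFS labeling. This is a pure-Python stand-in; the authors' exact implementation (e.g., connectivity choice) is not specified:

```python
import numpy as np
from collections import deque

def largest_component(mask):
    """Keep only the largest 4-connected component of a 2D binary mask,
    discarding small spurious segmentation fragments."""
    mask = mask.astype(bool)
    seen = np.zeros_like(mask)
    best = np.zeros_like(mask)
    best_size = 0
    H, W = mask.shape
    for i in range(H):
        for j in range(W):
            if mask[i, j] and not seen[i, j]:
                comp, q = [], deque([(i, j)])   # BFS over one component
                seen[i, j] = True
                while q:
                    r, c = q.popleft()
                    comp.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < H and 0 <= cc < W and mask[rr, cc] and not seen[rr, cc]:
                            seen[rr, cc] = True
                            q.append((rr, cc))
                if len(comp) > best_size:       # keep the biggest blob
                    best_size = len(comp)
                    best = np.zeros_like(mask)
                    for r, c in comp:
                        best[r, c] = True
    return best

mask = np.zeros((5, 5), dtype=bool)
mask[1:3, 1:3] = True   # main 2x2 blob
mask[4, 4] = True       # isolated speck to be removed
cleaned = largest_component(mask)
```

In practice a library routine such as `scipy.ndimage.label` does the same labeling far faster; ellipse fitting would then be applied to the retained component.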

Blind Restoration of High-Resolution Ultrasound Video

Chu Chen, Kangning Cui, Pasquale Cascarano, Wei Tang, Elena Loli Piccolomini, Raymond H. Chan

arxiv logopreprintMay 20 2025
Ultrasound imaging is widely applied in clinical practice, yet ultrasound videos often suffer from low signal-to-noise ratios (SNR) and limited resolutions, posing challenges for diagnosis and analysis. Variations in equipment and acquisition settings can further exacerbate differences in data distribution and noise levels, reducing the generalizability of pre-trained models. This work presents a self-supervised ultrasound video super-resolution algorithm called Deep Ultrasound Prior (DUP). DUP employs a video-adaptive optimization process of a neural network that enhances the resolution of given ultrasound videos without requiring paired training data while simultaneously removing noise. Quantitative and visual evaluations demonstrate that DUP outperforms existing super-resolution algorithms, leading to substantial improvements for downstream applications.

CT-guided CBCT Multi-Organ Segmentation Using a Multi-Channel Conditional Consistency Diffusion Model for Lung Cancer Radiotherapy.

Chen X, Qiu RLJ, Pan S, Shelton J, Yang X, Kesarwala AH

pubmed logopapersMay 20 2025
In cone beam computed tomography (CBCT)-guided adaptive radiotherapy, rapid and precise segmentation of organs at risk (OARs) is essential for accurate dose verification and online replanning. The quality of CBCT images obtained with current onboard CBCT imagers and clinical imaging protocols, however, is often compromised by artifacts such as scatter and motion, particularly for thoracic CBCTs. These artifacts not only degrade image contrast but also obscure anatomical boundaries, making accurate segmentation on CBCT images significantly more challenging than on planning CT images. To address these persistent challenges, we propose a novel multi-channel conditional consistency diffusion model (MCCDM) for segmentation of OARs in thoracic CBCT images (CBCT-MCCDM), which harnesses its domain transfer capabilities to improve segmentation accuracy across imaging modalities. By jointly training the MCCDM with CT images and their corresponding masks, our framework enables an end-to-end mapping learning process that generates accurate segmentations of OARs. The CBCT-MCCDM was used to delineate the esophagus, heart, left and right lungs, and spinal cord on CBCT images from each patient with lung cancer. We quantitatively evaluated our approach by comparing model-generated contours with ground-truth contours from 33 patients with lung cancer treated with 5-fraction stereotactic body radiation therapy (SBRT), demonstrating its potential to enhance segmentation accuracy despite challenging CBCT artifacts. The proposed method was evaluated using the average Dice similarity coefficient (DSC), sensitivity, specificity, 95th percentile Hausdorff distance (HD95), and mean surface distance (MSD) for each of the five OARs. It achieved average DSC values of 0.82, 0.88, 0.95, 0.96, and 0.96 for the esophagus, heart, left lung, right lung, and spinal cord, respectively.
Sensitivity values were 0.813, 0.922, 0.956, 0.958, and 0.929, respectively, while specificity values were 0.991, 0.994, 0.996, 0.996, and 0.995, respectively. We compared the proposed method with two state-of-the-art methods, a CBCT-only method and U-Net, and demonstrated that the proposed CBCT-MCCDM outperformed both.
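
The overlap metrics reported per organ (DSC, sensitivity, specificity) reduce to confusion-matrix counts on binary masks; a minimal sketch:

```python
import numpy as np

def seg_metrics(pred, gt):
    """Dice similarity coefficient, sensitivity, and specificity for a
    predicted vs. ground-truth binary segmentation mask."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)      # correctly segmented organ voxels
    fp = np.sum(pred & ~gt)     # over-segmentation
    fn = np.sum(~pred & gt)     # missed organ voxels
    tn = np.sum(~pred & ~gt)    # correctly excluded background
    dsc = 2 * tp / (2 * tp + fp + fn)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return dsc, sens, spec

pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
dsc, sens, spec = seg_metrics(pred, gt)
```

HD95 and MSD additionally require surface-distance computations between contour boundaries, which are omitted here for brevity.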

Advanced feature fusion of radiomics and deep learning for accurate detection of wrist fractures on X-ray images.

Saadh MJ, Hussain QM, Albadr RJ, Doshi H, Rekha MM, Kundlas M, Pal A, Rizaev J, Taher WM, Alwan M, Jawad MJ, Al-Nuaimi AMA, Farhood B

pubmed logopapersMay 20 2025
The aim of this study was to develop a hybrid diagnostic framework integrating radiomic and deep features for accurate and reproducible detection and classification of wrist fractures using X-ray images. A total of 3,537 X-ray images, including 1,871 fracture and 1,666 non-fracture cases, were collected from three healthcare centers. Radiomic features were extracted using the PyRadiomics library, and deep features were derived from the bottleneck layer of an autoencoder. Both feature modalities underwent reliability assessment via Intraclass Correlation Coefficient (ICC) and cosine similarity. Feature selection methods, including ANOVA, Mutual Information (MI), Principal Component Analysis (PCA), and Recursive Feature Elimination (RFE), were applied to optimize the feature set. Classifiers such as XGBoost, CatBoost, Random Forest, and a Voting Classifier were used to evaluate diagnostic performance. The dataset was divided into training (70%) and testing (30%) sets, and metrics such as accuracy, sensitivity, and AUC-ROC were used for evaluation. The combined radiomic and deep feature approach consistently outperformed standalone methods. The Voting Classifier paired with MI achieved the highest performance, with a test accuracy of 95%, sensitivity of 94%, and AUC-ROC of 96%. The end-to-end model achieved competitive results with an accuracy of 93% and AUC-ROC of 94%. SHAP analysis and t-SNE visualizations confirmed the interpretability and robustness of the selected features. This hybrid framework demonstrates the potential for integrating radiomic and deep features to enhance diagnostic performance for wrist and forearm fractures, providing a reliable and interpretable solution suitable for clinical applications.
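
Feature reliability via the Intraclass Correlation Coefficient is described as a screening step above. A minimal ICC(2,1) in numpy (two-way random effects, absolute agreement, single measurement; the specific ICC variant used in the study is an assumption here):

```python
import numpy as np

def icc2_1(X):
    """ICC(2,1) for an (n_subjects, k_raters) matrix: a common choice
    for test-retest reliability of radiomic features."""
    n, k = X.shape
    grand = X.mean()
    row_m = X.mean(axis=1)                    # per-subject means
    col_m = X.mean(axis=0)                    # per-rater means
    ssr = k * np.sum((row_m - grand) ** 2)    # between-subject sum of squares
    ssc = n * np.sum((col_m - grand) ** 2)    # between-rater sum of squares
    sse = np.sum((X - row_m[:, None] - col_m[None, :] + grand) ** 2)
    msr = ssr / (n - 1)
    msc = ssc / (k - 1)
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Features with ICC below a chosen cutoff (often 0.75 or 0.9) would be dropped before feature selection.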

Intelligent health model for medical imaging to guide laymen using neural cellular automata.

Sharma SK, Chowdhary CL, Sharma VS, Rasool A, Khan AA

pubmed logopapersMay 20 2025
A layman in health systems is a person who has no knowledge of health data, i.e., X-rays, MRIs, CT scans, health examination reports, etc. The motivation behind the proposed invention is to help laymen make medical images understandable. The health model is trained using a neural network approach that analyses user health examination data, predicts the type and level of the disease, and advises precautions to the user. Cellular Automata (CA) technology has been integrated with the neural networks to segment the medical image. The CA analyzes the medical image pixel by pixel and generates a robust threshold value that helps to efficiently segment the image and accurately identify abnormal spots. The proposed method has been trained and evaluated on more than 10,000 medical images taken from various open datasets. Several text-analysis measures, i.e., BLEU, ROUGE, and WER, are used in the research to validate the produced report. BLEU and ROUGE quantify how close the generated text report is to the original report. The BLEU and ROUGE scores of the experimented images, approximately 0.62 and 0.90, indicate that the produced report is very close to the original report. The WER score of 0.14 indicates that the generated report contains the most relevant words. In summary, the proposed research provides a useful medical report with accurate disease identification and precautions for laymen.
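
The BLEU similarity used to validate the generated reports can be illustrated with a unigram-only simplification (full BLEU combines clipped precisions up to 4-grams; this sketch keeps only the unigram term plus the brevity penalty):

```python
from collections import Counter
import math

def unigram_bleu(candidate, reference):
    """Unigram BLEU: clipped unigram precision times the brevity
    penalty. A toy version of the report-vs-report similarity check."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Each candidate word counts at most as often as it appears in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = clipped / max(len(cand), 1)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * precision
```

For example, an exact match scores 1.0 and a report sharing no words with the reference scores 0.0, bracketing the ~0.62 BLEU reported above.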

Diagnostic value of fully automated CT pulmonary angiography in patients with chronic thromboembolic pulmonary hypertension and chronic thromboembolic disease.

Lin Y, Li M, Xie S

pubmed logopapersMay 20 2025
To evaluate the value of employing artificial intelligence (AI)-assisted CT pulmonary angiography (CTPA) for patients with chronic thromboembolic pulmonary hypertension (CTEPH) and chronic thromboembolic disease (CTED). A single-center, retrospective analysis of 350 sequential patients, comprising right heart catheterization (RHC)-confirmed CTEPH, CTED, and normal controls, was conducted. Parameters such as the main pulmonary artery diameter (MPAd), the ratio of MPA to ascending aorta diameter (MPAd/AAd), the ratio of right to left ventricle diameter (RVd/LVd), and the ratio of RV to LV volume (RVv/LVv) were evaluated using automated AI software and compared with manual analysis. Reliability was assessed through an intraclass correlation coefficient (ICC) analysis. Diagnostic accuracy was determined using receiver-operating characteristic (ROC) curves. Compared to the CTED and control groups, CTEPH patients were significantly more likely to have elevated automated CTPA metrics (all p < 0.001). Automated MPAd, MPAd/AAd, and RVv/LVv correlated strongly with mPAP (r = 0.952, 0.904, and 0.815, respectively, all p < 0.001). The automated and manual CTPA analyses showed strong concordance. For the CTEPH and CTED categories, the optimal area under the curve (AU-ROC) reached 0.939 (CI: 0.908-0.969). In the CTEPH and control groups, the best AU-ROC was 0.970 (CI: 0.953-0.988). In the CTED and control groups, the best AU-ROC was 0.782 (CI: 0.724-0.840). Automated AI-driven CTPA analysis provides a dependable approach for evaluating patients with CTEPH, CTED, and normal controls, demonstrating excellent consistency and efficiency. Question Guidelines do not advocate for applying treatment protocols for CTEPH to patients with CTED; early detection of the condition is crucial. Findings Automated CTPA analysis was feasible in 100% of patients, with good agreement, and would have added information for early detection and identification.
Clinical relevance Automated AI-driven CTPA analysis provides a reliable approach demonstrating excellent consistency and efficiency. Additionally, these noninvasive imaging findings may aid in treatment stratification and determining optimal intervention directed by RHC.
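
The AU-ROC values reported for group discrimination can be computed directly from the Mann-Whitney rank statistic; a minimal sketch applied to a single automated metric (the data here are toy values, not the study's):

```python
import numpy as np

def auc(scores_pos, scores_neg):
    """ROC AUC as the probability that a randomly chosen positive case
    (e.g., a CTEPH patient's MPAd) outscores a randomly chosen negative
    case, with ties counted as half."""
    pos = np.asarray(scores_pos, float)[:, None]
    neg = np.asarray(scores_neg, float)[None, :]
    return np.mean(pos > neg) + 0.5 * np.mean(pos == neg)
```

Perfectly separated groups give 1.0; identical score distributions give 0.5, which makes the reported 0.970 (CTEPH vs. control) versus 0.782 (CTED vs. control) gap easy to interpret.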

Mask of Truth: Model Sensitivity to Unexpected Regions of Medical Images.

Sourget T, Hestbek-Møller M, Jiménez-Sánchez A, Junchi Xu J, Cheplygina V

pubmed logopapersMay 20 2025
The development of larger models for medical image analysis has led to increased performance. However, it has also affected our ability to explain and validate model decisions. Models can use non-relevant parts of images, also called spurious correlations or shortcuts, to obtain high performance on benchmark datasets but fail in real-world scenarios. In this work, we challenge the capacity of convolutional neural networks (CNNs) to classify chest X-rays and eye fundus images while masking out clinically relevant parts of the image. We show that all models trained on the PadChest dataset, irrespective of the masking strategy, obtain an area under the curve (AUC) above random. Moreover, the models trained on full images perform well on images without the region of interest (ROI), even better than on images containing only the ROI. We also reveal a possible spurious correlation in the Chákṣu dataset, although its performances are more aligned with the expectations for an unbiased model. We go beyond performance analysis by using the explainability method SHAP and analyzing embeddings. We also asked a radiology resident to interpret chest X-rays under the different maskings to complement our findings with clinical knowledge.
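
The two masking strategies (remove the clinically relevant region, or keep only it) can be sketched for a rectangular ROI; the study's masks follow anatomical regions, so the axis-aligned box below is a simplification:

```python
import numpy as np

def mask_roi(image, box, keep_roi=False):
    """Zero out either the ROI (box) or everything outside it.
    box = (r0, r1, c0, c1) in row/column pixel coordinates."""
    r0, r1, c0, c1 = box
    if keep_roi:
        # Keep only the ROI; background is zeroed.
        keep = np.zeros_like(image)
        keep[r0:r1, c0:c1] = image[r0:r1, c0:c1]
        return keep
    # Remove the ROI; background is preserved.
    out = image.copy()
    out[r0:r1, c0:c1] = 0
    return out

image = np.ones((4, 4))
without_roi = mask_roi(image, (1, 3, 1, 3))
only_roi = mask_roi(image, (1, 3, 1, 3), keep_roi=True)
```

A model scoring above-random AUC on `without_roi`-style inputs is evidence that it exploits information outside the clinically relevant region.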