Page 42 of 3463455 results

RegGAN-based contrast-free CT enhances esophageal cancer assessment: multicenter validation of automated tumor segmentation and T-staging.

Huang X, Li W, Wang Y, Wu Q, Li P, Xu K, Huang Y

pubmed logopapers · Sep 2 2025
This study aimed to develop a deep learning (DL) framework using registration-guided generative adversarial networks (RegGAN) to synthesize contrast-enhanced CT (Syn-CECT) from non-contrast CT (NCCT), enabling iodine-free esophageal cancer (EC) T-staging. A retrospective multicenter analysis included 1,092 EC patients (2013-2024) divided into training (N = 313), internal (N = 117), and external test cohorts (N = 116 and N = 546). RegGAN synthesized Syn-CECT by integrating registration and adversarial training to address NCCT-CECT misalignment. Tumor segmentation used CSSNet with hierarchical feature fusion, while T-staging employed a dual-path DL model combining radiomic features (from NCCT/Syn-CECT) and Vision Transformer-derived deep features. Performance was validated via quantitative metrics (NMAE, PSNR, SSIM), Dice scores, AUC, and reader studies comparing six clinicians with/without model assistance. RegGAN achieved Syn-CECT quality comparable to real CECT (NMAE = 0.1903, SSIM = 0.7723; visual scores: p ≥ 0.12). CSSNet produced accurate tumor segmentation (Dice = 0.89, 95% HD = 2.27 in external tests). The DL staging model outperformed conventional machine learning classifiers (AUC = 0.7893-0.8360 vs. ≤ 0.8323), surpassing early-career clinicians (AUC = 0.641-0.757) and matching experts (AUC = 0.840). Syn-CECT-assisted clinicians improved diagnostic accuracy (AUC increase: ~ 0.1, p < 0.01), with decision curve analysis confirming clinical utility at risk thresholds above 35%. The RegGAN-based framework eliminates contrast agents while maintaining diagnostic accuracy for EC segmentation (Dice > 0.88) and T-staging (AUC > 0.78). It offers a safe, cost-effective alternative for patients with iodine allergies or renal impairment and enhances diagnostic consistency across clinician experience levels. This approach addresses limitations of invasive staging and repeated contrast exposure, demonstrating transformative potential for resource-limited settings.
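The image-similarity metrics reported above (NMAE, PSNR) can be illustrated with a minimal sketch. This is a generic implementation for intuition, not the authors' code; note that NMAE normalization conventions vary (here, the dynamic range of the reference image is assumed), and the intensity values are invented.

```python
import math

def nmae(real, synth):
    """Normalized mean absolute error between two images (flat lists of
    intensities), normalized here by the dynamic range of the real image."""
    rng = max(real) - min(real)
    mae = sum(abs(r - s) for r, s in zip(real, synth)) / len(real)
    return mae / rng

def psnr(real, synth, max_val=None):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    if max_val is None:
        max_val = max(real)
    mse = sum((r - s) ** 2 for r, s in zip(real, synth)) / len(real)
    return float("inf") if mse == 0 else 10 * math.log10(max_val ** 2 / mse)

real = [0, 50, 100, 150, 200]    # hypothetical CECT intensities
synth = [5, 45, 110, 140, 205]   # hypothetical Syn-CECT intensities
print(round(nmae(real, synth), 3))
print(round(psnr(real, synth), 1))
```

A real evaluation would run these per-slice over registered volumes; the idea is the same.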

Overcoming Site Variability in Multisite fMRI Studies: an Autoencoder Framework for Enhanced Generalizability of Machine Learning Models.

Almuqhim F, Saeed F

pubmed logopapers · Sep 2 2025
Harmonizing multisite functional magnetic resonance imaging (fMRI) data is crucial for eliminating site-specific variability that hinders the generalizability of machine learning models. Traditional harmonization techniques, such as ComBat, depend on additive and multiplicative factors and may struggle to capture the non-linear interactions between scanner hardware, acquisition protocols, and signal variations across imaging sites. In addition, these statistical techniques require data from all sites during model training, which can introduce data leakage into ML models trained on the harmonized data. Such models may show low reliability and reproducibility when tested on unseen data sets, limiting their applicability for general clinical usage. In this study, we propose Autoencoders (AEs) as an alternative for harmonizing multisite fMRI data. Our framework leverages the non-linear representation learning capabilities of AEs to reduce site-specific effects while preserving biologically meaningful features. Our evaluation using the Autism Brain Imaging Data Exchange I (ABIDE-I) dataset, containing 1,035 subjects collected from 17 centers, demonstrates statistically significant improvements in leave-one-site-out (LOSO) cross-validation evaluations. All AE variants (AE, SAE, TAE, and DAE) significantly outperformed the baseline model (p < 0.01), with mean accuracy improvements ranging from 3.41% to 5.04%. Our findings demonstrate the potential of AEs to effectively harmonize multisite neuroimaging data, enabling robust downstream analyses across various neuroscience applications while reducing data leakage and preserving neurobiological features. Our open-source code is made available at https://github.com/pcdslab/Autoencoder-fMRI-Harmonization .
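The leave-one-site-out (LOSO) evaluation described above can be sketched as a simple split generator: each fold holds out all subjects from one site, so the test site is never seen during training. This is an illustrative implementation, not the authors' code, and the site labels are hypothetical.

```python
def loso_splits(site_labels):
    """Leave-one-site-out splits: for each site, yield (held-out site,
    training indices from all other sites, test indices from that site)."""
    sites = sorted(set(site_labels))
    for held_out in sites:
        train = [i for i, s in enumerate(site_labels) if s != held_out]
        test = [i for i, s in enumerate(site_labels) if s == held_out]
        yield held_out, train, test

# Hypothetical per-subject acquisition sites
sites = ["NYU", "NYU", "UCLA", "KKI", "UCLA", "KKI"]
for site, train_idx, test_idx in loso_splits(sites):
    print(site, train_idx, test_idx)
```

Because every fold's test data comes from an entirely unseen site, LOSO directly measures cross-site generalization, which is the failure mode harmonization is meant to address.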

3D MR Neurography of Craniocervical Nerves: Comparing Double-Echo Steady-State and Postcontrast STIR with Deep Learning-Based Reconstruction at 1.5T.

Ensle F, Zecca F, Kerber B, Lohezic M, Wen Y, Kroschke J, Pawlus K, Guggenberger R

pubmed logopapers · Sep 2 2025
3D MR neurography is a useful diagnostic tool in head and neck disorders, but neurographic imaging remains challenging in this region. Optimal sequences for nerve visualization have not yet been established and may also differ between nerves. While deep learning (DL) reconstruction can enhance nerve depiction, particularly at 1.5T, studies in the head and neck are lacking. The purpose of this study was to compare double echo steady-state (DESS) and postcontrast STIR sequences in DL-reconstructed 3D MR neurography of the extraforaminal cranial and spinal nerves at 1.5T. Eighteen consecutive examinations of 18 patients undergoing head-and-neck MRI at 1.5T were retrospectively included (mean age: 51 ± 14 years, 11 women). 3D DESS and postcontrast 3D STIR sequences were obtained as part of the standard protocol, and reconstructed with a prototype DL algorithm. Two blinded readers qualitatively evaluated visualization of the inferior alveolar, lingual, facial, hypoglossal, greater occipital, lesser occipital, and greater auricular nerves, as well as overall image quality, vascular suppression, and artifacts. Additionally, apparent SNR and contrast-to-noise ratios of the inferior alveolar and greater occipital nerve were measured. Visual ratings and quantitative measurements, respectively, were compared between sequences by using Wilcoxon signed-rank test. DESS demonstrated significantly improved visualization of the lesser occipital nerve, greater auricular nerve, and proximal greater occipital nerve (<i>P</i> < .015). Postcontrast STIR showed significantly enhanced visualization of the lingual nerve, hypoglossal nerve, and distal inferior alveolar nerve (<i>P</i> < .001). The facial nerve, proximal inferior alveolar nerve, and distal greater occipital nerve did not demonstrate significant differences in visualization between sequences (<i>P</i> > .08). There was also no significant difference for overall image quality and artifacts. 
Postcontrast STIR achieved superior vascular suppression, reaching statistical significance for 1 reader (<i>P</i> = .039). Quantitatively, there was no significant difference between sequences (<i>P</i> > .05). Our findings suggest that 3D DESS generally provides improved visualization of spinal nerves, while postcontrast 3D STIR facilitates enhanced delineation of extraforaminal cranial nerves.
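Apparent SNR and contrast-to-noise measurements like those above are typically computed from ROI statistics. Below is a minimal sketch under one common convention (mean nerve signal over the standard deviation of a background ROI); the study does not spell out its exact formulas, and the ROI values are invented.

```python
import math

def mean_sd(vals):
    """Sample mean and standard deviation of ROI pixel intensities."""
    m = sum(vals) / len(vals)
    sd = math.sqrt(sum((v - m) ** 2 for v in vals) / (len(vals) - 1))
    return m, sd

def apparent_snr(nerve_roi, background_roi):
    """Apparent SNR: mean nerve signal over SD of background noise."""
    return mean_sd(nerve_roi)[0] / mean_sd(background_roi)[1]

def cnr(nerve_roi, adjacent_roi, background_roi):
    """CNR: nerve-vs-adjacent-tissue signal difference over background SD."""
    return (mean_sd(nerve_roi)[0] - mean_sd(adjacent_roi)[0]) / mean_sd(background_roi)[1]

nerve, muscle, noise = [120, 130, 125], [80, 85, 90], [5, 7, 6, 4, 8]
print(round(apparent_snr(nerve, noise), 1))
print(round(cnr(nerve, muscle, noise), 1))
```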

Feasibility of fully automatic assessment of cervical canal stenosis using MRI via deep learning.

Feng X, Zhang Y, Lu M, Ma C, Miao X, Yang J, Lin L, Zhang Y, Zhang K, Zhang N, Kang Y, Luo Y, Cao K

pubmed logopapers · Sep 1 2025
Currently, there is no fully automated tool available for evaluating the degree of cervical spinal stenosis. The aim of this study was to develop and validate artificial intelligence (AI) algorithms for the assessment of cervical spinal stenosis. In this retrospective multi-center study, cervical spine magnetic resonance imaging (MRI) scans obtained from July 2020 to June 2023 were included. Studies of patients with spinal instrumentation or studies with suboptimal image quality were excluded. Sagittal T2-weighted images were used. The training data from the Fourth People's Hospital of Shanghai (Hos. 1) and Shanghai Changzheng Hospital (Hos. 2) were annotated by two musculoskeletal (MSK) radiologists following Kang's system as the standard reference. First, a convolutional neural network (CNN) was trained to detect the region of interest (ROI), and a second Transformer network then performed classification. The performance of the deep learning (DL) model was assessed on an internal test set from Hos. 2 and an external test set from Shanghai Changhai Hospital (Hos. 3) and compared with that of six readers. Metrics such as detection precision, interrater agreement, sensitivity (SEN), and specificity (SPE) were calculated. Overall, 795 patients were analyzed (mean age ± standard deviation, 55±14 years; 346 female), with 589 in the training (75%) and validation (25%) sets, 206 in the internal test set, and 95 in the external test set. Four tasks with different clinical application scenarios were trained, and their accuracy (ACC) ranged from 0.8993 to 0.9532.
When using a Kang system score of ≥2 as a threshold for diagnosing central cervical canal stenosis in the internal test set, both the algorithm and the six readers achieved similar areas under the receiver operating characteristic curve (AUCs) of 0.936 [95% confidence interval (CI): 0.916-0.955], with a SEN of 90.3% and SPE of 93.8%; the AUC of the DL model was 0.931 (95% CI: 0.917-0.946) in the external test set, with a SEN of 100% and a SPE of 86.3%. Correlation analysis comparing the DL method, the six readers, and MRI reports against the reference standard showed a moderate correlation, with R values ranging from 0.589 to 0.668. The DL model produced approximately the same rates of upgrades (9.2%) and downgrades (5.1%) as the six readers. The DL model could fully automatically and reliably assess cervical canal stenosis using MRI scans.
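The Kang-score thresholding and the sensitivity/specificity computation above are simple enough to sketch directly. This is a generic illustration (not the study's code), and the example grade lists are invented.

```python
def binarize(grades, threshold=2):
    """Kang grade >= threshold counts as central canal stenosis."""
    return [int(g >= threshold) for g in grades]

def sen_spe(pred, truth):
    """Sensitivity and specificity of binary predictions vs. reference."""
    tp = sum(p and t for p, t in zip(pred, truth))
    tn = sum((not p) and (not t) for p, t in zip(pred, truth))
    fn = sum((not p) and t for p, t in zip(pred, truth))
    fp = sum(p and (not t) for p, t in zip(pred, truth))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical per-level grades from the reference standard and the model
ref_grades = [0, 1, 2, 3, 2, 0, 1, 3]
model_grades = [0, 2, 2, 3, 1, 0, 1, 3]
sen, spe = sen_spe(binarize(model_grades), binarize(ref_grades))
print(sen, spe)
```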

Prediction of lymphovascular invasion in invasive breast cancer via intratumoral and peritumoral multiparametric magnetic resonance imaging machine learning-based radiomics with Shapley additive explanations interpretability analysis.

Chen S, Zhong Z, Chen Y, Tang W, Fan Y, Sui Y, Hu W, Pan L, Liu S, Kong Q, Guo Y, Liu W

pubmed logopapers · Sep 1 2025
The use of multiparametric magnetic resonance imaging (MRI) in predicting lymphovascular invasion (LVI) in breast cancer has been well documented in the literature. However, the majority of related studies have focused primarily on intratumoral characteristics, overlooking the potential contribution of peritumoral features. The aim of this study was to evaluate the effectiveness of multiparametric MRI in predicting LVI by analyzing both intratumoral and peritumoral radiomics features and to assess the added value of incorporating both regions in LVI prediction. A total of 366 patients from two centers underwent preoperative breast MRI and were divided into training (n=208), validation (n=70), and test (n=88) sets. Imaging features were extracted from intratumoral and peritumoral T2-weighted imaging, diffusion-weighted imaging, and dynamic contrast-enhanced MRI. Five models were developed for predicting LVI status based on logistic regression: the tumor area (TA) model, peritumoral area (PA) model, tumor-plus-peritumoral area (TPA) model, clinical model, and combined model. The combined model was created by incorporating the radiomics score of the best-performing model and clinical factors. Predictive efficacy was evaluated via the receiver operating characteristic (ROC) curve and area under the curve (AUC). The Shapley additive explanation (SHAP) method was used to rank the features and explain the final model. The performance of the TPA model was superior to that of the TA and PA models. A combined model was further developed via multivariable logistic regression, incorporating the TPA radiomics score (radscore), MRI-assessed axillary lymph node (ALN) status, and peritumoral edema (PE). The combined model demonstrated good calibration and discrimination performance across the training, validation, and test datasets, with AUCs of 0.888 [95% confidence interval (CI): 0.841-0.934], 0.856 (95% CI: 0.769-0.943), and 0.853 (95% CI: 0.760-0.946), respectively.
Furthermore, we conducted SHAP analysis to evaluate the contributions of TPA radscore, MRI-ALN status, and PE in LVI status prediction. The combined model, incorporating clinical factors and intratumoral and peritumoral radscore, effectively predicts LVI and may potentially aid in tailored treatment planning.
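The AUC used to evaluate these radiomics models has a useful rank-based interpretation: it is the probability that a randomly chosen positive case scores above a randomly chosen negative one (the Mann-Whitney statistic). A minimal sketch, with invented scores and labels, not the study's data:

```python
def auc(scores, labels):
    """Empirical ROC AUC: probability that a random positive outscores a
    random negative, with ties counting half (Mann-Whitney U / (n_pos * n_neg))."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical radscore outputs and LVI ground truth (1 = LVI present)
radscore = [0.9, 0.8, 0.75, 0.4, 0.3, 0.2]
lvi = [1, 1, 0, 1, 0, 0]
print(auc(radscore, lvi))
```

This pairwise form is equivalent to integrating the ROC curve and is handy for sanity-checking library outputs.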

Combining curriculum learning and weakly supervised attention for enhanced thyroid nodule assessment in ultrasound imaging.

Keatmanee C, Songsaeng D, Klabwong S, Nakaguro Y, Kunapinun A, Ekpanyapong M, Dailey MN

pubmed logopapers · Sep 1 2025
The accurate assessment of thyroid nodules, which are increasingly common with age and lifestyle factors, is essential for early malignancy detection. Ultrasound imaging, the primary diagnostic tool for this purpose, holds promise when paired with deep learning. However, challenges persist with small datasets, where conventional data augmentation can introduce noise and obscure essential diagnostic features. To address dataset imbalance and enhance model generalization, this study integrates curriculum learning with a weakly supervised attention network, using attention-guided data augmentation, to improve diagnostic accuracy in thyroid nodule classification. Using verified datasets from Siriraj Hospital, the model was trained progressively, beginning with simpler images and gradually incorporating more complex cases. This structured learning approach is designed to enhance the model's diagnostic accuracy by refining its ability to distinguish benign from malignant nodules. Among the curriculum learning schemes tested, scheme IV achieved the best results, with a precision of 100% for benign and 70% for malignant nodules, a recall of 82% for benign and 100% for malignant, and F1-scores of 90% and 83%, respectively. This structured approach improved the model's diagnostic sensitivity and robustness. These findings suggest that automated thyroid nodule assessment, supported by curriculum learning, has the potential to complement radiologists in clinical practice, enhancing diagnostic accuracy and aiding in more reliable malignancy detection.
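The easy-to-hard training described above can be sketched as a staging function: samples are ranked by a difficulty score and released in cumulative stages. This is a generic curriculum-learning illustration, not the paper's scheme IV; the sample names and difficulty scores are hypothetical.

```python
def curriculum_stages(samples, difficulty, n_stages=3):
    """Order training samples easy-to-hard and release them in cumulative
    stages: stage k trains on the easiest k/n_stages fraction of the data."""
    ranked = [s for _, s in sorted(zip(difficulty, samples))]
    stages = []
    for k in range(1, n_stages + 1):
        cutoff = round(len(ranked) * k / n_stages)
        stages.append(ranked[:cutoff])
    return stages

samples = ["img_a", "img_b", "img_c", "img_d", "img_e", "img_f"]
difficulty = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]  # e.g. annotator-rated complexity
for i, stage in enumerate(curriculum_stages(samples, difficulty), 1):
    print(f"stage {i}: {stage}")
```

In practice the difficulty signal might come from annotator ratings or model loss; the staging logic stays the same.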

Deep Learning for Automated 3D Assessment of Rotator Cuff Muscle Atrophy and Fat Infiltration prior to Total Shoulder Arthroplasty.

Levin JM, Satir OB, Hurley ET, Colasanti C, Becce F, Terrier A, Eghbali P, Goetti P, Klifto C, Anakwenze O, Frankle MA, Namdari S, Büchler P

pubmed logopapers · Sep 1 2025
Rotator cuff muscle pathology affects outcomes following total shoulder arthroplasty, yet current assessment methods lack reliability in quantifying muscle atrophy and fat infiltration. We developed a deep learning-based model for automated segmentation of rotator cuff muscles on computed tomography (CT) and propose a T-score classification of volumetric muscle atrophy. We further characterized distinct atrophy phenotypes, 3D fat infiltration percentage (3DFI%), and anterior-posterior (AP) balance, which were compared between healthy controls, anatomic total shoulder arthroplasty (aTSA), and reverse total shoulder arthroplasty (rTSA) patients. A total of 952 shoulder CT scans were included (762 controls, 103 undergoing aTSA for glenohumeral osteoarthritis, and 87 undergoing rTSA for cuff tear arthropathy). A deep learning model was developed to allow automated segmentation of the supraspinatus (SS), subscapularis (SC), infraspinatus (IS), and teres minor (TM). Muscle volumes were normalized to scapula volume, and control muscle volumes were referenced to calculate T-scores for each muscle. T-scores were classified as no atrophy (>-1.0), moderate atrophy (-1 to -2.5), and severe atrophy (<-2.5). 3DFI% was quantified as the proportion of fat within each muscle using Hounsfield unit thresholds. The T-scores, 3DFI%, and AP balance were compared between the three cohorts. The aTSA cohort had significantly greater atrophy in all muscles compared to control (p<0.001), whereas the rTSA cohort had significantly greater atrophy in SS, SC, and IS than aTSA (p<0.001). In the aTSA cohort, the most common phenotype was SS<sub>severe</sub>/SC<sub>moderate</sub>/IS+TM<sub>moderate</sub>, while in the rTSA cohort it was SS<sub>severe</sub>/SC<sub>moderate</sub>/IS+TM<sub>severe</sub>. The aTSA group had significantly higher 3DFI% compared to controls for all muscles (p<0.001), while the rTSA cohort had significantly higher 3DFI% than the aTSA and control cohorts for all muscles (p<0.001).
Additionally, the aTSA cohort had a significantly lower AP muscle volume ratio (1.06 vs. 1.14, p<0.001), whereas the rTSA group had a significantly higher AP muscle volume ratio than the control cohort (1.31 vs. 1.14, p<0.001). Our study demonstrates successful development of a deep learning model for automated volumetric assessment of rotator cuff muscle atrophy, 3DFI% and AP balance on shoulder CT scans. We found that aTSA patients had significantly greater muscle atrophy and 3DFI% than controls, while the rTSA patients had the most severe muscle atrophy and 3DFI%. Additionally, distinct phenotypes of muscle atrophy and AP muscle balance exist in aTSA and rTSA that warrant further investigation with regards to shoulder arthroplasty outcomes.
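The T-score classification above follows directly from the stated definition: a muscle volume is standardized against the control population, then graded by the abstract's thresholds (> -1.0 no atrophy, -1 to -2.5 moderate, < -2.5 severe). A minimal sketch; the control volumes below are invented, not study data.

```python
import math

def t_score(volume, control_volumes):
    """T-score of a scapula-normalized muscle volume vs. a control population."""
    n = len(control_volumes)
    mean = sum(control_volumes) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in control_volumes) / (n - 1))
    return (volume - mean) / sd

def atrophy_grade(t):
    """Thresholds from the abstract: > -1 none, -1 to -2.5 moderate, < -2.5 severe."""
    if t > -1.0:
        return "no atrophy"
    if t >= -2.5:
        return "moderate atrophy"
    return "severe atrophy"

controls = [10.0, 11.0, 9.5, 10.5, 9.0]  # hypothetical normalized volumes
for v in (10.2, 9.2, 7.0):
    t = t_score(v, controls)
    print(round(t, 2), atrophy_grade(t))
```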

Artificial intelligence-enhanced ultrasound imaging for thyroid nodule detection and malignancy classification: a study on YOLOv11.

Yang J, Luo Z, Wen Y, Zhang J

pubmed logopapers · Sep 1 2025
Thyroid nodules are a common clinical concern, with accurate diagnosis being critical for effective treatment and improved patient outcomes. Traditional ultrasound examinations rely heavily on the physician's experience, which can lead to diagnostic variability. The integration of artificial intelligence (AI) into medical imaging offers a promising solution for enhancing diagnostic accuracy and efficiency. This study aimed to evaluate the effectiveness of the You Only Look Once v. 11 (YOLOv11) model in detecting and classifying thyroid nodules through ultrasound images, with the goal of supporting real-time clinical decision-making and improving diagnostic workflows. We used the YOLOv11 model to analyze a dataset of 1,503 thyroid ultrasound images, divided into training (1,203 images), validation (150 images), and test (150 images) sets, comprising 742 benign and 778 malignant nodules. Advanced data augmentation and transfer learning techniques were applied to optimize model performance. Comparative analysis was conducted with other YOLO variants (YOLOv3 to YOLOv10) and residual network 50 (ResNet50) to assess their diagnostic capabilities. The YOLOv11 model exhibited superior performance in thyroid nodule detection compared to the other YOLO variants and ResNet50. At an intersection over union (IoU) of 0.5, YOLOv11 achieved a precision (P) of 0.841 and recall (R) of 0.823, outperforming ResNet50's P of 0.8333 and R of 0.8025. Among the YOLO variants, YOLOv11 consistently achieved the highest P and R values. For benign nodules, YOLOv11 obtained a P of 0.835 and an R of 0.833, while for malignant nodules, it reached a P of 0.846 and an R of 0.813. Within the YOLOv11 model itself, performance varied across different IoU thresholds (0.25, 0.5, 0.7, and 0.9). Lower IoU thresholds generally resulted in better performance metrics, with P and R values decreasing as the IoU threshold increased.
YOLOv11 proved to be a powerful tool for thyroid nodule detection and malignancy classification, offering high P and real-time performance. These attributes are vital for dynamic ultrasound examinations and enhancing diagnostic efficiency. Future research will focus on expanding datasets and validating the model's clinical utility in real-time settings.
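The IoU thresholds discussed above decide whether a predicted box counts as a detection: IoU is the overlap area divided by the union area of the predicted and ground-truth boxes. A minimal sketch with invented box coordinates (x1, y1, x2, y2), not the study's data:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

pred, truth = (10, 10, 50, 50), (20, 20, 60, 60)
score = iou(pred, truth)
print(round(score, 3), "match" if score >= 0.5 else "miss at IoU 0.5")
```

Raising the threshold demands tighter localization, which is why precision and recall fall as IoU increases.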

Identifying Pathogenesis of Acute Coronary Syndromes using Sequence Contrastive Learning in Coronary Angiography.

Ma X, Shibata Y, Kurihara O, Kobayashi N, Takano M, Kurihara T

pubmed logopapers · Sep 1 2025
Advances in intracoronary imaging have made it possible to distinguish different pathological mechanisms underlying acute coronary syndrome (ACS) in vivo. Accurate identification of these mechanisms is increasingly recognized as essential for enabling tailored therapeutic strategies. ACS pathogenesis is primarily classified into 2 major types: plaque rupture (PR) and plaque erosion (PE). Patients with PR are treated with intracoronary stenting, whereas those with PE may potentially be managed conservatively without stenting. The aim of this study was to develop neural networks capable of distinguishing PR from PE using coronary angiography (CAG) alone. A total of 842 videos from 278 ACS patients (PR: 172, PE: 106) were included. To ensure the reliability of the ground truth for PR/PE classification, the ACS pathogenesis for each patient was confirmed using optical coherence tomography (OCT). To enhance the learning of discriminative features across consecutive frames and improve PR/PE classification performance, we propose Sequence Contrastive Learning (SeqCon), which addresses limitations inherent in conventional contrastive learning approaches. In the experiments, the external test set consisted of 18 PR patients (46 videos) and 11 PE patients (30 videos). SeqCon achieved an accuracy of 82.8%, sensitivity of 88.9%, specificity of 72.3%, positive predictive value of 84.2%, and negative predictive value of 80.0% at the patient level. This is the first report to use contrastive learning for diagnosing the underlying mechanism of ACS by CAG. We demonstrated that it is feasible to distinguish between PR and PE without intracoronary imaging modalities.
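The patient-level metrics reported above all derive from a single 2x2 confusion matrix. A minimal sketch; the counts below are hypothetical, chosen only to be roughly consistent with an 18 PR / 11 PE external test set, and are not taken from the paper.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from a 2x2 confusion matrix
    (positive class = plaque rupture)."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts for illustration only
m = diagnostic_metrics(tp=16, fn=2, tn=8, fp=3)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```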

Deep learning-based automated assessment of hepatic fibrosis via magnetic resonance images and nonimage data.

Li W, Zhu Y, Zhao G, Chen X, Zhao X, Xu H, Che Y, Chen Y, Ye Y, Dou X, Wang H, Cheng J, Xie Q, Chen K

pubmed logopapers · Sep 1 2025
Accurate staging of hepatic fibrosis is critical for prognostication and management among patients with chronic liver disease, and noninvasive, efficient alternatives to biopsy are urgently needed. This study aimed to evaluate the performance of an automated deep learning (DL) algorithm for fibrosis staging and for differentiating patients with hepatic fibrosis from healthy individuals via magnetic resonance (MR) images with and without additional clinical data. A total of 500 patients from two medical centers were retrospectively analyzed. DL models were developed based on delayed-phase MR images to predict fibrosis stages. Additional models were constructed by integrating the DL algorithm with nonimaging variables, including serologic biomarkers [aminotransferase-to-platelet ratio index (APRI) and fibrosis index based on four factors (FIB-4)], viral status (hepatitis B and C), and MR scanner parameters. Diagnostic performance was assessed via the area under the receiver operating characteristic curve (AUROC), and comparisons were made using the DeLong test. Sensitivity and specificity of the DL and full models (DL plus all clinical features) were compared with those of experienced radiologists and serologic biomarkers via the McNemar test. In the test set, the full model achieved AUROC values of 0.99 [95% confidence interval (CI): 0.94-1.00], 0.98 (95% CI: 0.93-0.99), 0.90 (95% CI: 0.83-0.95), 0.81 (95% CI: 0.73-0.88), and 0.84 (95% CI: 0.76-0.90) for staging F0-4, F1-4, F2-4, F3-4, and F4, respectively. This model significantly outperformed the image-only DL model in early-stage classification (F0-4 and F1-4). Compared with expert radiologists, it showed superior specificity for F0-4 and higher sensitivity across the other four classification tasks. Both the DL and full models showed significantly greater specificity than the biomarkers for staging advanced fibrosis (F3-4 and F4).
The proposed DL algorithm provides a noninvasive method for hepatic fibrosis staging and screening, outperforming both radiologists and conventional biomarkers, and may facilitate improved clinical decision-making.
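The serologic biomarkers fed into the full model, APRI and FIB-4, have standard published formulas, sketched below as commonly defined (APRI: AST relative to its upper limit of normal, divided by platelet count in 10^9/L, times 100; FIB-4: age times AST over platelets times the square root of ALT). The lab values in the example are invented.

```python
import math

def apri(ast, ast_uln, platelets):
    """APRI = (AST / upper limit of normal) / platelets (10^9/L) x 100."""
    return (ast / ast_uln) / platelets * 100

def fib4(age, ast, alt, platelets):
    """FIB-4 = age x AST / (platelets (10^9/L) x sqrt(ALT))."""
    return age * ast / (platelets * math.sqrt(alt))

# Hypothetical patient: AST 80 U/L (ULN 40), ALT 64 U/L, platelets 150 x10^9/L
print(round(apri(ast=80, ast_uln=40, platelets=150), 2))
print(round(fib4(age=55, ast=80, alt=64, platelets=150), 2))
```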