Latest Papers on Radiology AI. Tags: CT

Multi-class transformer-based segmentation of pancreatic ductal adenocarcinoma and surrounding structures in CT imaging: a multi-center evaluation.

Wen S, Xiao X

•papers•Jun 14 2025

Accurate segmentation of pancreatic ductal adenocarcinoma (PDAC) and surrounding anatomical structures is critical for diagnosis, treatment planning, and outcome assessment. This study proposes a deep learning-based framework to automate multi-class segmentation in CT images, comparing the performance of four state-of-the-art architectures. This retrospective multi-center study included 3265 patients from six institutions. Four deep learning models (UNet, nnU-Net, UNETR, and Swin-UNet) were trained using five-fold cross-validation on data from five centers and tested independently on a sixth center (n = 569). Preprocessing included intensity normalization, voxel resampling, and standardized annotation for six structures: PDAC lesion, pancreas, veins, arteries, pancreatic duct, and common bile duct. Evaluation metrics included Dice Similarity Coefficient (DSC), Intersection over Union (IoU), directed Hausdorff Distance (dHD), Average Symmetric Surface Distance (ASSD), and Volume Overlap Error (VOE). Statistical comparisons were made using Wilcoxon signed-rank tests with Bonferroni correction. Swin-UNet outperformed all other models with a mean validation DSC of 92.4% and a test DSC of 90.8%, showing minimal overfitting. It also achieved the lowest dHD (4.3 mm), ASSD (1.2 mm), and VOE (6.0%) in cross-validation. Per-class DSCs for Swin-UNet were consistently higher across all anatomical targets, including challenging structures like the pancreatic duct (91.0%) and bile duct (91.8%). Statistical analysis confirmed the superiority of Swin-UNet (p < 0.001). All models showed generalization capability, but Swin-UNet provided the most accurate and robust segmentation across datasets. Transformer-based architectures, particularly Swin-UNet, enable precise and generalizable multi-class segmentation of PDAC and surrounding anatomy. This framework has potential for clinical integration in PDAC diagnosis, staging, and therapy planning.
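As an illustration of the overlap metrics named in this abstract, the following is a minimal sketch, not the authors' evaluation code. It assumes binary masks flattened to 0/1 sequences and the common definition VOE = 1 - IoU; the abstract does not state the exact formulas used, and the function name `overlap_metrics` is hypothetical.

```python
def overlap_metrics(pred, truth):
    """Return DSC, IoU, and VOE for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels labelled foreground in both masks, and in each mask alone.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        # Both masks empty: treated here as perfect agreement (an assumed convention).
        return {"dsc": 1.0, "iou": 1.0, "voe": 0.0}
    dsc = 2.0 * intersection / (pred_size + truth_size)
    iou = intersection / union
    # Assumed definition: volume overlap error is the complement of IoU.
    return {"dsc": dsc, "iou": iou, "voe": 1.0 - iou}


# Toy example with made-up masks (not data from the study).
metrics = overlap_metrics([1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0])
print(metrics)
```

For a multi-class segmentation such as the six structures above, the same function could be applied once per class by binarizing each label in turn.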

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab

The Machine Learning Models in Major Cardiovascular Adverse Events Prediction Based on Coronary Computed Tomography Angiography: Systematic Review.

Ma Y, Li M, Wu H

•papers•Jun 13 2025

Coronary computed tomography angiography (CCTA) has emerged as the first-line noninvasive imaging test for patients at high risk of coronary artery disease (CAD). When combined with machine learning (ML), it provides more valid evidence in diagnosing major adverse cardiovascular events (MACEs). Radiomics provides informative multidimensional features that can help identify high-risk populations and can improve the diagnostic performance of CCTA. However, its role in predicting MACEs remains highly debated. We evaluated the diagnostic value of ML models constructed using radiomic features extracted from CCTA in predicting MACEs, and compared the performance of different learning algorithms and models, thereby providing clinical recommendations for the diagnosis, treatment, and prognosis of MACEs. We comprehensively searched 5 online databases, Cochrane Library, Web of Science, Elsevier, CNKI, and PubMed, up to September 10, 2024, for original studies that used ML models among patients who underwent CCTA to predict MACEs and reported clinical outcomes and endpoints related to it. Risk of bias in the ML models was assessed by the Prediction Model Risk of Bias Assessment Tool, while the radiomics quality score (RQS) was used to evaluate the methodological quality of the radiomics prediction model development and validation. We also followed the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) guidelines to ensure transparency of ML models included. Meta-analysis was performed using Meta-DiSc software (version 1.4), which included the I² score and Cochran Q test, along with StataMP 17 (StataCorp) to assess heterogeneity and publication bias. Due to the high heterogeneity observed, subgroup analysis was conducted based on different model groups. 
Ten studies were included in the analysis, 5 (50%) of which differentiated between training and testing groups, where the training set collected 17 kinds of models and the testing set gathered 26 models. The pooled area under the receiver operating characteristic (AUROC) curve for ML models predicting MACEs was 0.7879 in the training set and 0.7981 in the testing set. Logistic regression (LR), the most commonly used algorithm, achieved an AUROC of 0.8229 in the testing group and 0.7983 in the training group. Non-LR models yielded AUROCs of 0.7390 in the testing set and 0.7648 in the training set, while the random forest (RF) models reached an AUROC of 0.8444 in the training group. Study limitations included a limited number of studies, high heterogeneity, and the types of included studies. The performance of ML models for predicting MACEs was found to be superior to that of general models based on basic feature extraction and integration from CCTA. Specifically, LR-based ML diagnostic models demonstrated significant clinical potential, particularly when combined with clinical features, and are worth further validation through more clinical trials. PROSPERO CRD42024596364; https://www.crd.york.ac.uk/PROSPERO/view/CRD42024596364.
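The heterogeneity statistics mentioned in this abstract (Cochran Q and I²) can be sketched as follows. This is a simplified fixed-effect, inverse-variance illustration and not a reproduction of what Meta-DiSc or Stata compute; the function name `cochran_q_i2` and the input values are hypothetical.

```python
def cochran_q_i2(effects, variances):
    """Return (pooled estimate, Cochran Q, I-squared in percent).

    effects: per-study effect estimates (for example, AUROC values).
    variances: per-study sampling variances of those estimates.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Made-up inputs for illustration only (not values from the review).
pooled, q, i2 = cochran_q_i2([0.70, 0.80, 0.90], [0.0025, 0.0025, 0.0025])
print(round(pooled, 3), round(q, 3), round(i2, 1))
```

A large I² is the kind of result that would motivate the subgroup analysis by model type described above.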

CT Classification Cardiac Meta Analysis In Silico Academic Lab Benchmark SOTA

Taming Stable Diffusion for Computed Tomography Blind Super-Resolution

Chunlei Li, Yilei Shi, Haoxi Hu, Jingliang Hu, Xiao Xiang Zhu, Lichao Mou

•preprint•Jun 13 2025

High-resolution computed tomography (CT) imaging is essential for medical diagnosis but requires increased radiation exposure, creating a critical trade-off between image quality and patient safety. While deep learning methods have shown promise in CT super-resolution, they face challenges with complex degradations and limited medical training data. Meanwhile, large-scale pre-trained diffusion models, particularly Stable Diffusion, have demonstrated remarkable capabilities in synthesizing fine details across various vision tasks. Motivated by this, we propose a novel framework that adapts Stable Diffusion for CT blind super-resolution. We employ a practical degradation model to synthesize realistic low-quality images and leverage a pre-trained vision-language model to generate corresponding descriptions. Subsequently, we perform super-resolution using Stable Diffusion with a specialized controlling strategy, conditioned on both low-resolution inputs and the generated text descriptions. Extensive experiments show that our method outperforms existing approaches, demonstrating its potential for achieving high-quality CT imaging at reduced radiation doses. Our code will be made publicly available.

CT Reconstruction Methodology In Silico Academic Lab Open Code GenAI

CEREBLEED: Automated quantification and severity scoring of intracranial hemorrhage on non-contrast CT

Cepeda, S., Esteban-Sinovas, O., Arrese, I., Sarabia, R.

•preprint•Jun 13 2025

Background: Intracranial hemorrhage (ICH), whether spontaneous or traumatic, is a neurological emergency with high morbidity and mortality. Accurate assessment of severity is essential for neurosurgical decision-making. This study aimed to develop and evaluate a fully automated, deep learning-based tool for the standardized assessment of ICH severity, based on the segmentation of the hemorrhage and intracranial structures, and the computation of an objective severity index. Methods: Non-contrast cranial CT scans from patients with spontaneous or traumatic ICH were retrospectively collected from public datasets and a tertiary care center. Deep learning models were trained to segment hemorrhages and intracranial structures. These segmentations were used to compute a severity index reflecting bleeding burden and mass effect through volumetric relationships. Segmentation performance was evaluated on a hold-out test cohort. In a prospective cohort, the severity index was assessed in relation to expert-rated CT severity, clinical outcomes, and the need for urgent neurosurgical intervention. Results: A total of 1,110 non-contrast cranial CT scans were analyzed, 900 from the retrospective cohort and 200 from the prospective evaluation cohort. The binary segmentation model achieved a median Dice score of 0.90 for total hemorrhage. The multilabel model yielded Dice scores ranging from 0.55 to 0.94 across hemorrhage subtypes. The severity index significantly correlated with expert-rated CT severity (p < 0.001), the modified Rankin Scale (p = 0.007), and the Glasgow Outcome Scale-Extended (p = 0.039), and independently predicted the need for urgent surgery (p < 0.001). A threshold of ~300 was identified as a decision point for surgical management (AUC = 0.83). Conclusion: We developed a fully automated and openly accessible pipeline for the analysis of non-contrast cranial CT in intracranial hemorrhage. It computes a novel index that objectively quantifies hemorrhage severity and is significantly associated with clinically relevant outcomes, including the need for urgent neurosurgical intervention.

CT Segmentation Neurological Retrospective Clinical Clinical Pilot Academic Lab Open Code

Beyond Benchmarks: Towards Robust Artificial Intelligence Bone Segmentation in Socio-Technical Systems

Xie, K., Gruber, L. J., Crampen, M., Li, Y., Ferreira, A., Tappeiner, E., Gillot, M., Schepers, J., Xu, J., Pankert, T., Beyer, M., Shahamiri, N., ten Brink, R., Dot, G., Weschke, C., van Nistelrooij, N., Verhelst, P.-J., Guo, Y., Xu, Z., Bienzeisler, J., Rashad, A., Flügge, T., Cotton, R., Vinayahalingam, S., Ilesan, R., Raith, S., Madsen, D., Seibold, C., Xi, T., Berge, S., Nebelung, S., Kodym, O., Sundqvist, O., Thieringer, F., Lamecker, H., Coppens, A., Potrusil, T., Kraeima, J., Witjes, M., Wu, G., Chen, X., Lambrechts, A., Cevidanes, L. H. S., Zachow, S., Hermans, A., Truhn, D., Alves,

•preprint•Jun 13 2025

Despite the advances in automated medical image segmentation, AI models still underperform in various clinical settings, challenging real-world integration. In this multicenter evaluation, we analyzed 20 state-of-the-art mandibular segmentation models across 19,218 segmentations of 1,000 clinically resampled CT/CBCT scans. We show that segmentation accuracy varies by up to 25% depending on socio-technical factors such as voxel size, bone orientation, and patient conditions such as osteosynthesis or pathology. Higher sharpness, isotropic smaller voxels, and neutral orientation significantly improved results, while metallic osteosynthesis and anatomical complexity led to significant degradation. Our findings challenge the common view of AI models as "plug-and-play" tools and suggest evidence-based optimization recommendations for both clinicians and developers. This will in turn boost the integration of AI segmentation tools in routine healthcare.

CT Segmentation Musculoskeletal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Investigating the Role of Area Deprivation Index in Observed Differences in CT-Based Body Composition by Race.

Chisholm M, Jabal MS, He H, Wang Y, Kalisz K, Lafata KJ, Calabrese E, Bashir MR, Tailor TD, Magudia K

•papers•Jun 13 2025

Differences in CT-based body composition (BC) have been observed by race. We sought to investigate whether indices reporting census block group-level disadvantage, area deprivation index (ADI) and social vulnerability index (SVI), age, sex, and/or clinical factors could explain race-based differences in body composition. The first abdominal CT exams for patients in Durham County at a single institution in 2020 were analyzed using a fully automated and open-source deep learning BC analysis workflow to generate cross-sectional areas for skeletal muscle (SMA), subcutaneous fat (SFA), and visceral fat (VFA). Patient-level demographic and clinical data were gathered from the electronic health record. State ADI ranking and SVI values were linked to each patient. Univariable and multivariable models were created to assess the association of demographics, ADI, SVI, and other relevant clinical factors with SMA, SFA, and VFA. A total of 5,311 patients (mean age, 57.4 years; 55.5% female; 46.5% Black, 39.5% White, 10.3% Hispanic) were included. At univariable analysis, race, ADI, SVI, sex, BMI, weight, and height were significantly associated with all body compartments (SMA, SFA, and VFA, all p < 0.05). At multivariable analyses adjusted for patient characteristics and clinical comorbidities, race remained a significant predictor, whereas ADI did not. SVI was significant in a multivariable model with SMA.

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Open Code

Quantitative and qualitative assessment of ultra-low-dose paranasal sinus CT using deep learning image reconstruction: a comparison with hybrid iterative reconstruction.

Otgonbaatar C, Lee D, Choi J, Jang H, Shim H, Ryoo I, Jung HN, Suh S

•papers•Jun 13 2025

This study aimed to evaluate the quantitative and qualitative performances of ultra-low-dose computed tomography (CT) with deep learning image reconstruction (DLR) compared with those of hybrid iterative reconstruction (IR) for preoperative paranasal sinus (PNS) imaging. This retrospective analysis included 132 patients who underwent non-contrast ultra-low-dose sinus CT (0.03 mSv). Images were reconstructed using hybrid IR and DLR. Objective image quality metrics, including image noise, signal-to-noise ratio (SNR), contrast-to-noise ratio (CNR), noise power spectrum (NPS), and no-reference perceptual image sharpness, were assessed. Two board-certified radiologists independently performed subjective image quality evaluations. The ultra-low-dose CT protocol achieved a low radiation dose (effective dose: 0.03 mSv). DLR showed significantly lower image noise (28.62 ± 4.83 Hounsfield units) compared to hybrid IR (140.70 ± 16.04, p < 0.001), with DLR yielding smoother and more uniform images. DLR demonstrated significantly improved SNR (22.47 ± 5.82 vs 9.14 ± 2.45, p < 0.001) and CNR (71.88 ± 14.03 vs 11.81 ± 1.50, p < 0.001). NPS analysis revealed that DLR reduced the noise magnitude and NPS peak values. Additionally, DLR demonstrated significantly sharper images (no-reference perceptual sharpness metric: 0.56 ± 0.04) compared to hybrid IR (0.36 ± 0.01). Radiologists rated DLR as superior in overall image quality, bone structure visualization, and diagnostic confidence compared to hybrid IR at ultra-low-dose CT. DLR significantly outperformed hybrid IR in ultra-low-dose PNS CT by reducing image noise, improving SNR and CNR, enhancing image sharpness, and maintaining critical anatomical visualization, demonstrating its potential for effective preoperative planning with minimal radiation exposure. 
Question: Ultra-low-dose CT for paranasal sinuses is essential for patients requiring repeated scans and functional endoscopic sinus surgery (FESS) planning to reduce cumulative radiation exposure. Findings: DLR outperformed hybrid IR in ultra-low-dose paranasal sinus CT. Clinical relevance: Ultra-low-dose CT with DLR delivers sufficient image quality for detailed surgical planning, effectively minimizing unnecessary radiation exposure to enhance patient safety.
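The objective metrics reported in this abstract (image noise, SNR, CNR) are typically derived from region-of-interest (ROI) statistics. The sketch below uses one common set of definitions, which may differ from those in the paper: noise as the standard deviation within an ROI, SNR as ROI mean divided by ROI standard deviation, and CNR as the absolute difference of two ROI means divided by the standard deviation of a separate noise ROI. The function names and sample values are hypothetical.

```python
from statistics import mean, stdev


def image_noise(roi):
    """Noise as the sample standard deviation of the values (in HU) in an ROI."""
    return stdev(roi)


def snr(roi):
    """Signal-to-noise ratio: ROI mean divided by ROI standard deviation."""
    return mean(roi) / stdev(roi)


def cnr(roi_a, roi_b, noise_roi):
    """Contrast-to-noise ratio between two tissues, normalized by background noise."""
    return abs(mean(roi_a) - mean(roi_b)) / stdev(noise_roi)


# Made-up HU samples for illustration only (not measurements from the study).
tissue_a = [98, 100, 102]
tissue_b = [38, 40, 42]
background = [-2, 0, 2]
print(image_noise(tissue_a), snr(tissue_a), cnr(tissue_a, tissue_b, background))
```

Because these definitions vary between studies, absolute SNR and CNR values are usually comparable only within a single protocol, as in the DLR versus hybrid IR comparison above.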

CT Reconstruction Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

Long-term prognostic value of the CT-derived fractional flow reserve combined with atherosclerotic burden in patients with non-obstructive coronary artery disease.

Wang Z, Li Z, Xu T, Wang M, Xu L, Zeng Y

•papers•Jun 13 2025

The long-term prognostic significance of the coronary computed tomography angiography (CCTA)-derived fractional flow reserve (CT-FFR) for non-obstructive coronary artery disease (CAD) is uncertain. We aimed to investigate the additional prognostic value of CT-FFR beyond CCTA-defined atherosclerotic burden for long-term outcomes. Consecutive patients with suspected stable CAD were candidates for this retrospective cohort study. Deep-learning-based vessel-specific CT-FFR was calculated. All patients enrolled were followed for at least 5 years. The primary outcome was major adverse cardiovascular events (MACE). Predictive abilities for MACE were compared among three models (model 1, constructed using clinical variables; model 2, model 1 + CCTA-derived atherosclerotic burden (Leiden risk score and segment involvement score); and model 3, model 2 + CT-FFR). A total of 1944 patients (median age, 59 (53-65) years; 53.0% men) were included. During a median follow-up time of 73.4 (71.2-79.7) months, 64 patients (3.3%) experienced MACE. In multivariate-adjusted Cox models, CT-FFR ≤ 0.80 (HR: 7.18; 95% CI: 4.25-12.12; p < 0.001) was a robust and independent predictor for MACE. The discriminant ability was higher in model 2 than in model 1 (C-index, 0.76 vs. 0.68; p = 0.001) and was further promoted by adding CT-FFR to model 3 (C-index, 0.83 vs. 0.76; p < 0.001). Integrated discrimination improvement (IDI) was 0.033 (p = 0.022) for model 2 beyond model 1. Of note, compared with model 2, model 3 also exhibited improved discrimination (IDI = 0.056; p < 0.001). In patients with non-obstructive CAD, CT-FFR provides robust and incremental prognostic information for predicting long-term outcomes. The combined model including CT-FFR and CCTA-defined atherosclerotic burden exhibits improved prediction abilities, which is helpful for risk stratification. 
Question: Prognostic significance of the CT-fractional flow reserve (FFR) in non-obstructive coronary artery disease for long-term outcomes merits further investigation. Findings: Our data strongly emphasized the independent and additional predictive value of CT-FFR beyond coronary CTA-defined atherosclerotic burden and clinical risk factors. Clinical relevance: The new combined predictive model incorporating CT-FFR can be satisfactorily used for risk stratification of patients with non-obstructive coronary artery disease by identifying those who are truly suitable for subsequent high-intensity preventative therapies and extensive follow-up for prognostic reasons.
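The C-index used in this abstract to compare the three models measures how often a model ranks the patient with the earlier event as higher risk. The following is a minimal sketch of Harrell's concordance index for right-censored data, assuming tied follow-up times are simply skipped; the abstract does not specify which estimator was used, and the function name and inputs are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    times: follow-up time per patient.
    events: 1 if the event (e.g. MACE) was observed, 0 if censored.
    risk_scores: model output, higher meaning higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event correctly ranked as higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up cohort for illustration only (not data from the study).
c = harrell_c_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.4, 0.5, 0.1])
print(c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.68, 0.76, and 0.83 should be read.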

CT Classification Cardiac Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Prediction of functional outcome after traumatic brain injury: a narrative review.

Iaquaniello C, Scordo E, Robba C

•papers•Jun 13 2025

This review synthesizes current evidence on prognostic factors, tools, and strategies influencing functional outcomes in patients with traumatic brain injury (TBI), with a focus on the acute and postacute phases of care. Key early predictors such as Glasgow Coma Scale (GCS) scores, pupillary reactivity, and computed tomography (CT) imaging findings remain fundamental in guiding clinical decision-making. Prognostic models like IMPACT and CRASH enhance early risk stratification, while outcome measures such as the Glasgow Outcome Scale-Extended (GOS-E) provide structured long-term assessments. Despite their utility, heterogeneity in assessment approaches and treatment protocols continues to limit consistency in outcome predictions. Recent advancements highlight the value of fluid biomarkers like neurofilament light chain (NFL) and glial fibrillary acidic protein (GFAP), which offer promising avenues for improved accuracy. Additionally, artificial intelligence models are emerging as powerful tools to integrate complex datasets and refine individualized outcome forecasting. Neurological prognostication after TBI is evolving through the integration of clinical, radiological, molecular, and computational data. Although standardized models and scales remain foundational, emerging technologies and therapies (such as biomarkers, machine learning, and neurostimulants) represent a shift toward more personalized and actionable strategies to optimize recovery and long-term function.

CT Classification Neurological Review Concept Academic Lab GenAI

Impact of Deep Learning-Based Image Conversion on Fully Automated Coronary Artery Calcium Scoring Using Thin-Slice, Sharp-Kernel, Non-Gated, Low-Dose Chest CT Scans: A Multi-Center Study.

Kim C, Hong S, Choi H, Yoo WS, Kim JY, Chang S, Park CH, Hong SJ, Yang DH, Yong HS, van Assen M, De Cecco CN, Suh YJ

•papers•Jun 13 2025

This study evaluated the impact of deep learning-based image conversion on the accuracy of automated coronary artery calcium quantification using thin-slice, sharp-kernel, non-gated, low-dose chest computed tomography (LDCT) images collected from multiple institutions. A total of 225 pairs of LDCT and calcium scoring CT (CSCT) images scanned at 120 kVp and acquired from the same patient within a 6-month interval were retrospectively collected from four institutions. Image conversion was performed for LDCT images using proprietary software programs to simulate conventional CSCT. This process included 1) deep learning-based kernel conversion of low-dose, high-frequency, sharp kernels to simulate standard-dose, low-frequency kernels, and 2) thickness conversion using the raysum method to convert 1-mm or 1.25-mm thickness images to 3-mm thickness. Automated Agatston scoring was conducted on the LDCT scans before (LDCT-Orgauto) and after the image conversion (LDCT-CONVauto). Manual scoring was performed on the CSCT images (CSCTmanual) and used as a reference standard. The accuracy of automated Agatston scores and of risk severity categorization based on automated scoring of LDCT scans was compared against the reference standard using Bland-Altman analysis, the concordance correlation coefficient (CCC), and the weighted kappa (κ) statistic. LDCT-CONVauto showed a smaller bias in Agatston score relative to CSCTmanual than LDCT-Orgauto did (-3.45 vs. 206.7). LDCT-CONVauto showed a higher CCC than LDCT-Orgauto did (0.881 [95% confidence interval {CI}, 0.750-0.960] vs. 0.269 [95% CI, 0.129-0.430]). In terms of risk category assignment, LDCT-Orgauto exhibited poor agreement with CSCTmanual (weighted κ = 0.115 [95% CI, 0.082-0.154]), whereas LDCT-CONVauto achieved good agreement (weighted κ = 0.792 [95% CI, 0.731-0.847]). 
Deep learning-based conversion of LDCT images originally obtained with thin slices and a sharp kernel can enhance the accuracy of automated coronary artery calcium score measurement using the images.
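Two of the agreement statistics used in this abstract can be sketched briefly: the Bland-Altman bias (mean difference between a test method and the reference) and Lin's concordance correlation coefficient (CCC). This is a minimal illustration using population (divide-by-n) moments for the CCC; the abstract does not describe the authors' exact computation, and the function names and scores below are hypothetical.

```python
from statistics import mean


def bland_altman_bias(test, reference):
    """Mean difference (test minus reference), the bias in a Bland-Altman analysis."""
    return mean(t - r for t, r in zip(test, reference))


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between paired measurements."""
    n = len(x)
    mx, my = mean(x), mean(y)
    # Population variances and covariance (divide by n).
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Unlike Pearson's r, the denominator penalizes a systematic offset between methods.
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Made-up scores for illustration only (not data from the study).
reference_scores = [0, 10, 20, 30]
automated_scores = [10, 20, 30, 40]  # perfectly correlated but shifted by +10
print(bland_altman_bias(automated_scores, reference_scores))
print(concordance_cc(automated_scores, reference_scores))
```

In this toy case the Pearson correlation would be 1.0, yet the CCC is well below 1 because of the constant offset, which is why CCC suits method-comparison questions like the one above.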

CT Reconstruction Cardiac Retrospective Clinical In Silico Academic Lab
