
Deep learning model for automated segmentation of sphenoid sinus and middle skull base structures in CBCT volumes using nnU-Net v2.

Gülşen İT, Kuran A, Evli C, Baydar O, Dinç Başar K, Bilgir E, Çelik Ö, Bayrakdar İŞ, Orhan K, Acu B

pubmed logopapers · Aug 1 2025
The purpose of this study was to develop a deep learning model based on nnU-Net v2 for the automated segmentation of the sphenoid sinus and middle skull base anatomic structures in cone-beam computed tomography (CBCT) volumes, and to evaluate the model's performance. In this retrospective study, the sphenoid sinus and surrounding anatomical structures in 99 CBCT scans were annotated using web-based labeling software. Model training was conducted using the nnU-Net v2 deep learning model with a learning rate of 0.01 for 1000 epochs. The model's ability to automatically segment these anatomical structures in CBCT scans was evaluated using a series of metrics, including accuracy, precision, recall, Dice coefficient (DC), 95% Hausdorff distance (95% HD), intersection over union (IoU), and AUC. The developed model segmented the sphenoid sinus, foramen rotundum, and Vidian canal with a high level of success, achieving its highest DC for the sphenoid sinus (DC: 0.96). However, the model demonstrated limited performance in segmenting the other foramina of the middle skull base, indicating the need for further optimization for these structures.
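As a companion to the metrics listed above, here is a minimal sketch of how DC, IoU, and 95% HD can be computed for a pair of binary 3D masks with NumPy/SciPy; the function names, inputs, and voxel spacing are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice_and_iou(pred: np.ndarray, gt: np.ndarray):
    """Overlap metrics for two non-empty binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())
    iou = inter / np.logical_or(pred, gt).sum()
    return dice, iou

def hd95(pred: np.ndarray, gt: np.ndarray, spacing=(0.3, 0.3, 0.3)):
    """95th-percentile symmetric surface distance (95% HD), in mm."""
    def surface(mask):
        # Boundary voxels: mask minus its erosion
        return mask & ~binary_erosion(mask)
    sp, sg = surface(pred.astype(bool)), surface(gt.astype(bool))
    # Distance from each surface voxel to the other structure's surface,
    # via Euclidean distance transforms scaled by voxel spacing
    d_to_g = distance_transform_edt(~sg, sampling=spacing)[sp]
    d_to_p = distance_transform_edt(~sp, sampling=spacing)[sg]
    return np.percentile(np.hstack([d_to_g, d_to_p]), 95)
```

Computing surface distances via distance transforms avoids an explicit pairwise distance matrix, which matters at full CBCT volume resolution.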

Interpreting convolutional neural network explainability for head-and-neck cancer radiotherapy organ-at-risk segmentation

Strijbis, V. I. J., Gurney-Champion, O. J., Grama, D. I., Slotman, B. J., Verbakel, W. F. A. R.

medrxiv logopreprint · Jul 31 2025
Background: Convolutional neural networks (CNNs) have emerged to reduce clinical resources and standardize auto-contouring of organs-at-risk (OARs). Although CNNs perform adequately for most patients, understanding when a CNN might fail is critical for effective and safe clinical deployment. However, the limitations of CNNs are poorly understood because of their black-box nature. Explainable artificial intelligence (XAI) can expose the inner mechanisms of CNNs for classification. Here, we investigate the inner mechanisms of CNNs for segmentation and explore a novel, computational approach to a priori flag potentially insufficient parotid gland (PG) contours.

Methods: First, 3D UNets were trained in three PG segmentation situations using (1) synthetic cases; (2) 1925 clinical computed tomography (CT) scans with typical contours; and (3) more consistent contours curated through a previously validated auto-curation step. Then, we generated attribution maps for seven XAI methods and qualitatively assessed them for congruency between simulated and clinical contours, and for how much XAI agreed with expert reasoning. To objectify observations, we explored persistent homology intensity filtrations to capture essential topological characteristics of XAI attributions. Principal component (PC) eigenvalues of Euler characteristic profiles were correlated with spatial agreement (Dice-Sørensen similarity coefficient; DSC). Evaluation was done using sensitivity, specificity, and the area under the receiver operating characteristic (AUROC) curve on an external AAPM dataset, where, as proof of principle, we regard the lowest 15% DSC as insufficient.

Results: PatternNet attributions (PNet-A) focused on soft-tissue structures, whereas guided backpropagation (GBP) highlighted both soft-tissue and high-density structures (e.g., mandible bone), which was congruent with the synthetic situations. Both methods typically had higher/denser activations in better auto-contoured medial and anterior lobes. Curated models produced "cleaner" gradient class-activation mapping (GCAM) attributions. Quantitative analysis showed that PC λ1 of the guided GCAM (GGCAM) Euler characteristic (EC) profile had good predictive value (sensitivity > 0.85, specificity > 0.9) of DSC for AAPM cases, with AUROC = 0.66, 0.74, 0.94, and 0.83 for GBP, GCAM, GGCAM, and PNet-A, respectively. For λ1 < -1.8e3 of the GGCAM EC profile, 87% of cases were insufficient.

Conclusions: GBP and PNet-A agreed most with expert reasoning on directly (structure borders) and indirectly (proxies used for identifying structure borders) important features for PG segmentation. Additionally, this work investigated, as proof of principle, how topological data analysis could be used for quantitative XAI signal analysis to a priori mark potentially inadequate CNN segmentations, using only features from inside the predicted PG. This work used the PG as a well-understood segmentation paradigm and may extend to target volumes and other organs-at-risk.
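To make the topological step concrete, the sketch below computes an EC profile from a sublevel-set intensity filtration of an attribution map and summarises a cohort of profiles with principal components, in the spirit of the analysis described above; `attribution_maps` and all parameter choices are hypothetical stand-ins, not the authors' pipeline.

```python
import numpy as np
from skimage.measure import euler_number
from sklearn.decomposition import PCA

def ec_profile(attribution: np.ndarray, n_levels: int = 64) -> np.ndarray:
    """Euler characteristic of each sublevel set of an attribution map."""
    levels = np.linspace(attribution.min(), attribution.max(), n_levels)
    return np.array([euler_number(attribution <= t) for t in levels])

# One EC curve per case; the leading principal components of these curves
# are the kind of summary the study correlates with DSC.
profiles = np.stack([ec_profile(a) for a in attribution_maps])  # hypothetical list of arrays
pca = PCA(n_components=2).fit(profiles)
pc_scores = pca.transform(profiles)
```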

Advanced multi-label brain hemorrhage segmentation using an attention-based residual U-Net model.

Lin X, Zou E, Chen W, Chen X, Lin L

pubmed logopapers · Jul 31 2025
This study aimed to develop and assess an advanced attention-based residual U-Net (ResUNet) model for accurately segmenting different types of brain hemorrhage from CT images. The goal was to overcome the limitations of manual segmentation and current automated methods regarding precision and generalizability. A dataset of 1,347 patient CT scans was collected retrospectively, covering six types of hemorrhage: subarachnoid hemorrhage (SAH, 231 cases), subdural hematoma (SDH, 198 cases), epidural hematoma (EDH, 236 cases), cerebral contusion (CC, 230 cases), intraventricular hemorrhage (IVH, 188 cases), and intracerebral hemorrhage (ICH, 264 cases). The dataset was divided into 80% for training, using a 10-fold cross-validation approach, and 20% for testing. All CT scans were standardized to a common anatomical space, and intensity normalization was applied for uniformity. The ResUNet model included attention mechanisms to enhance focus on important features and residual connections to support stable learning and efficient gradient flow. Model performance was assessed using the Dice similarity coefficient (DSC), intersection over union (IoU), and directed Hausdorff distance (dHD). The ResUNet model showed excellent performance during both training and testing. On training data, the model achieved DSC scores of 95 ± 1.2 for SAH, 94 ± 1.4 for SDH, 93 ± 1.5 for EDH, 91 ± 1.4 for CC, 89 ± 1.6 for IVH, and 93 ± 2.4 for ICH. IoU values ranged from 88 to 93, with dHD between 2.1 and 2.7 mm. Testing results confirmed strong generalization, with DSC scores of 93 for SAH, 93 for SDH, 92 for EDH, 90 for CC, 88 for IVH, and 92 for ICH. IoU values were also high, indicating precise segmentation and minimal boundary errors. The ResUNet model outperformed standard U-Net variants, achieving higher multi-label segmentation accuracy. This makes it a valuable tool for clinical applications that require fast and reliable brain hemorrhage analysis. Future research could investigate semi-supervised techniques and 3D segmentation to further enhance clinical use.
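The two architectural ingredients named here, residual connections and attention, can be sketched in PyTorch roughly as follows; this is a generic illustration of the building blocks, not the authors' implementation, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Conv block with an identity shortcut for stable gradient flow."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch),
        )
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

class AttentionGate(nn.Module):
    """Additive attention over encoder skip features, gated by decoder features."""
    def __init__(self, skip_ch: int, gate_ch: int, inter_ch: int):
        super().__init__()
        self.w_x = nn.Conv2d(skip_ch, inter_ch, 1)
        self.w_g = nn.Conv2d(gate_ch, inter_ch, 1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, 1), nn.Sigmoid())

    def forward(self, x, g):
        # x: encoder skip features; g: upsampled decoder features (same H x W assumed)
        alpha = self.psi(torch.relu(self.w_x(x) + self.w_g(g)))
        return x * alpha  # suppress irrelevant skip activations
```

In a U-Net decoder, the gate weights encoder skip features by relevance before concatenation, while the residual shortcut keeps gradients well-conditioned through deep stacks.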

An interpretable CT-based machine learning model for predicting recurrence risk in stage II colorectal cancer.

Wu Z, Gong L, Luo J, Chen X, Yang F, Wen J, Hao Y, Wang Z, Gu R, Zhang Y, Liao H, Wen G

pubmed logopapers · Jul 31 2025
This study aimed to develop an interpretable 3-year disease-free survival (DFS) risk prediction tool to stratify patients with stage II colorectal cancer (CRC) by integrating CT images and clinicopathological factors. A total of 769 patients with pathologically confirmed stage II CRC and DFS follow-up information were recruited from three medical centers and divided into training (n = 442), test (n = 190), and validation (n = 137) cohorts. CT-based tumor radiomics features were extracted, selected, and used to calculate a Radscore. A combined model was developed using an artificial neural network (ANN) algorithm, integrating the Radscore with significant clinicoradiological factors to classify patients into high- and low-risk groups. Model performance was assessed using the area under the curve (AUC), and feature contributions were quantified using the Shapley additive explanations (SHAP) algorithm. Kaplan-Meier survival analysis revealed the prognostic stratification value of the risk groups. Fourteen radiomics features and five clinicoradiological factors were selected to construct the radiomics and clinicoradiological models, respectively. The combined model demonstrated the best performance, with AUCs of 0.811 and 0.846 in the test and validation cohorts, respectively. Kaplan-Meier curves confirmed effective patient stratification (p < 0.001) in both test and validation cohorts. A high Radscore, a rough intestinal outer edge, and advanced age were identified as key prognostic risk factors using SHAP. The combined model effectively stratified patients with stage II CRC into different prognostic risk groups, aiding clinical decision-making. Integrating CT images with clinicopathological information can facilitate the identification of patients with stage II CRC who are most likely to benefit from adjuvant chemotherapy, whose effectiveness in stage II disease remains debated; SHAP enhances the interpretability of the model's predictions.
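A hedged sketch of the combined-model idea, an ANN over the Radscore plus clinical factors with SHAP-based attribution, might look as follows in scikit-learn; `df` and every column name are hypothetical, and the abstract does not specify the network architecture.

```python
import shap
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature table: Radscore plus clinicoradiological factors
X = df[["radscore", "age", "rough_intestinal_outer_edge", "t_stage", "cea_level"]]
y = df["recurrence_within_3y"]

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
model.fit(X, y)

# Model-agnostic SHAP values rank each feature's contribution per patient
explainer = shap.Explainer(lambda a: model.predict_proba(a)[:, 1], X)
shap_values = explainer(X.iloc[:100])  # subsample keeps permutation SHAP fast
shap.plots.beeswarm(shap_values)
```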

Prognostication in patients with idiopathic pulmonary fibrosis using quantitative airway analysis from HRCT: a retrospective study.

Nan Y, Federico FN, Humphries S, Mackintosh JA, Grainge C, Jo HE, Goh N, Reynolds PN, Hopkins PMA, Navaratnam V, Moodley Y, Walters H, Ellis S, Keir G, Zappala C, Corte T, Glaspole I, Wells AU, Yang G, Walsh SL

pubmed logopapers · Jul 31 2025
Predicting shorter life expectancy is crucial for prioritizing antifibrotic therapy in fibrotic lung diseases (FLDs), where progression varies widely, from stability to rapid deterioration. This heterogeneity complicates treatment decisions, emphasizing the need for reliable baseline measures. This study leverages an artificial intelligence model to address heterogeneity in disease outcomes, focusing on mortality as the ultimate measure of disease trajectory. This retrospective study included 1744 anonymised patients who underwent high-resolution CT scanning. The AI model, SABRE (Smart Airway Biomarker Recognition Engine), was developed using data from patients with various lung diseases (n = 460, including lung cancer, pneumonia, emphysema, and fibrosis). Then, 1284 high-resolution CT scans with evidence of diffuse FLD from the Australian IPF Registry and OSIC were used for clinical analyses. Airway branches were categorized and quantified by anatomic structure and volume, followed by multivariable analysis to explore the associations between these categories and patients' progression and mortality, adjusting for disease severity or traditional measurements. Cox regression identified SABRE-based variables as independent predictors of mortality and progression, even after adjusting for disease severity (fibrosis extent, traction bronchiectasis extent, and ILD extent), traditional measures (FVC%, DLCO%, and CPI), and previously reported deep learning algorithms for fibrosis quantification and morphological analysis. Combining SABRE with DLCO significantly improved prognostic utility, yielding an AUC of 0.852 at the first year and a C-index of 0.752. SABRE-based variables capture prognostic signals beyond those provided by traditional measurements, disease severity scores, and established AI-based methods, reflecting the progressiveness and pathogenesis of the disease.
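The survival modelling described here follows a standard Cox proportional-hazards pattern; a minimal sketch with lifelines is below, where `df`, `df_test`, and all column names (including the SABRE-derived airway variables) are assumptions for illustration.

```python
from lifelines import CoxPHFitter
from lifelines.utils import concordance_index

cols = ["sabre_airway_vol_small", "sabre_airway_vol_large",  # hypothetical SABRE variables
        "fvc_pct", "dlco_pct", "fibrosis_extent",            # severity/traditional adjusters
        "time_months", "died"]
cph = CoxPHFitter()
cph.fit(df[cols], duration_col="time_months", event_col="died")
cph.print_summary()  # hazard ratios with 95% CIs per covariate

# Discrimination on held-out data: Harrell's C-index
# (higher partial hazard means worse prognosis, hence the minus sign)
cindex = concordance_index(df_test["time_months"],
                           -cph.predict_partial_hazard(df_test),
                           df_test["died"])
```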

External Validation of a Winning Artificial Intelligence Algorithm from the RSNA 2022 Cervical Spine Fracture Detection Challenge.

Harper JP, Lee GR, Pan I, Nguyen XV, Quails N, Prevedello LM

pubmed logopapers · Jul 31 2025
The Radiological Society of North America has actively promoted artificial intelligence (AI) challenges since 2017. Algorithms emerging from the recent RSNA 2022 Cervical Spine Fracture Detection Challenge demonstrated state-of-the-art performance on the competition's data set, surpassing results from prior publications. However, their performance in real-world clinical practice is not known. As an initial step toward assessing the feasibility of these models in clinical practice, we conducted a generalizability test using one of the leading algorithms of the competition. The deep learning algorithm was selected for its performance, portability, and ease of use, and was installed locally. One hundred examinations (50 consecutive cervical spine CT scans with at least 1 fracture present and 50 consecutive negative CT scans) from a level 1 trauma center not represented in the competition data set were processed at 6.4 seconds per examination. Ground truth was established based on the radiology report, with retrospective confirmation of positive fracture cases. Sensitivity, specificity, F1 score, and area under the curve were calculated. The external validation data set comprised older patients than the competition set (58.0 ± 22.0 versus 53.5 ± 21.8 years, respectively; P < .05). Sensitivity and specificity were 86% and 70% in the external validation group and 85% and 94% in the competition group, respectively. Fractures misclassified by the convolutional neural network frequently had features of advanced degenerative disease, subtle nondisplaced fractures not easily identified on the axial plane, and malalignment. The model performed with a similar sensitivity on the test and external data sets, suggesting that such a tool could potentially generalize as a triage tool in the emergency setting. Discordant factors such as age-associated comorbidities may affect the accuracy and specificity of AI models when used in certain populations. Further research should be encouraged to help elucidate the potential contributions and pitfalls of these algorithms in supporting clinical care.
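The reported statistics reduce to a handful of scikit-learn calls; a sketch assuming `y_true` (report-based ground truth) and `y_score` (the model's per-examination fracture probability), with the 0.5 threshold being an assumption:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

y_pred = (y_score >= 0.5).astype(int)  # operating point is an assumption
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"sens={sensitivity:.2f} spec={specificity:.2f} "
      f"F1={f1_score(y_true, y_pred):.2f} AUC={roc_auc_score(y_true, y_score):.2f}")
```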

Effect of spatial resolution on the diagnostic performance of machine-learning radiomics model in lung adenocarcinoma: comparisons between normal- and high-spatial-resolution imaging for predicting invasiveness.

Yanagawa M, Nagatani Y, Hata A, Sumikawa H, Moriya H, Iwano S, Tsuchiya N, Iwasawa T, Ohno Y, Tomiyama N

pubmed logopapers · Jul 31 2025
To construct two machine-learning radiomics (MLR) models for invasive adenocarcinoma (IVA) prediction using normal-spatial-resolution (NSR) and high-spatial-resolution (HSR) training cohorts, and to validate the models (model-NSR and model-HSR) in another test cohort while comparing independent radiologists' (R1, R2) performance with and without model-HSR. In this retrospective multicenter study, all CT images were reconstructed using NSR data (512 matrix, 0.5-mm thickness) and HSR data (2048 matrix, 0.25-mm thickness). Nodules were divided into training (n = 61 non-IVA, n = 165 IVA) and test sets (n = 36 non-IVA, n = 203 IVA). The two MLR models were developed with random forest, using 18 significant factors for the NSR model and 19 significant factors for the HSR model, selected from 172 radiomics features. The area under the receiver operating characteristic curve (AUC) was analyzed using DeLong's test in the test set. Accuracy (acc), sensitivity (sen), and specificity (spc) of R1 and R2 with and without model-HSR were compared using the McNemar test. 437 patients (70 ± 9 years, 203 men) had 465 nodules (n = 368 IVA). Model-HSR AUCs were significantly higher than model-NSR AUCs in the training (0.839 vs. 0.723) and test sets (0.863 vs. 0.718) (p < 0.05). R1's acc (87.2%) and sen (93.1%) with model-HSR were significantly higher than without it (77.0% and 79.3%) (p < 0.0001). R2's acc (83.7%) and sen (86.7%) with model-HSR were equal to or higher than without it (83.7% and 85.7%, respectively), but the differences were not significant (p > 0.50). Spc of R1 (52.8%) and R2 (66.7%) with model-HSR was lower than without it (63.9% and 72.2%, respectively), but not significantly (p > 0.21). The HSR-based MLR model significantly increased IVA diagnostic performance compared to NSR, supporting radiologists without compromising accuracy and sensitivity. However, this benefit came at the cost of reduced specificity, potentially increasing false positives, which may lead to unnecessary examinations or overtreatment in clinical settings.
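As a rough illustration of the modelling and reader-comparison statistics, the sketch below trains a random-forest classifier on selected radiomics features and applies the McNemar test to paired reader decisions; all variable names and the forest size are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from statsmodels.stats.contingency_tables import mcnemar

# Random forest over the selected radiomics features (feature table assumed)
rf = RandomForestClassifier(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)

# McNemar's test on paired reader correctness, with vs. without model support
with_ai = reader_with_ai == y_test       # boolean correctness vectors (assumed)
without_ai = reader_without_ai == y_test
table = np.array([
    [np.sum( with_ai &  without_ai), np.sum( with_ai & ~without_ai)],
    [np.sum(~with_ai &  without_ai), np.sum(~with_ai & ~without_ai)],
])
print(mcnemar(table, exact=True).pvalue)  # only discordant cells drive the test
```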

Validating an explainable radiomics approach in non-small cell lung cancer combining high energy physics with clinical and biological analyses.

Monteleone M, Camagni F, Percio S, Morelli L, Baroni G, Gennai S, Govoni P, Paganelli C

pubmed logopapers · Jul 30 2025
This study aims to establish a validation framework for an explainable radiomics-based model, specifically targeting classification of histopathological subtypes in non-small cell lung cancer (NSCLC) patients. We developed an explainable radiomics pipeline using open-access CT images from The Cancer Imaging Archive (TCIA). Our approach incorporates three key prongs: SHAP-based feature selection for explainability within the radiomics pipeline, a technical validation of the explainable technique using high energy physics (HEP) data, and a biological validation using RNA-sequencing data and clinical observations. Our radiomic model achieved an accuracy of 0.84 in the classification of the histological subtype. The technical validation, performed in the HEP domain over 150 numerically equivalent datasets maintaining consistent sample size and class imbalance, confirmed the reliability of the SHAP-based input features. Biological analysis found significant correlations between gene expression and CT-based radiomic features. In particular, the gene MUC21 achieved the highest correlation with the radiomic feature describing the 10th percentile of voxel intensities (r = 0.46, p < 0.05). This study presents a validation framework for explainable CT-based radiomics in lung cancer, combining HEP-driven technical validation with biological validation to enhance the interpretability, reliability, and clinical relevance of XAI models.
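The SHAP-based selection prong could be sketched as below; the abstract does not name the underlying classifier, so a gradient-boosted model is used as a stand-in, and the data frames and feature/gene names are illustrative.

```python
import numpy as np
import shap
from scipy.stats import pearsonr
from sklearn.ensemble import GradientBoostingClassifier

# Stand-in classifier over radiomics features (X_train assumed to be a DataFrame)
clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Rank features by mean |SHAP| and keep the top k (k is an assumption)
sv = shap.TreeExplainer(clf).shap_values(X_train)
ranking = np.argsort(np.abs(sv).mean(axis=0))[::-1]
selected = X_train.columns[ranking[:10]]

# Biological validation: correlate a radiomic feature with gene expression
r, p = pearsonr(radiomics["firstorder_10Percentile"], rnaseq["MUC21"])  # names illustrative
```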

Wall Shear Stress Estimation in Abdominal Aortic Aneurysms: Towards Generalisable Neural Surrogate Models

Patryk Rygiel, Julian Suk, Christoph Brune, Kak Khee Yeung, Jelmer M. Wolterink

arxiv logopreprint · Jul 30 2025
Abdominal aortic aneurysms (AAAs) are pathologic dilatations of the abdominal aorta posing a high fatality risk upon rupture. Studying AAA progression and rupture risk often involves in-silico blood flow modelling with computational fluid dynamics (CFD) and extraction of hemodynamic factors like time-averaged wall shear stress (TAWSS) or oscillatory shear index (OSI). However, CFD simulations are known to be computationally demanding. Hence, in recent years, geometric deep learning methods, operating directly on 3D shapes, have been proposed as compelling surrogates, estimating hemodynamic parameters in just a few seconds. In this work, we propose a geometric deep learning approach to estimating hemodynamics in AAA patients, and study its generalisability to common factors of real-world variation. We propose an E(3)-equivariant deep learning model utilising novel robust geometrical descriptors and projective geometric algebra. Our model is trained to estimate transient WSS using a dataset of CT scans of 100 AAA patients, from which lumen geometries are extracted and reference CFD simulations with varying boundary conditions are obtained. Results show that the model generalizes well within the distribution, as well as to the external test set. Moreover, the model can accurately estimate hemodynamics across geometry remodelling and changes in boundary conditions. Furthermore, we find that a trained model can be applied to different artery tree topologies, where new and unseen branches are added during inference. Finally, we find that the model is to a large extent agnostic to mesh resolution. These results show the accuracy and generalisation of the proposed model, and highlight its potential to contribute to hemodynamic parameter estimation in clinical practice.
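As a simplified stand-in for the E(3)-equivariant architecture (the paper's geometric-algebra descriptors are beyond a short sketch), the PyTorch layer below passes messages along mesh edges using only rotation- and translation-invariant edge lengths, which suffices to make scalar predictions such as TAWSS invariant to rigid motions:

```python
import torch
import torch.nn as nn

class InvariantMessagePassing(nn.Module):
    """One graph layer whose messages depend on node positions only through
    E(3)-invariant edge lengths; a simplified stand-in for the paper's
    equivariant model with projective geometric algebra descriptors."""
    def __init__(self, h_dim: int):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * h_dim + 1, h_dim), nn.SiLU(),
                                 nn.Linear(h_dim, h_dim))

    def forward(self, h, pos, edge_index):
        src, dst = edge_index                                  # (2, E) mesh edges
        d = (pos[src] - pos[dst]).norm(dim=-1, keepdim=True)   # invariant edge length
        m = self.msg(torch.cat([h[src], h[dst], d], dim=-1))   # per-edge message
        agg = torch.zeros_like(h).index_add_(0, dst, m)        # sum messages per node
        return h + agg                                         # residual node update

# A per-node regression head mapping the final h to a scalar would then
# produce the WSS-style estimate at each surface vertex.
```

Because messages see positions only through pairwise distances, rigidly moving the mesh leaves the per-node outputs unchanged, one of the generalisation properties the abstract highlights.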
