Latest Papers on Radiology AI. Tags: Classification

Multi-modal machine learning classifier for idiopathic pulmonary fibrosis predicts mortality in interstitial lung diseases.

Callahan SJ, Scholand MB, Kalra A, Muelly M, Reicher JJ

•papers•Aug 6 2025

Interstitial lung disease (ILD) prognostication incorporates clinical history, pulmonary function testing (PFTs), and chest CT pattern classifications. The machine learning classifier, Fibresolve, includes a model to help detect CT patterns associated with idiopathic pulmonary fibrosis (IPF). We developed and tested new Fibresolve software to predict outcomes in patients with ILD. Fibresolve uses a vision transformer (ViT) algorithm to analyze CT imaging that additionally embeds PFTs, age, and sex to produce an overall risk score. The model was trained to optimize the risk score in a dataset of 602 subjects designed to maximize predictive performance via Cox proportional hazards. Validation was completed with the first hazard ratio assessment dataset, and the model was then tested in a second test dataset. In the validation set, 61% of 220 subjects died during the study period, whereas 40% of the 407 subjects in the second dataset died. The validation dataset's mortality hazard ratio (HR) was 3.66 (95% CI: 2.09-6.42) and 4.66 (CI: 2.47-8.77) for the moderate- and high-risk groups, respectively. In the second dataset, Fibresolve was a predictor of mortality at initial visit, with HRs of 2.79 (1.73-4.49) and 5.82 (3.53-9.60) in the moderate- and high-risk groups, respectively. Similar predictive performance was seen at follow-up visits, as well as with changes in the Fibresolve scores over sequential visits. Fibresolve predicts mortality by automatically combining CT, PFTs, age, and sex in a ViT model. The new software algorithm affords accurate prognostication and demonstrates the ability to detect clinical changes over time.

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA
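The hazard ratios in the Fibresolve abstract above are reported with confidence intervals. As a rough illustration of how such numbers relate, the sketch below assumes the intervals are Wald intervals on the log-hazard scale (HR = exp(beta), CI = exp(beta ± 1.96·SE)); the abstract does not state how the intervals were computed, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta, se, z=Z_95):
    """Return (HR, lower, upper) from a Cox log-hazard coefficient and its SE.

    Assumes a Wald interval on the log scale (an assumption, not the paper's
    stated method).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back out the standard error of log(HR) from a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported for the moderate-risk group in the validation set: HR 3.66 (2.09-6.42).
beta = math.log(3.66)
se = se_from_ci(2.09, 6.42)
hr, lo, hi = hazard_ratio_ci(beta, se)
print(f"HR={hr:.2f}, CI {lo:.2f}-{hi:.2f}, SE(log HR)={se:.3f}")
```

If the reconstructed interval closely matches the reported one, the reported interval is roughly symmetric on the log scale, which is what a Wald interval from a Cox model would produce.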

Application of prediction model based on CT radiomics in prognosis of patients with non-small cell lung cancer.

Peng Z, Wang Y, Qi Y, Hu H, Fu Y, Li J, Li W, Li Z, Guo W, Shen C, Jiang J, Yang B

•papers•Aug 6 2025

To establish and validate the utility of computed tomography (CT) radiomics for the prognosis of patients with non-small cell lung cancer (NSCLC). Overall, 215 patients with a pathologic diagnosis of NSCLC were included, chest CT images and clinical data were collected before treatment, and follow-up was conducted to assess brain metastasis and survival. Radiomics characteristics were extracted from the chest CT lung window images of each patient, key characteristics were screened, the radiomics score (Radscore) was calculated, and radiomics, clinical, and combined models were constructed using clinically independent predictive factors. A nomogram was constructed based on the final joint model to visualize prediction results. Predictive efficacy was evaluated using the concordance index (C-index), and survival (Kaplan-Meier) and calibration curves were drawn to further evaluate predictive efficacy. The training set included 151 patients (43 with brain metastasis and 108 without), and the validation set included 64 patients (18 with brain metastasis and 46 without). Multivariate analysis revealed that lymph node metastasis, lymphocyte percentage, and neuron-specific enolase (NSE) were independent predictors of brain metastasis in patients with NSCLC. The areas under the curve (AUCs) of the three models were 0.733, 0.836, and 0.849, respectively, in the training set and 0.739, 0.779, and 0.816, respectively, in the validation set. Multivariate Cox regression analysis revealed that the number of brain metastases, distant metastases elsewhere, and C-reactive protein levels were independent predictors of postoperative survival in patients with brain metastases (P < 0.05). The calibration curve showed that the predicted values of the prognostic prediction model agreed well with the actual values.
The model based on CT radiomics characteristics can effectively predict NSCLC brain metastasis and its prognosis and provide guidance for individualized treatment of NSCLC patients.

CT Classification Chest Retrospective Clinical In Silico Academic Lab
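The NSCLC abstract above evaluates predictive efficacy with the concordance index. A minimal sketch of Harrell's C-index for right-censored data follows; the function name and the toy inputs are illustrative assumptions, and the paper's own implementation is not described in the abstract.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.4, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk.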

Benchmarking Uncertainty and its Disentanglement in multi-label Chest X-Ray Classification

Simon Baur, Wojciech Samek, Jackie Ma

•preprint•Aug 6 2025

Reliable uncertainty quantification is crucial for trustworthy decision-making and the deployment of AI models in medical imaging. While prior work has explored the ability of neural networks to quantify predictive, epistemic, and aleatoric uncertainties using an information-theoretical approach in synthetic or well-defined data settings like natural image classification, its applicability to real-life medical diagnosis tasks remains underexplored. In this study, we provide an extensive uncertainty quantification benchmark for multi-label chest X-ray classification using the MIMIC-CXR-JPG dataset. We evaluate 13 uncertainty quantification methods for convolutional (ResNet) and transformer-based (Vision Transformer) architectures across a wide range of tasks. Additionally, we extend Evidential Deep Learning, HetClass NNs, and Deep Deterministic Uncertainty to the multi-label setting. Our analysis provides insights into uncertainty estimation effectiveness and the ability to disentangle epistemic and aleatoric uncertainties, revealing method- and architecture-specific strengths and limitations.

X-Ray Classification Chest Methodology In Silico Benchmark SOTA
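The chest X-ray benchmark above refers to an information-theoretic split of predictive uncertainty into aleatoric and epistemic parts. One common form of that split, shown here for a single binary label, is: total = entropy of the mean prediction, aleatoric = mean entropy of the individual predictions, epistemic = their difference (the mutual information). This is a sketch under the assumption that several sampled predictions (for example ensemble members) are available per label; the 13 benchmarked methods may estimate these quantities differently.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli(p) label, clamped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one label.

    member_probs: probabilities for the same label from several ensemble
    members or stochastic forward passes (hypothetical inputs).
    Returns (total, aleatoric, epistemic).
    """
    mean_p = sum(member_probs) / len(member_probs)
    total = binary_entropy(mean_p)
    aleatoric = sum(binary_entropy(p) for p in member_probs) / len(member_probs)
    epistemic = total - aleatoric  # mutual information between label and model
    return total, aleatoric, epistemic


# Members that disagree produce epistemic uncertainty; members that agree on
# an uncertain prediction produce only aleatoric uncertainty.
print(decompose_uncertainty([0.1, 0.9]))
print(decompose_uncertainty([0.5, 0.5, 0.5]))
```

In a multi-label setting the same computation would be applied to each finding independently, since each label is its own Bernoulli variable.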

Quantum Federated Learning in Healthcare: The Shift from Development to Deployment and from Models to Data.

Bhatia AS, Kais S, Alam MA

•papers•Aug 6 2025

Healthcare organizations have a high volume of sensitive data, and traditional technologies have limited storage capacity and computational resources. Sharing healthcare data for machine learning is made more arduous by strict regulations related to patient privacy. In recent years, federated learning has offered a solution to accelerate distributed machine learning while addressing concerns related to data privacy and governance. Currently, the blend of quantum computing and machine learning is attracting significant attention from academic institutions and research communities. The ultimate objective of this work is to develop a federated quantum machine learning framework (FQML) to tackle the optimization, security, and privacy challenges in the healthcare industry for medical imaging tasks. In this work, we propose federated quantum convolutional neural networks (QCNNs) with distributed training across edge devices. To demonstrate the feasibility of the proposed FQML framework, we performed extensive experiments on two benchmark medical datasets (Pneumonia MNIST and CT kidney disease analysis), which are partitioned in a non-independent and non-identically distributed manner among the healthcare institutions/clients. The proposed framework is validated and assessed via large-scale simulations. Based on our results, the quantum simulation experiments achieve performance levels on par with well-known classical CNN models, 86.3% accuracy on the pneumonia dataset and 92.8% on the CT-kidney dataset, while requiring fewer model parameters and consuming less data. Moreover, a client selection mechanism is proposed to reduce the computation overhead at each communication round, which effectively improves the convergence rate.

Mixed Modality Classification Chest Methodology In Silico Academic Lab Benchmark SOTA
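The federated learning abstract above describes distributed training across clients with a server combining their updates. The sketch below shows classical federated averaging (FedAvg), where client parameters are averaged with weights proportional to local dataset size. This is an assumption for illustration: the abstract does not confirm which aggregation rule the FQML framework uses, and the function name and flat parameter lists are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of client parameter vectors (classical FedAvg).

    client_params: list of equal-length parameter lists, one per client
    client_sizes : number of local training samples held by each client
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[k] * size for params, size in zip(client_params, client_sizes)) / total
        for k in range(n_params)
    ]


# Two hypothetical clients; the second holds three times as much data,
# so its parameters dominate the aggregate.
print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

A client selection step of the kind the abstract mentions would simply restrict `client_params` and `client_sizes` to the chosen subset before each round's aggregation.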

The development of a multimodal prediction model based on CT and MRI for the prognosis of pancreatic cancer.

Dou Z, Lin J, Lu C, Ma X, Zhang R, Zhu J, Qin S, Xu C, Li J

•papers•Aug 6 2025

To develop and validate a hybrid radiomics model to predict the overall survival in pancreatic cancer patients and identify risk factors that affect patient prognosis. We conducted a retrospective analysis of 272 pancreatic cancer patients diagnosed at the First Affiliated Hospital of Soochow University from January 2013 to December 2023, and divided them into a training set and a test set at a ratio of 7:3. Pre-treatment contrast-enhanced computed tomography (CT), magnetic resonance imaging (MRI) images, and clinical features were collected. Dimensionality reduction was performed on the radiomics features using principal component analysis (PCA), and important features with non-zero coefficients were selected using the least absolute shrinkage and selection operator (LASSO) with 10-fold cross-validation. In the training set, we built clinical prediction models using both random survival forests (RSF) and traditional Cox regression analysis. These models included a radiomics model based on contrast-enhanced CT, a radiomics model based on MRI, a clinical model, 3 bimodal models combining two types of features, and a multimodal model combining radiomics features with clinical features. Model performance evaluation in the test set was based on two dimensions: discrimination and calibration. In addition, risk stratification was performed in the test set based on predicted risk scores to evaluate the model's prognostic utility. The RSF-based hybrid model performed best with a C-index of 0.807 and a Brier score of 0.101, outperforming the Cox hybrid model (C-index of 0.726 and a Brier score of 0.145) and other unimodal and bimodal models. The SurvSHAP(t) plot highlighted CA125 as the most important variable. In the test set, patients were stratified into high- and low-risk groups based on the predicted risk scores, and Kaplan-Meier analysis demonstrated a significant survival difference between the two groups (p < 0.0001).
A multimodal model combining clinical tabular data with radiomics from contrast-enhanced CT and MRI was developed using RSF and showed strength in predicting prognosis in pancreatic cancer patients.

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab

AI-derived CT biomarker score for robust COVID-19 mortality prediction across multiple waves and regions using machine learning.

De Smet K, De Smet D, De Jaeger P, Dewitte J, Martens GA, Buls N, De Mey J

•papers•Aug 6 2025

This study aimed to develop a simple, interpretable model using routinely available data for predicting COVID-19 mortality at admission, addressing limitations of complex models, and to provide a statistically robust framework for controlled clinical use, managing model uncertainty for responsible healthcare application. Data from Belgium's first COVID-19 wave (UZ Brussel, n = 252) were used for model development. External validation utilized data from unvaccinated patients during the late second and early third waves (AZ Delta, n = 175). Various machine learning methods were trained and compared for diagnostic performance after data preprocessing and feature selection. The final model, the M3-score, incorporated three features: age, white blood cell (WBC) count, and AI-derived total lung involvement (TOTAL_AI) quantified from CT scans using Icolung software. The M3-score demonstrated strong classification performance in the training cohort (AUC 0.903) and clinically useful performance in the external validation dataset (AUC 0.826), indicating generalizability potential. To enhance clinical utility and interpretability, predicted probabilities were categorized into actionable likelihood ratio (LR) intervals: highly unlikely (LR 0.0), unlikely (LR 0.13), gray zone (LR 0.85), more likely (LR 2.14), and likely (LR 8.19) based on the training cohort. External validation suggested temporal and geographical robustness, though some variability in AUC and LR performance was observed, as anticipated in real-world settings. The parsimonious M3-score, integrating AI-based CT quantification with clinical and laboratory data, offers an interpretable tool for predicting in-hospital COVID-19 mortality, showing robust training performance.
Observed performance variations in external validation underscore the need for careful interpretation and further extensive validation across international cohorts to confirm wider applicability and robustness before widespread clinical adoption.

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA
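The M3-score abstract above bins predicted probabilities into likelihood ratio (LR) intervals. As a sketch of how an interval LR is defined and then used, the code below computes LR = P(score in bin | died) / P(score in bin | survived) and applies it to a pre-test probability via odds. The function names, the counts, and the 20% pre-test probability are hypothetical; only the LR value 8.19 comes from the abstract.

```python
def interval_likelihood_ratio(deaths_in_bin, total_deaths, survivors_in_bin, total_survivors):
    """LR for one score interval: P(bin | died) / P(bin | survived).

    Raises ZeroDivisionError if no survivors fall in the bin.
    """
    return (deaths_in_bin / total_deaths) / (survivors_in_bin / total_survivors)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical 20% pre-test mortality combined with the reported "likely" LR.
print(round(post_test_probability(0.20, 8.19), 3))
```

An LR of 0.0, as reported for the "highly unlikely" interval, drives the post-test probability to zero in this formula, which is one reason such intervals are usually interpreted together with their sample sizes.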

Predicting language outcome after stroke using machine learning: in search of the big data benefit.

Saranti M, Neville D, White A, Rotshtein P, Hope TMH, Price CJ, Bowman H

•papers•Aug 6 2025

Accurate prediction of post-stroke language outcomes using machine learning offers the potential to enhance clinical treatment and rehabilitation for aphasic patients. This study of 758 English-speaking stroke patients from the PLORAS project explores the impact of sample size on the performance of logistic regression and a deep learning (ResNet-18) model in predicting language outcomes from neuroimaging and impairment-relevant tabular data. We assessed the performance of both models on two key language tasks from the Comprehensive Aphasia Test: Spoken Picture Description and Naming, using a learning curve approach. Contrary to expectations, the simpler logistic regression model performed comparably to or better than the deep learning model (with overlapping confidence intervals), with both models showing an accuracy plateau around 80% for sample sizes larger than 300 patients. Principal Component Analysis revealed that the dimensionality of the neuroimaging data could be reduced to as few as 20 (or even 2) dominant components without significant loss in accuracy, suggesting that classification may be driven by simple patterns such as lesion size. The study highlights both the potential limitations of current dataset size in achieving further accuracy gains and the need for larger datasets to capture more complex patterns, as some of our results indicate that we might not have reached an absolute classification performance ceiling. Overall, these findings provide insights into the practical use of machine learning for predicting aphasia outcomes and the potential benefits of much larger datasets in enhancing model performance.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA Reproducibility

Development and validation of the multidimensional machine learning model for preoperative risk stratification in papillary thyroid carcinoma: a multicenter, retrospective cohort study.

Feng JW, Zhang L, Yang YX, Qin RJ, Liu SQ, Qin AC, Jiang Y

•papers•Aug 6 2025

This study aims to develop and validate a multi-modal machine learning model for preoperative risk stratification in papillary thyroid carcinoma (PTC), addressing limitations of current systems that rely on postoperative pathological features. We analyzed 974 PTC patients from three medical centers in China using a multi-modal approach integrating: (1) clinical indicators, (2) immunological indices, (3) ultrasound radiomics features, and (4) CT radiomics features. Our methodology employed a gradient boosting machine for feature selection and a random forest for classification, with model interpretability provided through SHapley Additive exPlanations (SHAP) analysis. The model was validated on one internal cohort (n = 225) and two external cohorts (n = 51, n = 174). The final 15-feature model achieved AUCs of 0.91, 0.84, and 0.77 across validation cohorts, improving to 0.96, 0.95, and 0.89 after cohort-specific refitting. SHAP analysis revealed CT texture features, ultrasound morphological features, and immune-inflammatory markers as key predictors, with consistent patterns across validation sites despite center-specific variations. Subgroup analysis showed superior performance in tumors > 1 cm and patients without extrathyroidal extension. Our multi-modal machine learning approach provides accurate preoperative risk stratification for PTC with robust cross-center applicability. This computational framework for integrating heterogeneous imaging and clinical data demonstrates the potential of multi-modal joint learning in healthcare imaging to transform clinical decision-making by enabling personalized treatment planning.

Mixed Modality Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Altered gray matter morphometry in psychogenic erectile dysfunction patients: A Surface-based morphometry study.

Tian Z, Ma Z, Dou B, Huang X, Li G, Chang D, Yin T, Zhang P

•papers•Aug 6 2025

Psychogenic erectile dysfunction (pED) is a prevalent male sexual dysfunction lacking organic etiology. Previous studies have sought to reveal the brain pathological mechanisms of pED. However, the cortical morphological characteristics of pED patients remain largely unknown. This study enrolled 50 pED patients and 50 healthy controls (HC). Surface-based morphometry (SBM) analysis was conducted, and between-group comparisons of four cortical morphological parameters (cortical thickness, sulcus depth, gyrification index, and fractal dimension) were performed to investigate the cortical morphological alterations in pED patients, followed by correlation analysis between clinical data and SBM metrics. Furthermore, a classifier was developed based on a support vector classification algorithm and cortical morphological features to explore the feasibility of discriminating between pED patients and HC at an individual level. The results demonstrated that pED patients manifested consistent alterations in cortical morphology across metrics in the orbitofrontal cortex, anterior and middle cingulate cortex, dorsolateral prefrontal cortex, and precentral gyrus, which were significantly correlated with the clinical symptoms in pED patients. Additionally, the classifier built on 11 cortical morphological features achieved an accuracy of 82% in discriminating pED patients from HC. The current study provides new evidence of cortical morphological aberrations in pED patients, which deepens our understanding of the central pathology pattern of pED and is expected to facilitate the objective diagnosis of pED and the development of neuromodulation techniques targeting the alterations above.

MRI Classification Neurological Retrospective Clinical In Silico

Pyramidal attention-based T network for brain tumor classification: a comprehensive analysis of transfer learning approaches for clinically reliable and reliable AI hybrid approaches.

Banerjee T, Chhabra P, Kumar M, Kumar A, Abhishek K, Shah MA

•papers•Aug 6 2025

Brain tumors are a significant challenge to human health as they impair the proper functioning of the brain and the general quality of life, thus requiring clinical intervention through early and accurate diagnosis. Although current state-of-the-art deep learning methods have achieved remarkable progress, there is still a gap in the representation learning of tumor-specific spatial characteristics and the robustness of the classification model on heterogeneous data. In this paper, we introduce a novel Pyramidal Attention-Based bi-partitioned T Network (PABT-Net) that combines a hierarchical pyramidal attention mechanism, T-block based bi-partitioned feature extraction, and a self-convolutional dilated neural classifier as the final stage. Such an architecture increases spatial discriminability and reduces false predictions by adaptively focusing on informative areas in brain MRI images. The model was thoroughly tested on three benchmark datasets, Figshare Brain Tumor Dataset, Sartaj Brain MRI Dataset, and Br35H Brain Tumor Dataset, containing 7023 images labeled in four classes: glioma, meningioma, no tumor, and pituitary tumor. It attained an overall classification accuracy of 99.12%, a mean cross-validation accuracy of 98.77%, a Jaccard similarity index of 0.986, and a Cohen's Kappa value of 0.987, indicating superb generalization and clinical stability. The model's effectiveness is also confirmed by tumor-wise classification accuracies: 96.75%, 98.46%, and 99.57% in glioma, meningioma, and pituitary tumors, respectively. Comparative experiments with the state-of-the-art models, including VGG19, MobileNet, and NASNet, were carried out, and ablation studies proved the effectiveness of NASNet incorporation. To capture more prominent spatial-temporal patterns, we investigated hybrid networks, including NASNet with ANN, CNN, LSTM, and CNN-LSTM variants. The framework implements a strict nine-fold cross-validation procedure.
It integrates a broad range of measures in its evaluation, including precision, recall, specificity, F1-score, AUC, confusion matrices, and the ROC analysis, consistent across distributions. In general, the PABT-Net model has high potential to be a clinically deployable, interpretable, state-of-the-art automated brain tumor classification model.

MRI Classification Neurological Methodology In Silico Academic Lab Benchmark SOTA
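The PABT-Net abstract above reports a Cohen's Kappa value alongside accuracy. As a reminder of what kappa measures, the sketch below computes it from a confusion matrix as (observed agreement - chance agreement) / (1 - chance agreement). The function name and the toy matrix are illustrative assumptions; the paper's evaluation code is not described in the abstract.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: expected diagonal mass if labels and predictions
    # were independent with the same marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Toy two-class example (hypothetical counts): 70% accuracy, 50% chance agreement.
print(cohens_kappa([[20, 5], [10, 15]]))
```

Because kappa discounts agreement expected by chance, it is more informative than raw accuracy when the class distribution is imbalanced.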
