Latest Papers on Radiology AI. Tags: Classification

Results of the 9th Scientific Workshop of the European Crohn's and Colitis Organisation (ECCO): Artificial Intelligence in Endoscopy, Radiology and Histology in IBD Diagnostics.

Mookhoek A, Sinonque P, Allocca M, Carter D, Ensari A, Iacucci M, Kopylov U, Verstockt B, Baumgart DC, Noor NM, El-Hussuna A, Sahnan K, Marigorta UM, Noviello D, Bossuyt P, Pellino G, Soriano A, de Laffolie J, Daperno M, Raine T, Cleynen I, Sebastian S

•papers•Aug 12 2025

In this review, a comprehensive overview of the current state of artificial intelligence (AI) research in Inflammatory Bowel Disease (IBD) diagnostics in the domains of endoscopy, radiology and histology is presented. Moreover, key considerations for development of AI algorithms in medical image analysis are discussed. AI presents a potential breakthrough in real-time, objective and rapid endoscopic assessment, with implications for predicting disease progression. It is anticipated that, by harmonising multimodal data, AI will transform patient care through early diagnosis, accurate patient profiling and therapeutic response prediction. The ability of AI in cross-sectional medical imaging to improve diagnostic accuracy, automate and enable objective assessment of disease activity and predict clinical outcomes highlights its transformative potential. AI models have consistently outperformed traditional methods of image interpretation, particularly in complex areas such as differentiating IBD subtypes, identifying disease progression and complications. The use of AI in histology is a particularly dynamic research field. Implementation of AI algorithms in clinical practice is still lagging, a major hurdle being the lack of a digital workflow in many pathology institutes. Adoption is likely to start with implementation of automatic disease activity scoring. Beyond matching pathologist performance, algorithms may teach us more about IBD pathophysiology. While AI is set to substantially advance IBD diagnostics, various challenges such as heterogeneous datasets, retrospective designs and assessment of different endpoints must be addressed. Implementation of novel standards of reporting may drive an increase in research quality and overcome these obstacles.

CT Classification Abdominal Review In Silico Consortium Ethics

[Development of a machine learning-based diagnostic model for T-shaped uterus using transvaginal 3D ultrasound quantitative parameters].

Li SJ, Wang Y, Huang R, Yang LM, Lyu XD, Huang XW, Peng XB, Song DM, Ma N, Xiao Y, Zhou QY, Guo Y, Liang N, Liu S, Gao K, Yan YN, Xia EL

•papers•Aug 12 2025

Objective: To develop a machine learning diagnostic model for T-shaped uterus based on quantitative parameters from 3D transvaginal ultrasound. Methods: A retrospective cross-sectional study was conducted, recruiting 304 patients who visited the hysteroscopy centre of Fuxing Hospital, Beijing, China, between July 2021 and June 2024 for reasons such as "infertility or recurrent pregnancy loss" and other adverse obstetric histories. Twelve experts, including seven clinicians and five sonographers, from Fuxing Hospital and Beijing Obstetrics and Gynecology Hospital of Capital Medical University, Peking University People's Hospital, and Beijing Hospital, independently and anonymously assessed the diagnosis of T-shaped uterus using a modified Delphi method. Based on the consensus results, 56 cases were classified into the T-shaped uterus group and 248 cases into the non-T-shaped uterus group. A total of 7 clinical features and 14 sonographic features were initially included. Features demonstrating significant diagnostic impact were selected using 10-fold cross-validated LASSO (Least Absolute Shrinkage and Selection Operator) regression. Four machine learning algorithms [logistic regression (LR), decision tree (DT), random forest (RF), and support vector machine (SVM)] were subsequently implemented to develop T-shaped uterus diagnostic models. Using the Python random module, the patient dataset was randomly divided into five subsets, each maintaining the original class distribution (T-shaped uterus: non-T-shaped uterus ≈ 1∶4) and a balanced number of samples between the two categories. Five-fold cross-validation was performed, with four subsets used for training and one for validation in each round, to enhance the reliability of model evaluation. Model performance was rigorously assessed using established metrics: area under the curve (AUC) of receiver operator characteristic (ROC) curve, sensitivity, specificity, precision, and F1-score. 
In the RF model, feature importance was assessed by the mean decrease in Gini impurity attributed to each variable. Results: The 304 patients had a mean age of (35±4) years; the mean age was (35±5) years in the T-shaped uterus group and (34±4) years in the non-T-shaped uterus group. Eight features with non-zero coefficients were selected by LASSO regression: average lateral wall indentation width, average lateral wall indentation angle, upper cavity depth, endometrial thickness, uterine cavity area, cavity width at the level of lateral wall indentation, angle formed by the bilateral lateral walls, and average cornual angle (coefficients: 0.125, -0.064, -0.037, -0.030, -0.026, -0.025, -0.025, and -0.024, respectively). The RF model showed the best diagnostic performance: in the training set, AUC was 0.986 (95% CI: 0.980-0.992), sensitivity 0.978, specificity 0.946, precision 0.802, and F1-score 0.881; in the testing set, AUC was 0.948 (95% CI: 0.911-0.985), sensitivity 0.873, specificity 0.919, precision 0.716, and F1-score 0.784. RF feature importance analysis showed that average lateral wall indentation width, upper cavity depth, and average lateral wall indentation angle were the top three features (over 65% in total), playing a decisive role in model prediction. Conclusion: The machine learning models developed in this study, particularly the RF model, are promising for the diagnosis of T-shaped uterus, offering new perspectives and technical support for clinical practice.
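
The stratified five-fold split described in the Methods can be sketched as below. This is a minimal illustration and not the authors' code: only the class counts (56 T-shaped, 248 non-T-shaped) and the use of Python's `random` module come from the abstract, while the function name `stratified_folds`, the seed, and the round-robin dealing are assumptions made for the example.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that each keep the class ratio.

    Hypothetical helper: shuffles the indices of each class separately,
    then deals them round-robin into the k folds.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


# Class counts taken from the abstract: 56 positive, 248 negative.
labels = [1] * 56 + [0] * 248
folds = stratified_folds(labels)

# (positives in fold, fold size) for each of the five folds.
fold_counts = [(sum(labels[i] for i in fold), len(fold)) for fold in folds]
print(fold_counts)
```

In each cross-validation round, one fold would serve as the validation set and the other four as the training set. Every fold here holds 11 or 12 positive cases, so the roughly 1:4 ratio is preserved.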

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab

Are [18F]FDG PET/CT imaging and cell blood count-derived biomarkers robust non-invasive surrogates for tumor-infiltrating lymphocytes in early-stage breast cancer?

Seban RD, Rebaud L, Djerroudi L, Vincent-Salomon A, Bidard FC, Champion L, Buvat I

•papers•Aug 12 2025

Tumor-infiltrating lymphocytes (TILs) are key immune biomarkers associated with prognosis and treatment response in early-stage breast cancer (BC), particularly in the triple-negative subtype. This study aimed to evaluate whether [18F]FDG PET/CT imaging and routine cell blood count (CBC)-derived biomarkers can serve as non-invasive surrogates for TILs, using machine-learning models. We retrospectively analyzed 358 patients with biopsy-proven early-stage invasive BC who underwent pre-treatment [18F]FDG PET/CT imaging. PET-derived biomarkers were extracted from the primary tumor, lymph nodes, and lymphoid organs (spleen and bone marrow). CBC-derived biomarkers included neutrophil-to-lymphocyte ratio (NLR) and platelet-to-lymphocyte ratio (PLR). TILs were assessed histologically and categorized as low (0-10%), intermediate (11-59%), or high (≥ 60%). Correlations were assessed using Spearman's rank coefficient, and classification and regression models were built using several machine-learning algorithms. Tumor SUVmax and tumor SUVmean showed the highest correlation with TIL levels (ρ = 0.29 and 0.30 respectively, p < 0.001 for both), but overall associations between TILs and PET or CBC-derived biomarkers were weak. No CBC-derived biomarker showed significant correlation or discriminative performance. Machine-learning models failed to predict TIL levels with satisfactory accuracy (maximum balanced accuracy = 0.66). Lymphoid organ metrics (SLR, BLR) and CBC-derived parameters did not significantly enhance predictive value. In this study, neither [18F]FDG PET/CT nor routine CBC-derived biomarkers reliably predict TILs levels in early-stage BC. This observation was made in presence of potential scanner-related variability and for a restricted set of usual PET metrics. Future models should incorporate more targeted imaging approaches, such as immunoPET, to non-invasively assess immune infiltration with higher specificity and improve personalized treatment strategies.
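
To make the headline metric concrete, here is a small sketch of the three TIL categories and of balanced accuracy, which is the mean of the per-class recalls. The function names and the toy labels are hypothetical, and this is not the study's pipeline; only the category cut-offs (0-10%, 11-59%, ≥ 60%) come from the abstract.

```python
def til_category(til_percent):
    """Map a TIL percentage to the three categories used in the abstract."""
    if til_percent <= 10:
        return "low"
    if til_percent < 60:
        return "intermediate"
    return "high"


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class weighs the same."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Toy labels, invented for illustration only.
y_true = ["low"] * 4 + ["intermediate"] * 2 + ["high"] * 2
y_pred = ["low", "low", "low", "intermediate",
          "intermediate", "low", "high", "high"]
score = balanced_accuracy(y_true, y_pred)
print(score)
```

With three classes, a model that always predicts the same category scores 1/3 on this metric, which gives a reference point for the reported maximum of 0.66.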

PET Classification Breast Retrospective Clinical In Silico

Predicting coronary artery abnormalities in Kawasaki disease: Model development and external validation

Wang, Q., Kimura, Y., Oba, J., Ishikawa, T., Ohnishi, T., Akahoshi, S., Iio, K., Morikawa, Y., Sakurada, K., Kobayashi, T., Miura, M.

•preprint•Aug 12 2025

Background: Kawasaki disease (KD) is an acute, pediatric vasculitis associated with coronary artery abnormality (CAA) development. Echocardiography at month 1 post-diagnosis remains the standard for CAA surveillance despite limitations, including patient distress and increased healthcare burden. With declining CAA incidence due to improved treatment, the need for routine follow-up imaging is being reconsidered. This study aimed to develop and externally validate models for predicting CAA development and guide the need for echocardiography. Methods: This study used two prospective multicenter Japanese registries: PEACOCK for model development and internal validation, and Post-RAISE for external validation. The primary outcome was CAA at the month 1 follow-up, defined as a maximum coronary artery Z score (Zmax) ≥ 2. Twenty-nine clinical, laboratory, echocardiographic, and treatment-related variables obtained within one week of diagnosis were selected as predictors. The models included simple models using the previous Zmax as a single predictor, logistic regression models, and machine learning models (LightGBM and XGBoost). Their discrimination, calibration, and clinical utility were assessed. Results: After excluding patients without outcome data, 4,973 and 2,438 patients from PEACOCK and Post-RAISE, respectively, were included. The CAA incidence at month 1 was 5.5% and 6.8%, respectively. For external validation, a simple model using the Zmax at week 1 produced an area under the curve of 0.79, which failed to improve by more than 0.02 after other variables were added or more complex models were used. Even the best-performing models with a highly sensitive threshold failed to reduce the need for echocardiography at month 1 by more than 30% while keeping the number of undiagnosed CAA cases below ten. The predictive performance declined considerably when the Zmax was omitted from the multivariable models. Conclusions: The Zmax at week 1 was the strongest predictor of CAA at month 1 post-diagnosis. Even advanced models incorporating additional variables failed to achieve a clinically acceptable trade-off between reducing the need for echocardiography and reducing the number of undiagnosed CAA cases. Until superior predictors are identified, echocardiography at month 1 should remain the standard practice.
Clinical Perspective. What Is New? The maximum Z score on echocardiography one week after diagnosis was the strongest of 29 variables for predicting coronary artery abnormalities (CAA) in patients with Kawasaki disease. Even the most sensitive models had a suboptimal ability to predict CAA development and reduce the need for imaging studies, suggesting they have limited utility in clinical decision-making. What Are the Clinical Implications? Until more accurate predictors are found or imaging strategies are optimized, performing echocardiography at one-month follow-up should remain the standard of care.
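
The trade-off discussed in this abstract, skipping echocardiography for low-risk patients at the cost of missed CAA cases, can be illustrated with a short sketch. Everything here is hypothetical: the function name `triage_tradeoff`, the risk scores, the outcomes, and the thresholds are invented for illustration and are not taken from the registries or the authors' models.

```python
def triage_tradeoff(risk_scores, has_caa, threshold):
    """Return (fraction of echocardiograms avoided, CAA cases missed).

    Patients with a predicted risk below `threshold` are assumed to skip
    the month-1 echocardiogram; any CAA among them would go undiagnosed.
    """
    skipped = [caa for score, caa in zip(risk_scores, has_caa)
               if score < threshold]
    fraction_avoided = len(skipped) / len(risk_scores)
    missed_cases = sum(skipped)
    return fraction_avoided, missed_cases


# Invented toy cohort of ten patients (1 = CAA at month 1).
risk_scores = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.20, 0.40, 0.60, 0.90]
has_caa = [0, 0, 0, 1, 0, 0, 0, 1, 1, 1]

cautious = triage_tradeoff(risk_scores, has_caa, threshold=0.04)
aggressive = triage_tradeoff(risk_scores, has_caa, threshold=0.15)
print(cautious, aggressive)
```

In this toy cohort, raising the threshold avoids more scans but starts to miss cases, which is the tension the study quantifies on real data.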

Ultrasound Classification Cardiac Retrospective Clinical In Silico Academic Lab

Multimodal Deep Learning for ARDS Detection

Broecker, S., Adams, J. Y., Kumar, G., Callcut, R., Ni, Y., Strohmer, T.

•preprint•Aug 12 2025

Objective: Poor outcomes in acute respiratory distress syndrome (ARDS) can be alleviated with tools that support early diagnosis. Current machine learning methods for detecting ARDS do not take full advantage of the multimodality of ARDS pathophysiology. We developed a multimodal deep learning model that uses imaging data, continuously collected ventilation data, and tabular data derived from a patient's electronic health record (EHR) to make ARDS predictions. Materials and Methods: A chest radiograph (x-ray), at least two hours of ventilator waveform data (VWD) within the first 24 hours of intubation, and EHR-derived tabular data were used from 220 patients admitted to the ICU to train a deep learning model. The model uses pretrained encoders for the x-rays and ventilation data and trains a feature extractor on tabular data. Encoded features for a patient are combined to make a single ARDS prediction. Ablation studies for each modality assessed their effect on the model's predictive capability. Results: The trimodal model achieved an area under the receiver operator curve (AUROC) of 0.86 with a 95% confidence interval of 0.01. This was a statistically significant improvement (p<0.05) over single modality models and bimodal models trained on VWD+tabular and VWD+x-ray data. Discussion and Conclusion: Our results demonstrate the potential utility of using deep learning to address complex conditions with heterogeneous data. More work is needed to determine the additive effect of modalities on ARDS detection. Our framework can serve as a blueprint for building performant multimodal deep learning models for conditions with small, heterogeneous datasets.

X-Ray Classification Chest Methodology In Silico

Dynamic Survival Prediction using Longitudinal Images based on Transformer

Bingfan Liu, Haolun Shi, Jiguo Cao

•preprint•Aug 12 2025

Survival analysis utilizing multiple longitudinal medical images plays a pivotal role in the early detection and prognosis of diseases by providing insight beyond single-image evaluations. However, current methodologies often inadequately utilize censored data, overlook correlations among longitudinal images measured over multiple time points, and lack interpretability. We introduce SurLonFormer, a novel Transformer-based neural network that integrates longitudinal medical imaging with structured data for survival prediction. Our architecture comprises three key components: a Vision Encoder for extracting spatial features, a Sequence Encoder for aggregating temporal information, and a Survival Encoder based on the Cox proportional hazards model. This framework effectively incorporates censored data, addresses scalability issues, and enhances interpretability through occlusion sensitivity analysis and dynamic survival prediction. Extensive simulations and a real-world application in Alzheimer's disease analysis demonstrate that SurLonFormer achieves superior predictive performance and successfully identifies disease-related imaging biomarkers.
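
For readers unfamiliar with how a Cox-based objective uses censored data, the following is a minimal sketch of the Cox partial log-likelihood. It is not SurLonFormer's implementation: the function name is hypothetical, ties are handled naively, and `risk_scores` stands in for whatever scalar output a network would produce per subject.

```python
import math


def cox_partial_log_likelihood(times, events, risk_scores):
    """Cox partial log-likelihood for scalar risk scores.

    times: observed time per subject (event or censoring time)
    events: 1 if the event was observed, 0 if the subject was censored
    risk_scores: model output per subject (higher = higher hazard)
    """
    loglik = 0.0
    for i, (t_i, event_i) in enumerate(zip(times, events)):
        if not event_i:
            # Censored subjects add no term of their own, but they still
            # appear in the risk sets of earlier events.
            continue
        risk_set = [math.exp(risk_scores[j])
                    for j, t_j in enumerate(times) if t_j >= t_i]
        loglik += risk_scores[i] - math.log(sum(risk_set))
    return loglik


# Three subjects with equal risk scores; the second one is censored.
all_events = cox_partial_log_likelihood([1, 2, 3], [1, 1, 1], [0.0, 0.0, 0.0])
one_censored = cox_partial_log_likelihood([1, 2, 3], [1, 0, 1], [0.0, 0.0, 0.0])
print(all_events, one_censored)
```

A training loop would typically minimise the negative of this quantity. The censored subject still contributes to the denominator at time 1, which is how the partial likelihood extracts information from incomplete follow-up.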

MRI Classification Neurological Methodology In Silico Academic Lab Breakthrough

Machine learning models for diagnosing lymph node recurrence in postoperative PTC patients: a radiomic analysis.

Pang F, Wu L, Qiu J, Guo Y, Xie L, Zhuang S, Du M, Liu D, Tan C, Liu T

•papers•Aug 12 2025

Postoperative papillary thyroid cancer (PTC) patients often have enlarged cervical lymph nodes due to inflammation or hyperplasia, which complicates the assessment of recurrence or metastasis. This study aimed to explore the diagnostic capabilities of computed tomography (CT) imaging and radiomic analysis to distinguish the recurrence of cervical lymph nodes in patients with PTC postoperatively. A retrospective analysis of 194 PTC patients who underwent total thyroidectomy was conducted, with 98 cases of cervical lymph node recurrence and 96 cases without recurrence. Using 3D Slicer software, Regions of Interest (ROI) were delineated on enhanced venous phase CT images, analyzing 302 positive and 391 negative lymph nodes. These nodes were randomly divided into training and validation sets in a 3:2 ratio. Python was used to extract radiomic features from the ROIs and to develop radiomic models. Univariate and multivariate analyses identified statistically significant risk factors for cervical lymph node recurrence from clinical data, which, when combined with radiomic scores, formed a nomogram to predict recurrence risk. The diagnostic efficacy and clinical utility of the models were assessed using ROC curves, calibration curves, and Decision Curve Analysis (DCA). This study analyzed 693 lymph nodes (302 positive and 391 negative) and identified 35 significant radiomic features through dimensionality reduction and selection. The three machine learning models, including the Lasso regression, Support Vector Machine (SVM), and RF radiomics models, showed.

CT Classification Retrospective Clinical In Silico Academic Lab

Diagnostic performance of ultrasound S-Detect technology in evaluating BI-RADS-4 breast nodules ≤ 20 mm and > 20 mm.

Xing B, Gu C, Fu C, Zhang B, Tan Y

•papers•Aug 12 2025

This study aimed to explore the diagnostic performance of ultrasound S-Detect in differentiating Breast Imaging-Reporting and Data System (BI-RADS) 4 breast nodules ≤ 20 mm and > 20 mm. Between November 2020 and November 2022, a total of 382 breast nodules in 312 patients were classified as BI-RADS-4 by conventional ultrasound. Using pathology results as the gold standard, we applied receiver operator characteristics (ROC), sensitivity (SE), specificity (SP), accuracy (ACC), positive predictive value (PPV), and negative predictive value (NPV) to analyze the diagnostic value of BI-RADS, S-Detect, and the two techniques in combination (Co-Detect) in the diagnosis of BI-RADS 4 breast nodules ≤ 20 mm and > 20 mm. There were 382 BI-RADS-4 nodules, of which 151 were pathologically confirmed as malignant, and 231 as benign. In lesions ≤ 20 mm, the SE, SP, ACC, PPV, NPV, and area under the curve (AUC) of the BI-RADS group were 77.27%, 89.73%, 85.71%, 78.16%, 89.24%, 0.835, respectively. SE, SP, ACC, PPV, NPV, and AUC of the S-Detect group were 92.05%, 78.92%, 83.15%, 67.50%, 95.43%, 0.855, respectively. SE, SP, ACC, PPV, NPV, and AUC of the Co-Detect group were 89.77%, 93.51%, 92.31%, 86.81%, 95.05%, 0.916, respectively. The differences of SE, ACC, NPV, and AUC between the BI-RADS group and the Co-Detect group were statistically significant (P < 0.05). In lesions > 20 mm, SE, SP, ACC, PPV, NPV, and AUC of the BI-RADS group were 88.99%, 89.13%, 88.99%, 91.80%, 85.42%, 0.890, respectively. SE, SP, ACC, PPV, NPV, and AUC of the S-Detect group were 98.41%, 69.57%, 86.24%, 81.58%, 96.97%, 0.840, respectively. SE, SP, ACC, PPV, NPV, and AUC of the Co-Detect group were 98.41%, 91.30%, 95.41%, 93.94%, 97.67%, 0.949, respectively. A total of 166 BI-RADS 4 A nodules were downgraded to category 3 by Co-Detect, with 160 (96.4%) confirmed as benign and 6 (all ≤ 20 mm) as false negatives. Conversely, 25 nodules were upgraded to 4B, of which 19 (76.0%) were malignant. 
The difference in AUC between the BI-RADS group and the Co-Detect group was statistically significant (P < 0.05). S-Detect combined with BI-RADS is effective in the differential diagnosis of BI-RADS 4 breast nodules ≤ 20 mm and > 20 mm. However, its performance is particularly pronounced in lesions ≤ 20 mm, where it contributes to a significant reduction in unnecessary biopsies.
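
The diagnostic metrics reported above all derive from a 2x2 table against pathology. The sketch below shows the arithmetic. The counts are an assumption: they were back-calculated so that the resulting percentages match those reported for BI-RADS in lesions ≤ 20 mm, and they are not stated in the abstract. The function name is likewise hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from a 2x2 confusion table."""
    return {
        "SE": tp / (tp + fn),                    # sensitivity
        "SP": tn / (tn + fp),                    # specificity
        "ACC": (tp + tn) / (tp + fn + tn + fp),  # accuracy
        "PPV": tp / (tp + fp),                   # positive predictive value
        "NPV": tn / (tn + fn),                   # negative predictive value
    }


# Assumed counts (not from the abstract): 88 malignant and 185 benign
# nodules <= 20 mm, chosen to reproduce the BI-RADS percentages.
metrics = diagnostic_metrics(tp=68, fn=20, tn=166, fp=19)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

Under these assumed counts, the computed values agree with the reported 77.27%, 89.73%, 85.71%, 78.16%, and 89.24% to within rounding.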

Ultrasound Classification Breast Retrospective Clinical Clinical Pilot

Development and validation of machine learning models to predict vertebral artery injury by C2 pedicle screws.

Ye B, Sun Y, Chen G, Wang B, Meng H, Shan L

•papers•Aug 12 2025

Cervical 2 pedicle screw (C2PS) fixation is widely used in posterior cervical surgery but carries a risk of vertebral artery injury (VAI), a rare yet severe complication. This study aimed to identify risk factors for VAI during C2PS placement and develop a machine learning (ML)-based predictive model to enhance preoperative risk assessment. Clinical and radiological data from 280 patients undergoing head and neck CT angiography were retrospectively analyzed. C2PS placement was simulated on three-dimensional reconstructed images, and patients were classified into injury (n = 98) and non-injury (n = 182) groups. Fifteen variables, including patient characteristics and anatomic variables, were evaluated. Eight ML algorithms were trained (70% training cohort) and validated (30% validation cohort). Model performance was assessed using AUC, sensitivity, and specificity, with SHAP (SHapley Additive exPlanations) used for interpretability. Six key risk factors were identified: pedicle diameter, high-riding vertebral artery (HRVA), intra-axial vertebral artery (IAVA), vertebral artery diameter (VAD), distance between the transverse foramen and the posterior end of the vertebral body (TFPEVB) and distance between the vertebral artery and the vertebral body (VAVB). The neural network model (NNet) demonstrated optimal predictive performance, achieving AUCs of 0.929 (training) and 0.936 (validation). SHAP analysis confirmed these variables as primary contributors to VAI risk. This study established an ML-driven predictive model for VAI during C2PS placement, highlighting six critical anatomical and radiological risk factors. Integrating this model into clinical workflows may optimize preoperative planning, reduce complications, and improve surgical outcomes. External validation in multicenter cohorts is warranted to enhance generalizability.

CT Classification Neurological Retrospective Clinical In Silico Academic Lab

Hierarchical Variable Importance with Statistical Control for Medical Data-Based Prediction

Joseph Paillard, Antoine Collas, Denis A. Engemann, Bertrand Thirion

•preprint•Aug 12 2025

Recent advances in machine learning have greatly expanded the repertoire of predictive methods for medical imaging. However, the interpretability of complex models remains a challenge, which limits their utility in medical applications. Recently, model-agnostic methods have been proposed to measure conditional variable importance and accommodate complex non-linear models. However, they often lack power when dealing with highly correlated data, a common problem in medical imaging. We introduce Hierarchical-CPI, a model-agnostic variable importance measure that frames the inference problem as the discovery of groups of variables that are jointly predictive of the outcome. By exploring subgroups along a hierarchical tree, it remains computationally tractable, yet also enjoys explicit family-wise error rate control. Moreover, we address the issue of vanishing conditional importance under high correlation with a tree-based importance allocation mechanism. We benchmarked Hierarchical-CPI against state-of-the-art variable importance methods. Its effectiveness is demonstrated in two neuroimaging datasets: classifying dementia diagnoses from MRI data (ADNI dataset) and analyzing the Berger effect on EEG data (TDBRAIN dataset), identifying biologically plausible variables.

MRI Classification Neurological Methodology In Silico
