Latest Papers on Radiology AI. Tags: Classification, Order: Best Match, Limit: 10.

Quality appraisal of radiomics-based studies on chondrosarcoma using METhodological RadiomICs Score (METRICS) and Radiomics Quality Score (RQS).

Gitto S, Cuocolo R, Klontzas ME, Albano D, Messina C, Sconfienza LM

•papers•Jun 18 2025

To assess the methodological quality of radiomics-based studies on bone chondrosarcoma using METhodological RadiomICs Score (METRICS) and Radiomics Quality Score (RQS). A literature search was conducted on EMBASE and PubMed databases for research papers published up to July 2024 and focused on radiomics in bone chondrosarcoma, with no restrictions regarding the study aim. Three readers independently evaluated the study quality using METRICS and RQS. Baseline study characteristics were extracted. Inter-reader reliability was calculated using intraclass correlation coefficient (ICC). Out of 68 identified papers, 18 were finally included in the analysis. Radiomics research was aimed at lesion classification (n = 15), outcome prediction (n = 2) or both (n = 1). Study design was retrospective in all papers. Most studies employed MRI (n = 12), CT (n = 3) or both (n = 1). METRICS and RQS adherence rates ranged between 37.3-94.8% and 2.8-44.4%, respectively. Excellent inter-reader reliability was found for both METRICS (ICC = 0.961) and RQS (ICC = 0.975). Among the limitations of the evaluated studies, the absence of prospective studies and deep learning-based analyses was highlighted, along with the limited adherence to radiomics guidelines, use of external testing datasets and open science data. METRICS and RQS are reproducible quality assessment tools, with the former showing higher adherence rates in studies on chondrosarcoma. METRICS is better suited for assessing papers with retrospective design, which is often chosen in musculoskeletal oncology due to the low prevalence of bone sarcomas. Employing quality scoring systems should be promoted in radiomics-based studies to improve methodological quality and facilitate clinical translation. 
Employing reproducible quality scoring systems, especially METRICS (which shows higher adherence rates than RQS and is better suited for assessing retrospective investigations), is highly recommended to design radiomics-based studies on chondrosarcoma, improve methodological quality and facilitate clinical translation. The low scientific and reporting quality of radiomics studies on chondrosarcoma is the main reason preventing clinical translation. Quality appraisal using METRICS and RQS showed 37.3-94.8% and 2.8-44.4% adherence rates, respectively. Room for improvement was noted in study design, deep learning methods, external testing and open science. Employing reproducible quality scoring systems is recommended to design radiomics studies on bone chondrosarcoma and facilitate clinical translation.

Mixed Modality Classification Musculoskeletal Review In Silico Academic Lab Reproducibility Policy
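The abstract above reports inter-reader reliability as an intraclass correlation coefficient but does not say which ICC form was used. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-rater form (Shrout and Fleiss ICC(2,1)), the calculation could look like the following. The function name and the score tables are hypothetical illustrations, not data from the study.

```python
def icc2_1(ratings):
    """ICC(2,1) for a table of ratings: one row per paper, one column per reader.

    Assumption: two-way random effects, absolute agreement, single rater.
    The study does not state which ICC form it used.
    """
    n = len(ratings)        # number of rated papers
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-paper mean square
    msc = ss_cols / (k - 1)              # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical scores: three readers in perfect agreement on three papers.
perfect = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Hypothetical scores: the second reader always scores one point higher.
offset = [[1, 2], [2, 3], [3, 4]]

icc_perfect = icc2_1(perfect)
icc_offset = icc2_1(offset)
```

A constant offset between readers lowers this form of ICC because it measures absolute agreement rather than consistency of ranking.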

Applying a multi-task and multi-instance framework to predict axillary lymph node metastases in breast cancer.

Li Y, Chen Z, Ding Z, Mei D, Liu Z, Wang J, Tang K, Yi W, Xu Y, Liang Y, Cheng Y

•papers•Jun 18 2025

Deep learning (DL) models have shown promise in predicting axillary lymph node (ALN) status. However, most existing DL models were classification-only models and did not consider the practical application scenarios of multi-view joint prediction. Here, we propose a Multi-Task Learning (MTL) and Multi-Instance Learning (MIL) framework that simulates the real-world clinical diagnostic scenario for ALN status prediction in breast cancer. Ultrasound images of the primary tumor and ALN (if available) regions were collected, each annotated with a segmentation label. The model was trained on a training cohort and tested on both internal and external test cohorts. The proposed two-stage DL framework, using the Transformer model Segformer as the network backbone, was the top-performing model. It achieved an AUC of 0.832, a sensitivity of 0.815, and a specificity of 0.854 in the internal test cohort. In the external cohort, this model attained an AUC of 0.918, a sensitivity of 0.851, and a specificity of 0.957. The Class Activation Mapping method demonstrated that the DL model correctly identified the characteristic areas of metastasis within the primary tumor and ALN regions. This framework may serve as an effective second reader to assist clinicians in ALN status assessment.

Ultrasound Classification Breast Retrospective Clinical In Silico Academic Lab Benchmark SOTA
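The abstract above summarizes performance as AUC, sensitivity, and specificity. As a minimal sketch of how these three metrics are defined, the following computes them for a binary classifier. The labels, scores, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: two positive and two negative cases.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds.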

Can CTA-based Machine Learning Identify Patients for Whom Successful Endovascular Stroke Therapy is Insufficient?

Jeevarajan JA, Dong Y, Ballekere A, Marioni SS, Niktabe A, Abdelkhaleq R, Sheth SA, Giancardo L

•papers•Jun 18 2025

Despite advances in endovascular stroke therapy (EST) devices and techniques, many patients are left with substantial disability, even if the final infarct volumes (FIVs) remain small. Here, we evaluate the performance of a machine learning (ML) approach using pre-treatment CT angiography (CTA) to identify this cohort of patients that may benefit from additional interventions. We identified consecutive large vessel occlusion (LVO) acute ischemic stroke (AIS) subjects who underwent EST with successful reperfusion in a multicenter prospective registry cohort. We included only subjects with FIV<30mL and recorded 90-day outcome (modified Rankin scale, mRS). A deep learning model was pre-trained and then fine-tuned to predict 90-day mRS 0-2 using pre-treatment CTA images (DSN-CTA model). The primary outcome was the predictive performance of the DSN-CTA model compared to a logistic regression model with clinical variables, measured by the area under the receiver operating characteristic curve (AUROC). The DSN-CTA model was pre-trained on 1,542 subjects and then fine-tuned and cross-validated with 48 subjects, all of whom underwent EST with TICI 2b-3 reperfusion. Of this cohort, 56.2% of subjects had 90-day mRS 3-6 despite successful EST and FIV<30mL. The DSN-CTA model showed significantly better performance than a model with clinical variables alone when predicting good 90-day mRS (AUROC 0.81 vs 0.492, p=0.006). The CTA-based machine learning model was able to more reliably predict unexpected poor functional outcome after successful EST and small FIV for patients with LVO AIS compared to standard clinical variables. ML models may identify a priori patients in whom EST-based LVO reperfusion alone is insufficient to improve clinical outcomes.
AIS = acute ischemic stroke; AUROC = area under the receiver operating characteristic curve; DSN-CTA = DeepSymNet-v3 model; EST = endovascular stroke therapy; FIV = final infarct volume; LVO = large vessel occlusion; ML = machine learning.

CT Classification Neurological Retrospective Clinical In Silico Academic Lab

Multimodal deep learning for predicting unsuccessful recanalization in refractory large vessel occlusion.

González JD, Canals P, Rodrigo-Gisbert M, Mayol J, García-Tornel A, Ribó M

•papers•Jun 18 2025

This study explores a multi-modal deep learning approach that integrates pre-intervention neuroimaging and clinical data to predict endovascular therapy (EVT) outcomes in acute ischemic stroke patients. To this end, consecutive stroke patients undergoing EVT were included in the study, including patients with suspected intracranial atherosclerosis-related large vessel occlusion (ICAD-LVO) and other refractory occlusions. A retrospective, single-center cohort of patients with anterior circulation LVO who underwent EVT between 2017 and 2023 was analyzed. The refractory LVO (rLVO) class comprised patients who presented any of the following: final angiographic stenosis > 50%, unsuccessful recanalization (eTICI 0-2a), or need for rescue treatment (angioplasty +/- stenting). Neuroimaging data included non-contrast CT and CTA volumes, automated vascular segmentation, and CT perfusion parameters. Clinical data included demographics, comorbidities, and stroke severity. Imaging features were encoded using convolutional neural networks and fused with clinical data using a DAFT module. Data were split 80% for training (with four-fold cross-validation) and 20% for testing. Explainability methods were used to analyze the contribution of clinical variables and regions of interest in the images. The final sample comprised 599 patients: 481 for training the model (77, 16.0% rLVO) and 118 for testing (16, 13.6% rLVO). The best model predicting rLVO using imaging alone achieved an AUC of 0.53 ± 0.02 and F1 of 0.19 ± 0.05, while the proposed multimodal model achieved an AUC of 0.70 ± 0.02 and F1 of 0.39 ± 0.02 in testing. Combining vascular segmentation, clinical variables, and imaging data improved prediction performance over single-source models. This approach offers an early alert to procedural complexity, potentially guiding more tailored, timely intervention strategies in the EVT workflow.

CT Classification Neurological Retrospective Clinical In Silico Academic Lab
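The abstract above describes an 80%/20% train/test split with four-fold cross-validation on the training portion. As a rough sketch of that partitioning scheme, assuming a simple seeded shuffle (the abstract does not say whether the split was random, stratified, or chronological), the index bookkeeping might look like this. The function names and the seed are hypothetical.

```python
import random


def split_train_test(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_folds(indices, k=4):
    """Return k (train, validation) index pairs that partition `indices`."""
    folds = [indices[i::k] for i in range(k)]
    pairs = []
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        pairs.append((train, folds[i]))
    return pairs


# 599 is the cohort size reported in the abstract; an exact 20% cut gives
# 120 test cases, so this sketch does not reproduce the study's 481/118 split.
train_idx, test_idx = split_train_test(599)
cv_pairs = k_folds(train_idx, k=4)
```

With a minority class of roughly 15%, a stratified variant that splits positive and negative cases separately would keep class proportions similar across folds.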

Diffusion-based Counterfactual Augmentation: Towards Robust and Interpretable Knee Osteoarthritis Grading

Zhe Wang, Yuhua Ru, Aladine Chetouani, Tina Shiang, Fang Chen, Fabian Bauer, Liping Zhang, Didier Hans, Rachid Jennane, William Ewing Palmer, Mohamed Jarraya, Yung Hsin Chen

•preprint•Jun 18 2025

Automated grading of Knee Osteoarthritis (KOA) from radiographs is challenged by significant inter-observer variability and the limited robustness of deep learning models, particularly near critical decision boundaries. To address these limitations, this paper proposes a novel framework, Diffusion-based Counterfactual Augmentation (DCA), which enhances model robustness and interpretability by generating targeted counterfactual examples. The method navigates the latent space of a diffusion model using a Stochastic Differential Equation (SDE), governed by balancing a classifier-informed boundary drive with a manifold constraint. The resulting counterfactuals are then used within a self-corrective learning strategy to improve the classifier by focusing on its specific areas of uncertainty. Extensive experiments on the public Osteoarthritis Initiative (OAI) and Multicenter Osteoarthritis Study (MOST) datasets demonstrate that this approach significantly improves classification accuracy across multiple model architectures. Furthermore, the method provides interpretability by visualizing minimal pathological changes and revealing that the learned latent space topology aligns with clinical knowledge of KOA progression. The DCA framework effectively converts model uncertainty into a robust training signal, offering a promising pathway to developing more accurate and trustworthy automated diagnostic systems. Our code is available at https://github.com/ZWang78/DCA.

X-Ray Classification Musculoskeletal Methodology In Silico Academic Lab Open Code

Deep learning model using CT images for longitudinal prediction of benign and malignant ground-glass nodules.

Yang X, Wang J, Wang P, Li Y, Wen Z, Shang J, Chen K, Tang C, Liang S, Meng W

•papers•Jun 18 2025

To develop and validate a CT image-based multiple time-series deep learning model for the longitudinal prediction of benign and malignant pulmonary ground-glass nodules (GGNs). A total of 486 GGNs from an equal number of patients were included in this research, which took place at two medical centers. Each nodule underwent surgical removal and was confirmed pathologically. The patients were randomly assigned to a training set, validation set, and test set, following a distribution ratio of 7:2:1. We established a transformer-based deep learning framework that leverages multi-temporal CT images for the longitudinal prediction of GGNs, focusing on distinguishing between benign and malignant types. Additionally, we utilized 13 different machine learning algorithms to formulate clinical models, delta-radiomics models, and combined models that merge deep learning with CT semantic features. The predictive capabilities of the models were assessed using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The multiple time-series deep learning model based on CT images surpassed both the clinical model and the delta-radiomics model, showcasing strong predictive capabilities for GGNs across the training, validation, and test sets, with AUCs of 0.911 (95% CI, 0.879-0.939), 0.809 (95% CI, 0.715-0.908), and 0.817 (95% CI, 0.680-0.937), respectively. Furthermore, the models that integrated deep learning with CT semantic features achieved the highest performance, resulting in AUCs of 0.960 (95% CI, 0.912-0.977), 0.878 (95% CI, 0.801-0.942), and 0.890 (95% CI, 0.790-0.968). The multiple time-series deep learning model utilizing CT images was effective in predicting benign and malignant GGNs.

CT Classification Chest Retrospective Clinical In Silico Academic Lab

Multimodal MRI Marker of Cognition Explains the Association Between Cognition and Mental Health in UK Biobank

Buianova, I., Silvestrin, M., Deng, J., Pat, N.

•preprint•Jun 18 2025

Background: Cognitive dysfunction often co-occurs with psychopathology. Advances in neuroimaging and machine learning have led to neural indicators that predict individual differences in cognition with reasonable performance. We examined whether these neural indicators explain the relationship between cognition and mental health in the UK Biobank cohort (n > 14000). Methods: Using machine learning, we quantified the covariation between general cognition and 133 mental health indices and derived neural indicators of cognition from 72 neuroimaging phenotypes across diffusion-weighted MRI (dwMRI), resting-state functional MRI (rsMRI), and structural MRI (sMRI). With commonality analyses, we investigated how much of the cognition-mental health covariation is captured by each neural indicator and neural indicators combined within and across MRI modalities. Results: The predictive association between mental health and cognition was at out-of-sample r = 0.3. Neuroimaging phenotypes captured 2.1% to 25.8% of the cognition-mental health covariation. The highest proportion of variance explained by dwMRI was attributed to the number of streamlines connecting cortical regions (19.3%), by rsMRI through functional connectivity between 55 large-scale networks (25.8%), and by sMRI via the volumetric characteristics of subcortical structures (21.8%). Combining neuroimaging phenotypes within modalities improved the explanation to 25.5% for dwMRI, 29.8% for rsMRI, and 31.6% for sMRI, and combining them across all MRI modalities enhanced the explanation to 48%. Conclusions: We present an integrated approach to derive multimodal MRI markers of cognition that can be transdiagnostically linked to psychopathology. This demonstrates that the predictive ability of neural indicators extends beyond the prediction of cognition itself, enabling us to capture the cognition-mental health covariation.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Comparative analysis of transformer-based deep learning models for glioma and meningioma classification.

Nalentzi K, Gerogiannis K, Bougias H, Stogiannos N, Papavasileiou P

•papers•Jun 18 2025

This study compares the classification accuracy of novel transformer-based deep learning models (ViT and BEiT) on brain MRIs of gliomas and meningiomas through a feature-driven approach. Meta's Segment Anything Model was used for semi-automatic segmentation, therefore proposing a total neural network-based workflow for this classification task. ViT and BEiT models were finetuned to a publicly available brain MRI dataset. Gliomas/meningiomas cases (625/507) were used for training and 520 cases (260/260; gliomas/meningiomas) for testing. The extracted deep radiomic features from ViT and BEiT underwent normalization, dimensionality reduction based on the Pearson correlation coefficient (PCC), and feature selection using analysis of variance (ANOVA). A multi-layer perceptron (MLP) with 1 hidden layer, 100 units, rectified linear unit activation, and Adam optimizer was utilized. Hyperparameter tuning was performed via 5-fold cross-validation. The ViT model achieved the highest AUC on the validation dataset using 7 features, yielding an AUC of 0.985 and accuracy of 0.952. On the independent testing dataset, the model exhibited an AUC of 0.962 and an accuracy of 0.904. The BEiT model yielded an AUC of 0.939 and an accuracy of 0.871 on the testing dataset. This study demonstrates the effectiveness of transformer-based models, especially ViT, for glioma and meningioma classification, achieving high AUC scores and accuracy. However, the study is limited by the use of a single dataset, which may affect generalizability. Future work should focus on expanding datasets and further optimizing models to improve performance and applicability across different institutions. This study introduces a feature-driven methodology for glioma and meningioma classification, showcasing advancements in the accuracy and model robustness of transformer-based models.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA
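The abstract above mentions dimensionality reduction based on the Pearson correlation coefficient before ANOVA-based feature selection. One common way to do this, sketched here as an assumption rather than as the authors' exact procedure, is to drop any feature that is highly correlated with a feature already kept. The 0.9 threshold, the feature names, and the values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Note: raises ZeroDivisionError if either sequence is constant.
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every previously kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: f2 is almost a scaled copy of f1.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [2.0, 4.0, 6.0, 8.1],
    "f3": [4.0, 1.0, 3.0, 2.0],
}
selected = drop_correlated(features)
r_f1_f3 = pearson(features["f1"], features["f3"])
```

The result depends on feature order, since the first member of each correlated group is the one retained.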

USING ARTIFICIAL INTELLIGENCE TO PREDICT TREATMENT OUTCOMES IN PATIENTS WITH NEUROGENIC OVERACTIVE BLADDER AND MULTIPLE SCLEROSIS

Chang, O., Lee, J., Lane, F., Demetriou, M., Chang, P.

•preprint•Jun 18 2025

Introduction and Objectives: Many women with multiple sclerosis (MS) experience neurogenic overactive bladder (NOAB) characterized by urinary frequency, urinary urgency and urgency incontinence. The objective of the study was to create machine learning (ML) models utilizing clinical and imaging data to predict NOAB treatment success stratified by treatment type. Methods: This was a retrospective cohort study of female patients with a diagnosis of NOAB and MS seen at a tertiary academic center from 2017 to 2022. Clinical and imaging data were extracted. The three types of NOAB treatment evaluated were behavioral therapy, medication therapy and minimally invasive therapies. The primary outcome, treatment success, was defined as > 50% reduction in urinary frequency, urinary urgency or a subjective perception of treatment success. For the construction of the logistic regression ML models, bivariate analyses were performed with backward selection of variables with p-values of < 0.10 and clinically relevant variables applied. For ML, the cohort was split into a training dataset (70%) and a test dataset (30%). Area under the curve (AUC) scores were calculated to evaluate model performance. Results: The 110 patients included had a mean age of 59 years (SD 14 years); the cohort was predominantly White (91.8%) and post-menopausal (68.2%). Patients were stratified by the type of NOAB therapy received, with 70 patients (63.6%) receiving behavioral therapy, 58 (52.7%) medication therapy and 44 (40%) minimally invasive therapies. On MRI brain imaging, 63.6% of patients had > 20 lesions, though the majority were not active lesions. The lesions were mostly located within the supratentorial (94.5%), infratentorial (68.2%) and 58.2 infratentorial brain (63.8%) as well as in the deep white matter (53.4%). For MRI spine imaging, most of the lesions were in the cervical spine (71.8%) followed by thoracic spine (43.7%) and lumbar spine (6.4%).10.3%). 
After feature selection, the top 10 highest-ranking features were used to train complementary LASSO-regularized logistic regression (LR) and extreme gradient-boosted tree (XGB) models. The top-performing LR models for predicting response to behavioral, medication, and minimally invasive therapies yielded AUC values of 0.74, 0.76, and 0.83, respectively. Conclusions: Using these top-ranked features, LR models achieved AUC values of 0.74-0.83 for prediction of treatment success based on individual factors. Further prospective evaluation is needed to better characterize and validate these identified associations.

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

Development and interpretation of machine learning-based prognostic models for predicting high-risk prognostic pathological components in pulmonary nodules: integrating clinical features, serum tumor marker and imaging features.

Wang D, Qiu J, Li R, Tian H

•papers•Jun 17 2025

With the improvement of imaging, the screening rate of pulmonary nodules (PNs) has further increased, but identifying their high-risk prognostic pathological components (HRPPC) is still a major challenge. In this study, we aimed to build a multi-parameter machine learning predictive model to improve the discrimination accuracy of HRPPC. This study included 816 patients who had pulmonary nodules ≤ 3 cm with clear pathology and underwent pulmonary resection. High-resolution chest CT images and clinicopathological characteristics were collected from patients. Lasso regression was used to identify key features, and a machine learning prediction model was constructed based on the screened key features. The recognition ability of the prediction model was evaluated using receiver operating characteristic (ROC) curves and confusion matrices. Model calibration ability was evaluated using calibration curves. Decision curve analysis (DCA) was used to evaluate the value of the model for clinical applications. SHAP values were used to interpret the predictive model. A total of 816 patients were included in this study, of whom 112 (13.79%) had HRPPC of pulmonary nodules. By selecting key variables through Lasso recursive feature elimination, we finally identified 13 key relevant features. The XGB model performed the best, with an area under the ROC curve (AUC) of 0.930 (95% CI: 0.906-0.954) in the training cohort and 0.835 (95% CI: 0.774-0.895) in the validation cohort, indicating that the XGB model had excellent predictive performance. In addition, the calibration curves of the XGB model showed good calibration in both cohorts. DCA demonstrated that the predictive model had a positive benefit in general clinical decision-making. The SHAP values identified the top 3 predictors affecting the HRPPC of PNs as CT Value, Nodule Long Diameter, and PRO-GRP. Our prediction model for identifying HRPPC in PNs has excellent discrimination, calibration and clinical utility.
Thoracic surgeons could make relatively reliable predictions of HRPPC in PNs without the possibility of invasive testing.

CT Classification Chest Retrospective Clinical In Silico Academic Lab
