MultiMAE for Brain MRIs: Robustness to Missing Inputs Using Multi-Modal Masked Autoencoder

Ayhan Can Erdur, Christian Beischl, Daniel Scholz, Jiazhen Pan, Benedikt Wiestler, Daniel Rueckert, Jan C Peeken

arXiv preprint · Sep 14, 2025
Missing input sequences are common in medical imaging data, posing a challenge for deep learning models that rely on complete inputs. In this work, inspired by MultiMAE [2], we develop a masked autoencoder (MAE) paradigm for multi-modal, multi-task learning in 3D medical imaging with brain MRIs. Our method treats each MRI sequence as a separate input modality, leveraging a late-fusion-style transformer encoder to integrate multi-sequence information (multi-modal) and individual decoder streams per modality for multi-task reconstruction. This pretraining strategy guides the model to learn rich representations per modality while also equipping it to handle missing inputs through cross-sequence reasoning. The result is a flexible and generalizable encoder for brain MRIs that infers missing sequences from available inputs and can be adapted to various downstream applications. We demonstrate the performance and robustness of our method against an MAE-ViT baseline in downstream segmentation and classification tasks, showing an absolute improvement of 10.1 in overall Dice score and 0.46 in MCC over the baseline with missing input sequences. Our experiments demonstrate the strength of this pretraining strategy. The implementation is made available.
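
For readers who want the gist of the architecture, below is a minimal PyTorch sketch of the late-fusion multi-modal MAE idea described above; the module sizes, masking, and patchification details are assumptions, not the authors' released implementation.

```python
# Illustrative late-fusion multi-modal MAE (not the authors' code).
# Each MRI sequence gets its own embedding and modality token; a shared
# transformer encoder fuses visible tokens from all sequences; one
# lightweight decoder stream per sequence reconstructs masked patches.
import torch
import torch.nn as nn

class MultiModalMAE(nn.Module):
    def __init__(self, modalities, dim=512, depth=4, heads=8):
        super().__init__()
        self.embed = nn.ModuleDict({m: nn.Linear(dim, dim) for m in modalities})
        self.mod_token = nn.ParameterDict(
            {m: nn.Parameter(torch.zeros(1, 1, dim)) for m in modalities})
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)   # shared late-fusion encoder
        self.decoders = nn.ModuleDict({m: nn.Linear(dim, dim) for m in modalities})

    def forward(self, visible):  # visible: dict modality -> (B, N_m, dim) patch features
        tokens = [self.embed[m](x) + self.mod_token[m] for m, x in visible.items()]
        fused = self.encoder(torch.cat(tokens, dim=1))       # cross-sequence reasoning
        out, i = {}, 0
        for m, x in visible.items():                         # split fused tokens back out
            out[m] = self.decoders[m](fused[:, i:i + x.shape[1]])
            i += x.shape[1]
        return out  # reconstructions are compared to masked ground-truth patches
```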

Dual-Branch EfficientNet Architecture for ACL Tear Detection in Knee MRI

Kota, T., Garofalaki, K., Whitely, F., Evdokimenko, E., Smartt, E.

medRxiv preprint · Sep 13, 2025
We propose a deep learning approach for detecting anterior cruciate ligament (ACL) tears from knee MRI using a dual-branch convolutional architecture. The model independently processes sagittal and coronal MRI sequences using EfficientNet-B2 backbones with spatial attention modules, followed by a late-fusion classifier for binary prediction. MRI volumes are standardized to a fixed number of slices, and domain-specific normalization and data augmentation are applied to enhance model robustness. Trained on a stratified 80/20 split of the MRNet dataset, our best model, using the Adam optimizer and a learning rate of 1e-4, achieved a validation AUC of 0.98 and a test AUC of 0.93. These results show strong predictive performance while maintaining computational efficiency. This work demonstrates that accurate diagnosis is achievable using only two anatomical planes and sets the stage for further improvements through architectural enhancements and broader data integration.
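
A hedged sketch of the dual-branch design in PyTorch/torchvision follows; slice handling, the spatial attention modules, and the fusion head are simplified placeholders rather than the paper's exact model.

```python
# Two EfficientNet-B2 backbones, one per imaging plane, with late fusion
# for binary ACL-tear prediction (simplified; attention modules omitted).
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b2

class DualBranchACL(nn.Module):
    def __init__(self):
        super().__init__()
        self.sagittal = efficientnet_b2(weights="IMAGENET1K_V1")
        self.coronal = efficientnet_b2(weights="IMAGENET1K_V1")
        for branch in (self.sagittal, self.coronal):
            branch.classifier = nn.Identity()     # keep 1408-d features per branch
        self.fuse = nn.Linear(2 * 1408, 1)        # late-fusion binary classifier

    def forward(self, sag, cor):                  # each: (B, 3, H, W) representative slices
        feats = torch.cat([self.sagittal(sag), self.coronal(cor)], dim=1)
        return self.fuse(feats)                   # pair with BCEWithLogitsLoss in training
```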

ChatGPT-4 shows high agreement in MRI protocol selection compared to board-certified neuroradiologists.

Bendella Z, Wichtmann BD, Clauberg R, Keil VC, Lehnen NC, Haase R, Sáez LC, Wiest IC, Kather JN, Endler C, Radbruch A, Paech D, Deike K

PubMed · Sep 13, 2025
The aim of this study was to determine whether ChatGPT-4 can correctly suggest MRI protocols and additional MRI sequences based on real-world Radiology Request Forms (RRFs), and to investigate its ability to suggest time-saving protocols. Retrospectively, 1,001 RRFs from our Department of Neuroradiology (in-house dataset), 200 RRFs from an independent Department of General Radiology (independent dataset), and 300 RRFs from an external, foreign Department of Neuroradiology (external dataset) were included. Patients' age, sex, and clinical information were extracted from the RRFs and used to prompt ChatGPT-4 to choose an adequate MRI protocol from predefined institutional lists. Four independent raters then assessed its performance. Additionally, ChatGPT-4 was tasked with creating case-specific protocols aimed at saving time. Two and seven of 1,001 protocol suggestions were rated "unacceptable" in the in-house dataset by readers 1 and 2, respectively; no suggestions were rated "unacceptable" in either the independent or the external dataset. Inter-reader agreement, measured with Cohen's weighted κ, ranged from 0.88 to 0.98 (each p < 0.001). ChatGPT-4's freely composed protocols were approved in 766/1,001 (76.5%) cases of the in-house dataset and 140/300 (46.7%) cases of the external dataset, with mean time savings (standard deviation) of 3:51 (±2:40) and 2:59 (±3:42) minutes:seconds per adopted protocol, respectively. ChatGPT-4 demonstrated very high agreement with board-certified (neuro-)radiologists in selecting MRI protocols and was able to suggest approved time-saving protocols from the set of available sequences.
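
As an illustration of this kind of prompting setup, the sketch below builds a protocol-selection prompt from RRF fields and queries the OpenAI chat API; the model name, protocol list, and prompt wording are assumptions, not the study's exact materials.

```python
# Hypothetical LLM-based MRI protocol selection from RRF fields.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROTOCOLS = ["Brain tumor protocol", "Stroke protocol", "Epilepsy protocol"]  # hypothetical list

def suggest_protocol(age: int, sex: str, clinical_info: str) -> str:
    prompt = (
        f"Patient: {age}-year-old {sex}. Clinical information: {clinical_info}\n"
        "Choose the single most appropriate MRI protocol from this list:\n"
        + "\n".join(f"- {p}" for p in PROTOCOLS)
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```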

Enhancement Without Contrast: Stability-Aware Multicenter Machine Learning for Glioma MRI Imaging

Sajad Amiri, Shahram Taeb, Sara Gharibi, Setareh Dehghanfard, Somayeh Sadat Mehrnia, Mehrdad Oveisi, Ilker Hacihaliloglu, Arman Rahmim, Mohammad R. Salmanpour

arXiv preprint · Sep 13, 2025
Gadolinium-based contrast agents (GBCAs) are central to glioma imaging but raise safety, cost, and accessibility concerns. Predicting contrast enhancement from non-contrast MRI using machine learning (ML) offers a safer alternative, as enhancement reflects tumor aggressiveness and informs treatment planning. Yet scanner and cohort variability hinder robust model selection. We propose a stability-aware framework to identify reproducible ML pipelines for multicenter prediction of glioma MRI contrast enhancement. We analyzed 1,446 glioma cases from four TCIA datasets (UCSF-PDGM, UPENN-GB, BRATS-Africa, BRATS-TCGA-LGG). Non-contrast T1WI served as input, with enhancement derived from paired post-contrast T1WI. Using PyRadiomics under IBSI standards, 108 features were extracted and combined with 48 dimensionality reduction methods and 25 classifiers, yielding 1,200 pipelines. In rotational validation, models were trained on three datasets and tested on the fourth. Cross-validation prediction accuracies ranged from 0.91 to 0.96, with external testing achieving 0.87 (UCSF-PDGM), 0.98 (UPENN-GB), and 0.95 (BRATS-Africa), with an average of 0.93. F1, precision, and recall were stable (0.87 to 0.96), while ROC-AUC varied more widely (0.50 to 0.82), reflecting cohort heterogeneity. The pipeline combining MI with ETr consistently ranked highest, balancing accuracy and stability. This framework demonstrates that stability-aware model selection enables reliable prediction of contrast enhancement from non-contrast glioma MRI, reducing reliance on GBCAs and improving generalizability across centers. It provides a scalable template for reproducible ML in neuro-oncology and beyond.
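
The sketch below illustrates one such pipeline (mutual-information feature selection with an extra-trees classifier) under leave-one-dataset-out rotational validation, using scikit-learn; the feature count and classifier settings are assumptions, and PyRadiomics feature extraction is presumed to have produced the feature matrices already.

```python
# Leave-one-dataset-out ("rotational") validation of one radiomics pipeline.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

def rotational_validation(datasets):  # datasets: dict name -> (X, y)
    scores = {}
    for held_out in datasets:
        train = [d for d in datasets if d != held_out]
        X_tr = np.vstack([datasets[d][0] for d in train])
        y_tr = np.concatenate([datasets[d][1] for d in train])
        pipe = make_pipeline(
            SelectKBest(mutual_info_classif, k=20),      # MI-based reduction (k assumed)
            ExtraTreesClassifier(n_estimators=300, random_state=0),
        )
        pipe.fit(X_tr, y_tr)
        X_te, y_te = datasets[held_out]
        scores[held_out] = accuracy_score(y_te, pipe.predict(X_te))
    return scores  # one external-test accuracy per held-out center
```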

The comparison of deep learning and radiomics in the prediction of polymyositis.

Wu G, Li B, Li T, Liu L

PubMed · Sep 12, 2025
T2-weighted magnetic resonance imaging has become a commonly used noninvasive examination for diagnosing polymyositis (PM), but data comparing deep learning and radiomics for PM diagnosis are still lacking. This study investigates the feasibility of a 3D convolutional neural network (CNN) for PM prediction, with comparison to radiomics. A total of 120 patients (60 with PM) came from center A, 30 (15 with PM) from center B, and 46 (23 with PM) from center C. Data from center A were used for training, data from center B for validation, and data from center C as the external test set. Magnetic resonance radiomics features of the rectus femoris were obtained for all cases. Maximum relevance minimum redundancy and least absolute shrinkage and selection operator regression were applied before establishing a radiomics score model. A 3D CNN classification model was trained with MONAI on the 150 labeled cases. A 3D U-Net segmentation model was also trained with MONAI on the 196 original volumes and their rectus femoris segmentations. Accuracy on the external test data was compared between the two methods using the paired chi-square test. PM and non-PM cases did not differ in age or gender (P > .05). The 3D CNN classification model achieved 97% accuracy on the validation data. Its sensitivity, specificity, accuracy, and positive predictive value on the external test data were 96% (22/23), 91% (21/23), 93% (43/46), and 92% (22/24), respectively. The radiomics score achieved 90% accuracy on the validation data; its sensitivity, specificity, accuracy, and positive predictive value on the external test data were 70% (16/23), 65% (15/23), 67% (31/46), and 67% (16/24), respectively, significantly lower than the CNN model (P = .035). The 3D segmentation model for the rectus femoris on T2-weighted images achieved a Dice similarity coefficient of 0.71. The 3D CNN model is not inferior to the radiomics score for the prediction of PM. The combination of deep learning and radiomics is recommended for the evaluation of PM in future clinical practice.
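
A minimal sketch of a MONAI-based 3D CNN classifier in the spirit of this setup is shown below; the network choice (DenseNet121) and hyperparameters are assumptions rather than the authors' exact configuration.

```python
# Hypothetical MONAI 3D classifier for PM vs. non-PM.
import torch
from monai.networks.nets import DenseNet121

model = DenseNet121(spatial_dims=3, in_channels=1, out_channels=2)  # 3D classifier
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

def train_step(volumes, labels):     # volumes: (B, 1, D, H, W) T2-weighted crops
    optimizer.zero_grad()
    loss = loss_fn(model(volumes), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```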

Regional attention-enhanced vision transformer for accurate Alzheimer's disease classification using sMRI data.

Jomeiri A, Habibizad Navin A, Shamsi M

PubMed · Sep 12, 2025
Alzheimer's disease (AD) poses a significant global health challenge, necessitating early and accurate diagnosis to enable timely intervention. Structural MRI (sMRI) is a key imaging modality for detecting AD-related brain atrophy, yet traditional deep learning models like convolutional neural networks (CNNs) struggle to capture complex spatial dependencies critical for AD diagnosis. This study introduces the Regional Attention-Enhanced Vision Transformer (RAE-ViT), a novel framework designed for AD classification using sMRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. RAE-ViT leverages regional attention mechanisms to prioritize disease-critical brain regions, such as the hippocampus and ventricles, while integrating hierarchical self-attention and multi-scale feature extraction to model both localized and global structural patterns. Evaluated on 1152 sMRI scans (255 AD, 521 MCI, 376 NC), RAE-ViT achieved state-of-the-art performance with 94.2% accuracy, 91.8% sensitivity, 95.7% specificity, and an AUC of 0.96, surpassing standard ViTs (89.5%) and CNN-based models (e.g., ResNet-50: 87.8%). The model's interpretable attention maps align closely with clinical biomarkers (Dice: 0.89 hippocampus, 0.85 ventricles), enhancing diagnostic reliability. Robustness to scanner variability (92.5% accuracy on 1.5T scans) and noise (92.5% accuracy under 10% Gaussian noise) further supports its clinical applicability. A preliminary multimodal extension integrating sMRI and PET data improved accuracy to 95.8%. Future work will focus on optimizing RAE-ViT for edge devices, incorporating multimodal data (e.g., PET, fMRI, genetic), and exploring self-supervised and federated learning to enhance generalizability and privacy. RAE-ViT represents a significant advancement in AI-driven AD diagnosis, offering potential for early detection and improved patient outcomes.
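
The abstract does not specify the regional attention mechanism; the PyTorch sketch below shows one plausible reading, in which patch tokens overlapping atlas-defined regions such as the hippocampus are up-weighted before the transformer blocks. The gating scheme is purely illustrative.

```python
# Illustrative "regional attention" gating over ViT patch tokens.
import torch
import torch.nn as nn

class RegionalAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim + 1, dim), nn.Sigmoid())

    def forward(self, tokens, region_mask):
        # tokens: (B, N, dim) patch embeddings; region_mask: (B, N) fraction of
        # each patch overlapping an atlas-defined disease-critical region.
        gate = self.gate(torch.cat([tokens, region_mask.unsqueeze(-1)], dim=-1))
        return tokens * (1.0 + gate * region_mask.unsqueeze(-1))  # emphasize ROI patches
```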

SSL-AD: Spatiotemporal Self-Supervised Learning for Generalizability and Adaptability Across Alzheimer's Prediction Tasks and Datasets

Emily Kaczmarek, Justin Szeto, Brennan Nichyporuk, Tal Arbel

arXiv preprint · Sep 12, 2025
Alzheimer's disease is a progressive, neurodegenerative disorder that causes memory loss and cognitive decline. While there has been extensive research in applying deep learning models to Alzheimer's prediction tasks, these models remain limited by lack of available labeled data, poor generalization across datasets, and inflexibility to varying numbers of input scans and time intervals between scans. In this study, we adapt three state-of-the-art temporal self-supervised learning (SSL) approaches for 3D brain MRI analysis, and add novel extensions designed to handle variable-length inputs and learn robust spatial features. We aggregate four publicly available datasets comprising 3,161 patients for pre-training, and show the performance of our model across multiple Alzheimer's prediction tasks including diagnosis classification, conversion detection, and future conversion prediction. Importantly, our SSL model implemented with temporal order prediction and contrastive learning outperforms supervised learning on six out of seven downstream tasks. It demonstrates adaptability and generalizability across tasks and number of input images with varying time intervals, highlighting its capacity for robust performance across clinical applications. We release our code and model publicly at https://github.com/emilykaczmarek/SSL-AD.
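
As a flavor of the temporal pretext tasks involved, here is a small PyTorch sketch of temporal order prediction between two scans of the same patient; dimensions and the head design are placeholders, and the authors' actual implementation is in the linked repository.

```python
# Temporal-order-prediction pretext task: given embeddings of two scans,
# predict whether they are presented in chronological order.
import torch
import torch.nn as nn

class TemporalOrderHead(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, z_a, z_b):          # embeddings of two timepoints
        return self.classifier(torch.cat([z_a, z_b], dim=-1))  # logit: "a precedes b"

def order_loss(head, z_t0, z_t1):
    # Positive pair in true order, negative pair with the order swapped.
    logits = torch.cat([head(z_t0, z_t1), head(z_t1, z_t0)])
    target = torch.cat([torch.ones(len(z_t0), 1), torch.zeros(len(z_t0), 1)])
    return nn.functional.binary_cross_entropy_with_logits(logits, target)
```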

Ex vivo human brain volumetry: Validation of MRI measurements.

Gérin-Lajoie A, Adame-Gonzalez W, Frigon EM, Guerra Sanches L, Nayouf A, Boire D, Dadar M, Maranzano J

PubMed · Sep 12, 2025
The volume of in vivo human brains is determined with various MRI measurement tools that have not been assessed against a gold standard. The purpose of this study was to validate MRI brain volumes by scanning ex vivo, in situ specimens, which allows extraction of the brain after the scan to compare its volume with the gold-standard water displacement method (WDM). 3T MRI T2-weighted, T1-weighted, and MP2RAGE images of seven anatomical heads fixed with an alcohol-formaldehyde solution were acquired. The gray and white matter were assessed using two methods: (i) manual intensity-based threshold segmentation using Display (MINC-ToolKit) and (ii) an automatic deep learning-based segmentation tool (SynthSeg). The brains were extracted and their volumes measured with the WDM after removal of the meninges and a midsagittal cut. Volumes from all methods were compared with the ground truth (WDM volumes) using a repeated-measures analysis of variance. Mean brain volumes, in cubic centimeters, were 1111.14 ± 121.78 for WDM, 1020.29 ± 70.01 for manual T2-weighted, 1056.29 ± 90.54 for automatic T2-weighted, 1094.69 ± 100.51 for automatic T1-weighted, 1066.56 ± 96.52 for automatic MP2RAGE first inversion time, and 1156.18 ± 121.87 for automatic MP2RAGE second inversion time. All volumetry methods differed significantly (F = 17.874; p < 0.001) from the WDM volumes, except the automatic T1-weighted volumes. SynthSeg accurately determined brain volume in ex vivo, in situ T1-weighted MRI scans. The results suggest that, given the contrast similarity between the ex vivo and in vivo sequences, the brain volumes of clinical studies are most probably sufficiently accurate, with some degree of underestimation depending on the sequence used.
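
For context, an MRI-derived brain volume of the kind compared against WDM here reduces to counting labeled voxels and scaling by voxel size; a short nibabel sketch (with hypothetical label values) follows.

```python
# Brain volume in cubic centimeters from a segmentation image.
import numpy as np
import nibabel as nib

def brain_volume_cc(seg_path: str, brain_labels=(1, 2)) -> float:
    img = nib.load(seg_path)
    seg = np.asarray(img.dataobj)
    voxel_mm3 = np.prod(img.header.get_zooms()[:3])      # mm^3 per voxel
    n_voxels = np.isin(seg, brain_labels).sum()          # e.g., GM + WM labels
    return float(n_voxels * voxel_mm3 / 1000.0)          # mm^3 -> cm^3
```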

Predicting molecular subtypes of pediatric medulloblastoma using MRI-based artificial intelligence: A systematic review and meta-analysis.

Liu J, Zou Z, He Y, Guo Z, Yi C, Huang B

PubMed · Sep 12, 2025
This meta-analysis assesses the diagnostic performance of artificial intelligence (AI) based on magnetic resonance imaging (MRI) in detecting molecular subtypes of pediatric medulloblastoma (MB). A thorough literature review was performed using PubMed, Embase, and Web of Science to locate pertinent studies released prior to October 2024. Selected studies focused on the diagnostic performance of MRI-based AI in detecting molecular subtypes of pediatric MB. A bivariate random-effects model was used to calculate pooled sensitivity and specificity, both with 95% confidence intervals (CI). Study heterogeneity was assessed using I² statistics. Of the 540 studies identified, eight (involving 1195 patients) were included. For the wingless (WNT) subtype, the pooled sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were 0.73 (95% CI: 0.61-0.83, I² = 19%), 0.94 (95% CI: 0.79-0.99, I² = 93%), and 0.80 (95% CI: 0.77-0.83), respectively. For the sonic hedgehog (SHH) subtype, they were 0.64 (95% CI: 0.51-0.75, I² = 69%), 0.84 (95% CI: 0.80-0.88, I² = 54%), and 0.85 (95% CI: 0.81-0.88), respectively. For Group 3 (G3), they were 0.89 (95% CI: 0.52-0.98, I² = 82%), 0.70 (95% CI: 0.62-0.77, I² = 44%), and 0.88 (95% CI: 0.84-0.90), respectively. For Group 4 (G4), they were 0.77 (95% CI: 0.64-0.87, I² = 54%), 0.91 (95% CI: 0.68-0.98, I² = 80%), and 0.86 (95% CI: 0.83-0.89), respectively. MRI-based AI shows high diagnostic performance in detecting molecular subtypes of pediatric MB. However, all included studies employed retrospective designs, which may introduce bias. More research using external validation datasets is needed to confirm the results and assess their clinical applicability.
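
The paper pools sensitivity and specificity jointly with a bivariate random-effects model; as a simplified stand-in, the sketch below pools a single proportion on the logit scale with a DerSimonian-Laird estimator, which conveys the basic mechanics but not the bivariate correlation structure.

```python
# Univariate random-effects pooling of sensitivities on the logit scale
# (DerSimonian-Laird); a simplification of the bivariate model in the paper.
import numpy as np

def pooled_logit(tp, total):
    tp, total = np.asarray(tp, float), np.asarray(total, float)
    p = (tp + 0.5) / (total + 1.0)                 # continuity-corrected proportions
    y = np.log(p / (1 - p))                        # logit transform
    v = 1 / (tp + 0.5) + 1 / (total - tp + 0.5)    # within-study variances
    w = 1 / v
    y_fe = np.sum(w * y) / w.sum()                 # fixed-effect mean
    q = np.sum(w * (y - y_fe) ** 2)                # Cochran's Q
    tau2 = max(0.0, (q - (len(y) - 1)) / (w.sum() - np.sum(w**2) / w.sum()))
    w_re = 1 / (v + tau2)                          # random-effects weights
    pooled = np.sum(w_re * y) / w_re.sum()
    return 1 / (1 + np.exp(-pooled))               # back-transform to a proportion
```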

Cardiac Magnetic Resonance Imaging in the German National Cohort (NAKO): Automated Segmentation of Short-Axis Cine Images and Post-Processing Quality Control.

Full PM, Schirrmeister RT, Hein M, Russe MF, Reisert M, Ammann C, Greiser KH, Niendorf T, Pischon T, Schulz-Menger J, Maier-Hein KH, Bamberg F, Rospleszcz S, Schlett CL, Schuppert C

PubMed · Sep 12, 2025
The prospective, multicenter German National Cohort (NAKO) provides a unique dataset of cardiac magnetic resonance (CMR) cine images. Effective processing of these images requires a robust segmentation and quality control pipeline. A deep learning model for semantic segmentation, based on the nnU-Net architecture, was applied to full-cycle short-axis cine images from 29,908 baseline participants. The primary objective was to derive structural and functional parameters for both ventricles (LV, RV), including end-diastolic volumes (EDV), end-systolic volumes (ESV), and LV myocardial mass. Quality control measures included visual assessment of outliers in morphofunctional parameters, inter- and intra-ventricular phase differences, and time-volume curves (TVC). These were adjudicated on a five-point rating scale ranging from five (excellent) to one (non-diagnostic), with ratings of three or lower subject to exclusion. The predictive value of the outlier criteria for inclusion and exclusion was evaluated using receiver operating characteristic (ROC) analysis. The segmentation model generated complete data for 29,609 participants (incomplete in 1.0%), of which 5,082 cases (17.0%) underwent visual assessment. Quality assurance yielded a sample of 26,899 (90.8%) participants with excellent or good quality, excluding 1,875 participants due to image quality issues and 835 due to segmentation quality issues. TVC was the strongest single discriminator between included and excluded participants (AUC: 0.684). Of the two-category combinations, pairing TVC with phase differences provided the greatest improvement over TVC alone (AUC difference: 0.044; p < 0.001), and the best performance was observed when all three categories were combined (AUC: 0.748). Extending the quality-controlled sample to include mid-level 'acceptable' quality ratings allowed a total of 28,413 (96.0%) participants to be included. The implemented pipeline facilitated automated segmentation of an extensive CMR dataset with integrated quality control measures. This methodology ensures that ensuing quantitative analyses are conducted with a diminished risk of bias.
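
To make the TVC criterion concrete, the sketch below derives a simple smoothness score from a per-phase LV volume curve and evaluates it as an exclusion discriminator with ROC analysis; the score definition is an assumption for illustration, not the study's criterion.

```python
# Illustrative time-volume-curve (TVC) quality-control signal.
import numpy as np
from sklearn.metrics import roc_auc_score

def tvc_smoothness(lv_volumes_ml: np.ndarray) -> float:
    # Large frame-to-frame jumps relative to the volume range suggest
    # segmentation failures somewhere in the cardiac cycle.
    jumps = np.abs(np.diff(lv_volumes_ml))
    return jumps.max() / (lv_volumes_ml.max() - lv_volumes_ml.min() + 1e-6)

def criterion_auc(tvc_list, excluded_flags):
    scores = np.array([tvc_smoothness(v) for v in tvc_list])
    return roc_auc_score(excluded_flags, scores)   # cf. the reported TVC AUC of 0.684
```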