A review on learning-based algorithms for tractography and human brain white matter tracts recognition.

Barati Shoorche A, Farnia P, Makkiabadi B, Leemans A

pubmed · Jun 4 2025
Human brain fiber tractography using diffusion magnetic resonance imaging is a crucial stage in mapping brain white matter structures, pre-surgical planning, and extracting connectivity patterns. Accurate and reliable tractography, by providing detailed geometric information about the position of neural pathways, minimizes the risk of damage during neurosurgical procedures. Both tractography itself and its post-processing steps, such as bundle segmentation, are used in these contexts. Many approaches have been put forward in the past decades, and recently multiple data-driven tractography algorithms and automatic segmentation pipelines have been proposed to address the limitations of traditional methods. Several of these recent methods are based on learning algorithms that have demonstrated promising results. In this study, in addition to introducing diffusion MRI datasets, we review learning-based algorithms such as conventional machine learning, deep learning, reinforcement learning, and dictionary learning methods that have been used for white matter tract, nerve, and pathway recognition as well as whole-brain streamline or whole-brain tractogram creation. The contributions are to discuss both tractography and tract recognition methods, to extend previous related reviews with the most recent methods (covering architectures as well as network details), to assess the efficiency of learning-based methods through a comprehensive comparison in this field, and finally to demonstrate the important role of learning-based methods in tractography.

ReXVQA: A Large-scale Visual Question Answering Benchmark for Generalist Chest X-ray Understanding

Ankit Pal, Jung-Oh Lee, Xiaoman Zhang, Malaikannan Sankarasubbu, Seunghyeon Roh, Won Jung Kim, Meesun Lee, Pranav Rajpurkar

arxiv preprint · Jun 4 2025
We present ReXVQA, the largest and most comprehensive benchmark for visual question answering (VQA) in chest radiology, comprising approximately 696,000 questions paired with 160,000 chest X-ray studies across training, validation, and test sets. Unlike prior efforts that rely heavily on template-based queries, ReXVQA introduces a diverse and clinically authentic task suite reflecting five core radiological reasoning skills: presence assessment, location analysis, negation detection, differential diagnosis, and geometric reasoning. We evaluate eight state-of-the-art multimodal large language models, including MedGemma-4B-it, Qwen2.5-VL, Janus-Pro-7B, and Eagle2-9B. The best-performing model (MedGemma) achieves 83.24% overall accuracy. To bridge the gap between AI performance and clinical expertise, we conducted a comprehensive human reader study involving 3 radiology residents on 200 randomly sampled cases. Our evaluation demonstrates that MedGemma achieved superior performance (83.84% accuracy) compared to human readers (best radiology resident: 77.27%), representing a significant milestone where AI performance exceeds expert human evaluation on chest X-ray interpretation. The reader study reveals distinct performance patterns between AI models and human experts, with strong inter-reader agreement among radiologists but more variable agreement patterns between human readers and AI models. ReXVQA establishes a new standard for evaluating generalist radiological AI systems, offering public leaderboards, fine-grained evaluation splits, structured explanations, and category-level breakdowns. This benchmark lays the foundation for next-generation AI systems capable of mimicking expert-level clinical reasoning beyond narrow pathology classification. Our dataset will be open-sourced at https://huggingface.co/datasets/rajpurkarlab/ReXVQA
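Since the dataset is slated for release on the Hugging Face Hub, evaluation against it would presumably follow the standard datasets workflow. Below is a minimal sketch; the split name and the answer field name are assumptions, as the abstract does not describe the schema.

```python
# Hypothetical evaluation loop over ReXVQA; "test" and "answer" are assumed names.
from datasets import load_dataset

ds = load_dataset("rajpurkarlab/ReXVQA", split="test")  # assumed split name

def accuracy(predictions, examples):
    """Fraction of cases where the model chose the reference option."""
    correct = sum(p == ex["answer"] for p, ex in zip(predictions, examples))
    return correct / len(examples)
```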

Enhanced risk stratification for stage II colorectal cancer using deep learning-based CT classifier and pathological markers to optimize adjuvant therapy decision.

Huang YQ, Chen XB, Cui YF, Yang F, Huang SX, Li ZH, Ying YJ, Li SY, Li MH, Gao P, Wu ZQ, Wen G, Wang ZS, Wang HX, Hong MP, Diao WJ, Chen XY, Hou KQ, Zhang R, Hou J, Fang Z, Wang ZN, Mao Y, Wee L, Liu ZY

pubmed · Jun 4 2025
Current risk stratification for stage II colorectal cancer (CRC) has limited accuracy in identifying patients who would benefit from adjuvant chemotherapy, leading to potential over- or under-treatment. We aimed to develop a more precise risk stratification system by integrating artificial intelligence-based imaging analysis with pathological markers. We analyzed 2,992 stage II CRC patients from 12 centers. A deep learning classifier (Swin Transformer Assisted Risk-stratification for CRC, STAR-CRC) was developed using multi-planar CT images from 1,587 patients (training:internal validation = 7:3) and validated in 1,405 patients from 8 independent centers; it stratified patients into low-, uncertain-, and high-risk groups. To further refine the uncertain-risk group, a composite score based on pathological markers (pT4 stage, number of lymph nodes sampled, perineural invasion, and lymphovascular invasion) was applied, forming the intelligent risk integration system for stage II CRC (IRIS-CRC). IRIS-CRC was compared against the guideline-based risk stratification system (GRSS-CRC) for prediction performance in the validation dataset. IRIS-CRC stratified patients into four prognostic groups with distinct 3-year disease-free survival rates (≥95%, 95-75%, 75-55%, ≤55%). Upon external validation, compared to GRSS-CRC, IRIS-CRC downstaged 27.1% of high-risk patients into the Favorable group, while upstaging 6.5% of low-risk patients into the Very Poor prognosis group, who might require more aggressive treatment. In the GRSS-CRC intermediate-risk group of the external validation dataset, IRIS-CRC reclassified 40.1% as Favorable prognosis and 7.0% as Very Poor prognosis. IRIS-CRC's performance remained generalizable in both the chemotherapy and non-chemotherapy cohorts. IRIS-CRC offers a more precise and personalized risk assessment than current guideline-based risk factors, potentially sparing low-risk patients from unnecessary adjuvant chemotherapy while identifying high-risk individuals for more aggressive treatment. This novel approach holds promise for improving clinical decision-making and outcomes in stage II CRC.
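The two-stage decision flow described above lends itself to a simple schematic. The sketch below is only an illustration of the logic: the classifier cut-offs, the composite-score weighting, and the lymph-node-count criterion are placeholders, not the published values.

```python
# Schematic of the IRIS-CRC two-stage flow; all thresholds are placeholders.
def star_crc_group(ct_risk_score: float) -> str:
    """Stage 1: deep-learning CT classifier output -> coarse risk group."""
    if ct_risk_score < 0.3:   # placeholder cut-off
        return "low"
    if ct_risk_score > 0.7:   # placeholder cut-off
        return "high"
    return "uncertain"

def iris_crc_group(ct_risk_score: float, pT4: bool, nodes_sampled: int,
                   pni: bool, lvi: bool) -> str:
    """Stage 2: refine the uncertain group with a pathological composite score."""
    group = star_crc_group(ct_risk_score)
    if group == "low":
        return "Favorable"
    if group == "high":
        return "Very Poor"
    # Composite pathological score; equal weights assumed for illustration.
    score = int(pT4) + int(nodes_sampled < 12) + int(pni) + int(lvi)
    return "Poor" if score >= 2 else "Intermediate"
```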

Deep Learning-Based Opportunistic CT Osteoporosis Screening and Establishment of Normative Values

Westerhoff, M., Gyftopoulos, S., Dane, B., Vega, E., Murdock, D., Lindow, N., Herter, F., Bousabarah, K., Recht, M. P., Bredella, M. A.

medrxiv preprint · Jun 3 2025
Background: Osteoporosis is underdiagnosed and undertreated, prompting the exploration of opportunistic screening using CT and artificial intelligence (AI). Purpose: To develop a reproducible deep learning-based convolutional neural network to automatically place a 3D region of interest (ROI) in trabecular bone, to develop a correction method to normalize attenuation across different CT protocols and scanner models, and to establish thresholds for osteoporosis in a large diverse population. Methods: A deep learning-based method was developed to automatically quantify trabecular attenuation using a 3D ROI of the thoracic and lumbar spine on chest, abdomen, or spine CTs, adjusted for different tube voltages and scanner models. Normative values and thresholds for osteoporosis of trabecular attenuation of the spine were established across a diverse population, stratified by age, sex, race, and ethnicity, using the prevalence of osteoporosis reported by the WHO. Results: 538,946 CT examinations from 283,499 patients (mean age 65 ± 15 years; 51.2% women and 55.5% White), performed on 50 scanner models using six different tube voltages, were analyzed. Hounsfield units at 80 kVp versus 120 kVp differed by 23%, whereas different scanner models resulted in value differences of less than 10%. Automated ROI placement in 1,496 vertebrae was validated by manual radiologist review, demonstrating greater than 99% agreement. Mean trabecular attenuation was higher in young women (<50 years) than young men (p<.001) and decreased with age, with a steeper decline in postmenopausal women. In patients older than 50 years, trabecular attenuation was higher in males than females (p<.001). Trabecular attenuation was highest in Blacks, followed by Asians, and lowest in Whites (p<.001). The threshold at L1 for diagnosing osteoporosis was 80 HU. Conclusion: Deep learning-based automated opportunistic osteoporosis screening can identify patients with low bone mineral density who undergo CT scans for clinical purposes on different scanners and protocols. Key results: (1) In a study of 538,946 CT examinations performed in 283,499 patients using different scanner models and imaging protocols, an automated deep learning-based convolutional neural network was able to accurately place a three-dimensional region of interest within thoracic and lumbar vertebrae to measure trabecular attenuation. (2) Tube voltage had a larger influence on attenuation values (23%) than scanner model (<10%). (3) A threshold of 80 HU was identified for L1 to diagnose osteoporosis using an automated three-dimensional region of interest.
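The screening logic reduces to normalizing the ROI measurement across acquisition settings and comparing against the L1 threshold. A minimal sketch follows; the 23% offset at 80 kVp is taken from the abstract, while the correction direction and the factor for other tube voltages are assumptions.

```python
# Minimal sketch of opportunistic screening: normalize to a 120 kVp reference,
# then apply the reported 80 HU threshold at L1. The 100 kVp factor and the
# direction of the correction are assumptions, not the published calibration.
KVP_CORRECTION = {80: 1.0 / 1.23, 100: 1.0 / 1.10, 120: 1.0}

def normalized_hu(mean_roi_hu: float, kvp: int) -> float:
    """Scale a trabecular ROI measurement to the 120 kVp reference."""
    return mean_roi_hu * KVP_CORRECTION[kvp]

def osteoporosis_flag(l1_hu: float, kvp: int, threshold: float = 80.0) -> bool:
    """Flag patients whose corrected L1 attenuation falls below 80 HU."""
    return normalized_hu(l1_hu, kvp) < threshold
```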

Deep learning-based automatic segmentation of arterial vessel walls and plaques in MR vessel wall images for quantitative assessment.

Yang L, Yang X, Gong Z, Mao Y, Lu SS, Zhu C, Wan L, Huang J, Mohd Noor MH, Wu K, Li C, Cheng G, Li Y, Liang D, Liu X, Zheng H, Hu Z, Zhang N

pubmed · Jun 3 2025
To develop and validate a deep-learning-based automatic method for vessel wall and atherosclerotic plaque segmentation for quantitative evaluation in MR vessel wall images. A total of 193 patients (107 patients for training and validation, 39 patients for internal testing, 47 patients for external testing) with atherosclerotic plaque from five centers underwent T1-weighted MRI scans and were included in the dataset. The first step of the proposed method was constructing a purely learning-based convolutional neural network (CNN) named Vessel-SegNet to segment the lumen and the vessel wall. The second step was using vessel wall priors (including a manual prior and a Tversky-loss-based automatic prior) to improve the plaque segmentation, exploiting the morphological similarity between the vessel wall and the plaque. The Dice similarity coefficient (DSC), intraclass correlation coefficient (ICC), etc., were used to evaluate similarity, agreement, and correlations. Most of the DSCs for lumen and vessel wall segmentation were above 90%. The introduction of vessel wall priors increased the DSC for plaque segmentation by over 10%, reaching 88.45%. Compared to Dice-loss-based vessel wall priors, the Tversky-loss-based priors further improved DSC by nearly 3%, reaching 82.84%. Most of the ICC values between the Vessel-SegNet and manual methods in the 6 quantitative measurements were greater than 85% (p-value < 0.001). The proposed CNN-based segmentation model can quickly and accurately segment vessel walls and plaques for quantitative evaluation. Because the method has not been tested on other equipment, populations, and anatomies, the reliability of the results still requires further exploration. Question: How can the accuracy and efficiency of vessel component segmentation for quantification, including the lumen, vessel wall, and plaque, be improved? Findings: Improved CNN models, manual/automatic vessel wall priors, and the Tversky loss can improve the performance of semi-automatic/automatic vessel component segmentation for quantification. Clinical relevance: Manual segmentation of vessel components is a time-consuming yet important process. Rapid and accurate segmentation of the lumen, vessel walls, and plaques for quantitative assessment helps patients obtain more accurate, efficient, and timely stroke risk assessments and clinical recommendations.
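For reference, the Tversky loss used for the automatic prior generalizes the Dice loss by weighting false positives and false negatives separately. A minimal PyTorch version is sketched below; the alpha/beta values are common defaults for penalizing false negatives more heavily, not the paper's reported settings.

```python
# Tversky loss: TI = TP / (TP + alpha*FP + beta*FN); alpha = beta = 0.5 is Dice.
import torch

def tversky_loss(pred, target, alpha=0.3, beta=0.7, eps=1e-6):
    """pred, target: tensors of shape (N, ...) with values in [0, 1]."""
    pred = pred.reshape(pred.shape[0], -1)
    target = target.reshape(target.shape[0], -1)
    tp = (pred * target).sum(dim=1)
    fp = (pred * (1 - target)).sum(dim=1)
    fn = ((1 - pred) * target).sum(dim=1)
    tversky = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return (1.0 - tversky).mean()
```

With beta larger than alpha, missed plaque voxels (false negatives) cost more than spurious ones, which is the usual motivation for preferring Tversky over Dice on small structures.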

Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning

Negin Baghbanzadeh, Sajad Ashkezari, Elham Dolatabadi, Arash Afkanpour

arxiv preprint · Jun 3 2025
Compound figures, which are multi-panel composites containing diverse subfigures, are ubiquitous in biomedical literature, yet large-scale subfigure extraction remains largely unaddressed. Prior work on subfigure extraction has been limited in both dataset size and generalizability, leaving a critical open question: How does high-fidelity image-text alignment via large-scale subfigure extraction impact representation learning in vision-language models? We address this gap by introducing a scalable subfigure extraction pipeline based on transformer-based object detection, trained on a synthetic corpus of 500,000 compound figures, and achieving state-of-the-art performance on both ImageCLEF 2016 and synthetic benchmarks. Using this pipeline, we release OPEN-PMC-18M, a large-scale, high-quality biomedical vision-language dataset comprising 18 million clinically relevant subfigure-caption pairs spanning radiology, microscopy, and visible light photography. We train and evaluate vision-language models on our curated datasets and show improved performance across retrieval, zero-shot classification, and robustness benchmarks, outperforming existing baselines. We release our dataset, models, and code to support reproducible benchmarks and further study into biomedical vision-language modeling and representation learning.
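Conceptually, the extraction step amounts to running an object detector over a compound figure and cropping each detected panel. The sketch below uses a generic DETR checkpoint from the transformers library as a stand-in; it is not the model released with the paper.

```python
# Hedged sketch of subfigure extraction with a transformer-based detector.
from PIL import Image
import torch
from transformers import DetrImageProcessor, DetrForObjectDetection

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

def extract_subfigures(path, score_threshold=0.8):
    """Detect panels in a compound figure and return them as cropped images."""
    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
    results = processor.post_process_object_detection(
        outputs, threshold=score_threshold, target_sizes=target_sizes
    )[0]
    return [image.crop(box.tolist()) for box in results["boxes"]]
```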

Computer-Aided Decision Support Systems of Alzheimer's Disease Diagnosis - A Systematic Review.

Günaydın T, Varlı S

pubmed · Jun 3 2025
The incidence of Alzheimer's disease is rising with the increasing elderly population worldwide. While no cure exists, early diagnosis can significantly slow disease progression. Computer-aided diagnostic systems are becoming critical tools for assisting in the early detection of Alzheimer's disease. In this systematic review, we aim to evaluate recent advancements in computer-aided decision support systems for Alzheimer's disease diagnosis, focusing on data modalities, machine learning methods, and performance metrics. We conducted a systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Studies published between 2021 and 2024 were retrieved from PubMed, IEEE Xplore, and Web of Science, using search terms related to Alzheimer's disease classification, neuroimaging, machine learning, and diagnostic performance. A total of 39 studies met the inclusion criteria, focusing on the use of Magnetic Resonance Imaging, Positron Emission Tomography, and biomarkers for Alzheimer's disease classification using machine learning models. Multimodal approaches, combining Magnetic Resonance Imaging with Positron Emission Tomography and cognitive assessments, outperformed single-modality studies in diagnostic accuracy and reliability. Convolutional Neural Networks were the most commonly used machine learning models, followed by hybrid models and Random Forest. The highest accuracy reported for binary classification was 100%, while multi-class classification achieved up to 99.98%. Techniques like the Synthetic Minority Over-sampling Technique and data augmentation were frequently employed to address data imbalance, improving model generalizability. Our review highlights the advantages of using multimodal data in computer-aided decision support systems for more accurate Alzheimer's disease diagnosis. However, we also identified several limitations, including data imbalance, small sample sizes, and the lack of external validation in most studies. Future research should utilize larger, more diverse datasets, incorporate longitudinal data, and validate models in real-world clinical trials. Additionally, there is a growing need for explainability in machine learning models to ensure they are interpretable and trusted in clinical settings. While computer-aided decision support systems show great promise in improving the early diagnosis of Alzheimer's disease, further work is needed to enhance their robustness, generalizability, and clinical applicability. By addressing these challenges, computer-aided decision support systems could play a pivotal role in the early detection and management of Alzheimer's disease, potentially improving patient outcomes and reducing healthcare costs.
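As a concrete illustration of the imbalance-handling techniques the review highlights, the snippet below applies SMOTE from imbalanced-learn to a synthetic, skewed feature matrix; it is a generic example, not drawn from any of the reviewed studies.

```python
# SMOTE on a synthetic imbalanced dataset: minority class is oversampled by
# interpolating between existing minority samples.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

X, y = make_classification(
    n_samples=500, n_features=20, weights=[0.9, 0.1], random_state=0
)
print("before:", Counter(y))                       # heavily skewed classes
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after:", Counter(y_res))                    # balanced classes
```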

Patient-specific prostate segmentation in kilovoltage images for radiation therapy intrafraction monitoring via deep learning.

Mylonas A, Li Z, Mueller M, Booth JT, Brown R, Gardner M, Kneebone A, Eade T, Keall PJ, Nguyen DT

pubmed · Jun 3 2025
During radiation therapy, the natural movement of organs can lead to underdosing the cancer and overdosing the healthy tissue, compromising treatment efficacy. Real-time image-guided adaptive radiation therapy can track the tumour and account for the motion. Typically, fiducial markers are implanted as a surrogate for the tumour position due to the low radiographic contrast of soft tissues in kilovoltage (kV) images. A segmentation approach that does not require markers would eliminate the costs, delays, and risks associated with marker implantation. We trained patient-specific conditional Generative Adversarial Networks for prostate segmentation in kV images. The networks were trained using synthetic kV images generated from each patient's own imaging and planning data, which are available prior to the commencement of treatment. We validated the networks on two treatment fractions from 30 patients using multi-centre data from two clinical trials. Here, we present a large-scale proof-of-principle study of X-ray-based markerless prostate segmentation for globally available cancer therapy systems. Our results demonstrate the feasibility of a deep learning approach using kV images to track prostate motion across the entire treatment arc for 30 patients with prostate cancer. The mean absolute deviation was 1.4 mm and 1.6 mm in the anterior-posterior/lateral and superior-inferior directions, respectively. Markerless segmentation via deep learning may enable real-time image guidance on conventional cancer therapy systems without requiring implanted markers or additional hardware, thereby expanding access to real-time adaptive radiation therapy.
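The quoted tracking error can be understood as a per-axis mean absolute deviation between predicted and reference prostate positions across kV frames. The sketch below assumes positions are reduced to segmentation-mask centroids, which the abstract does not spell out.

```python
# Per-axis mean absolute deviation between predicted and reference positions,
# with positions taken as binary-mask centroids (an assumed convention).
import numpy as np

def centroid(mask: np.ndarray) -> np.ndarray:
    """Centroid (row, col) of a binary segmentation mask."""
    ys, xs = np.nonzero(mask)
    return np.array([ys.mean(), xs.mean()])

def mean_absolute_deviation(pred_masks, ref_masks, mm_per_pixel=1.0):
    """Per-axis MAD in mm across a sequence of kV frames."""
    diffs = np.array(
        [centroid(p) - centroid(r) for p, r in zip(pred_masks, ref_masks)]
    )
    return np.abs(diffs).mean(axis=0) * mm_per_pixel
```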

Evaluating the Diagnostic Accuracy of ChatGPT-4.0 for Classifying Multimodal Musculoskeletal Masses: A Comparative Study with Human Raters.

Bosbach WA, Schoeni L, Beisbart C, Senge JF, Mitrakovic M, Anderson SE, Achangwa NR, Divjak E, Ivanac G, Grieser T, Weber MA, Maurer MH, Sanal HT, Daneshvar K

pubmed · Jun 3 2025
Novel artificial intelligence tools have the potential to significantly enhance productivity in medicine while maintaining or even improving treatment quality. In this study, we aimed to evaluate the current capability of ChatGPT-4.0 to accurately interpret multimodal musculoskeletal tumor cases. We created 25 cases, each containing images from X-ray, computed tomography, magnetic resonance imaging, or scintigraphy. ChatGPT-4.0 was tasked with classifying each case using a six-option, two-choice question, where both a primary and a secondary diagnosis were allowed. For performance evaluation, human raters also assessed the same cases. When only the primary diagnosis was taken into account, the accuracy of human raters was greater than that of ChatGPT-4.0 by a factor of nearly 2 (87% vs. 44%). However, in a setting that also considered secondary diagnoses, the performance gap shrank substantially (accuracy: 94% vs. 71%). Power analysis relying on Cohen's w confirmed the adequacy of the sample size (n = 25). The tested artificial intelligence tool demonstrated lower performance than human raters. Considering factors such as speed, constant availability, and potential future improvements, it appears plausible that artificial intelligence tools could serve as valuable assistance systems for doctors in future clinical settings. Key points: ChatGPT-4.0 classifies musculoskeletal cases using multimodal imaging inputs. Human raters outperform AI in primary diagnosis accuracy by a factor of nearly two. Including secondary diagnoses improves AI performance and narrows the gap. AI demonstrates potential as an assistive tool in future radiological workflows. Power analysis confirms the robustness of the study findings with the current sample size. Citation: Bosbach WA, Schoeni L, Beisbart C et al. Evaluating the Diagnostic Accuracy of ChatGPT-4.0 for Classifying Multimodal Musculoskeletal Masses: A Comparative Study with Human Raters. Rofo 2025; DOI 10.1055/a-2594-7085.
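To make the power analysis concrete: Cohen's w measures the distance between observed and expected answer proportions, and the achieved power for n = 25 can be computed with a chi-square goodness-of-fit power calculation in statsmodels. The observed and expected proportions below are placeholders for illustration, not the study's actual figures.

```python
# Cohen's w = sqrt(sum((p_obs - p_exp)^2 / p_exp)), then achieved power at n = 25.
import numpy as np
from statsmodels.stats.power import GofChisquarePower

observed = np.array([0.44, 0.56])    # correct vs. incorrect (placeholder)
expected = np.array([1 / 6, 5 / 6])  # chance level for a six-option question

w = np.sqrt(((observed - expected) ** 2 / expected).sum())
power = GofChisquarePower().solve_power(effect_size=w, nobs=25, alpha=0.05, n_bins=2)
print(f"w = {w:.2f}, achieved power = {power:.2f}")
```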

PARADIM: A Platform to Support Research at the Interface of Data Science and Medical Imaging.

Lemaréchal Y, Couture G, Pelletier F, Lefol R, Asselin PL, Ouellet S, Bernard J, Ebrahimpour L, Manem VSK, Topalis J, Schachtner B, Jodogne S, Joubert P, Jeblick K, Ingrisch M, Després P

pubmed · Jun 3 2025
This paper describes PARADIM, a digital infrastructure designed to support research at the interface of data science and medical imaging, with a focus on Research Data Management best practices. The platform is built from open-source components and rooted in the FAIR principles through strict compliance with the DICOM standard. It addresses key needs in data curation, governance, privacy, and scalable resource management. Supporting every stage of the data science discovery cycle, the platform offers robust functionalities for user identity and access management, data de-identification, storage, annotation, as well as model training and evaluation. Rich metadata are generated all along the research lifecycle to ensure the traceability and reproducibility of results. PARADIM hosts several medical image collections and allows the automation of large-scale, computationally intensive pipelines (e.g., automatic segmentation, dose calculations, AI model evaluation). The platform fills a gap at the interface of data science and medical imaging, where digital infrastructures are key in the development, evaluation, and deployment of innovative solutions in the real world.
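As an example of the de-identification functionality such a platform automates, the snippet below blanks a few obvious DICOM identifiers with pydicom; a production pipeline would implement a full DICOM confidentiality profile rather than this toy subset.

```python
# Toy DICOM de-identification: blank direct identifiers, drop private tags.
import pydicom

def deidentify(path_in: str, path_out: str) -> None:
    ds = pydicom.dcmread(path_in)
    for keyword in ("PatientName", "PatientID", "PatientBirthDate"):
        if keyword in ds:
            setattr(ds, keyword, "")   # blank the identifying element
    ds.remove_private_tags()           # drop vendor-specific private elements
    ds.save_as(path_out)
```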