
Open Set Recognition for Endoscopic Image Classification: A Deep Learning Approach on the Kvasir Dataset

Kasra Moazzami, Seoyoun Son, John Lin, Sun Min Lee, Daniel Son, Hayeon Lee, Jeongho Lee, Seongji Lee

arxiv logopreprintJun 23 2025
Endoscopic image classification plays a pivotal role in medical diagnostics by identifying anatomical landmarks and pathological findings. However, conventional closed-set classification frameworks are inherently limited in open-world clinical settings, where previously unseen conditions can arise and compromise model reliability. To address this, we explore the application of Open Set Recognition (OSR) techniques on the Kvasir dataset, a publicly available and diverse endoscopic image collection. In this study, we evaluate and compare the OSR capabilities of several representative deep learning architectures, including ResNet-50, Swin Transformer, and a hybrid ResNet-Transformer model, under both closed-set and open-set conditions. OpenMax is adopted as a baseline OSR method to assess the ability of these models to distinguish known classes from previously unseen categories. This work represents one of the first efforts to apply open set recognition to the Kvasir dataset and provides a foundational benchmark for evaluating OSR performance in medical image analysis. Our results offer practical insights into model behavior in clinically realistic settings and highlight the importance of OSR techniques for the safe deployment of AI systems in endoscopy.
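
For readers unfamiliar with OpenMax, the sketch below illustrates the recalibration idea the abstract refers to: per-class mean activation vectors, a Weibull fit on the largest training distances, and redistribution of probability mass to an "unknown" class. It is a simplified illustration rather than the authors' code; the tail size, the Euclidean distance metric, and the use of SciPy's Weibull fit (instead of libmr) are assumptions.

```python
# Simplified OpenMax-style open-set scoring (a sketch, not the paper's exact pipeline).
# Assumes `train_logits`, `train_labels`, and a single test `logits` vector are NumPy arrays.
import numpy as np
from scipy.stats import weibull_min

def fit_mavs_and_weibulls(train_logits, train_labels, num_classes, tail_size=20):
    """Per class: mean activation vector (MAV) plus a Weibull fit on the largest
    distances of correctly classified training samples to that MAV."""
    mavs, weibulls = [], []
    preds = train_logits.argmax(axis=1)
    for c in range(num_classes):
        correct = train_logits[(train_labels == c) & (preds == c)]
        mav = correct.mean(axis=0)
        dists = np.linalg.norm(correct - mav, axis=1)
        tail = np.sort(dists)[-tail_size:]
        weibulls.append(weibull_min.fit(tail, floc=0))  # (shape, loc, scale)
        mavs.append(mav)
    return np.stack(mavs), weibulls

def openmax_scores(logits, mavs, weibulls, alpha=3):
    """Recalibrate the top-alpha logits by their Weibull CDF and route the removed
    mass to an extra 'unknown' class; returns a softmax over K+1 classes."""
    dists = np.linalg.norm(logits[None, :] - mavs, axis=1)
    ranks = logits.argsort()[::-1]
    revised = logits.copy()
    unknown = 0.0
    for r, c in enumerate(ranks[:alpha]):
        shape, loc, scale = weibulls[c]
        w = weibull_min.cdf(dists[c], shape, loc, scale) * (alpha - r) / alpha
        revised[c] = logits[c] * (1 - w)
        unknown += logits[c] * w
    scores = np.append(revised, unknown)
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()
```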

Fine-tuned large language model for classifying CT-guided interventional radiology reports.

Yasaka K, Nishimura N, Fukushima T, Kubo T, Kiryu S, Abe O

pubmed logopapersJun 23 2025
Background: Manual data curation was necessary to extract radiology reports due to the ambiguities of natural language. Purpose: To develop a fine-tuned large language model that classifies computed tomography (CT)-guided interventional radiology reports into technique categories and to compare its performance with that of the readers. Material and Methods: This retrospective study included patients who underwent CT-guided interventional radiology between August 2008 and November 2024. Patients were chronologically assigned to the training (n = 1142; 646 men; mean age = 64.1 ± 15.7 years), validation (n = 131; 83 men; mean age = 66.1 ± 16.1 years), and test (n = 332; 196 men; mean age = 66.1 ± 14.8 years) datasets. In establishing a reference standard, reports were manually classified into categories 1 (drainage), 2 (lesion biopsy within fat or soft tissue density tissues), 3 (lung biopsy), and 4 (bone biopsy). A bidirectional encoder representations from transformers (BERT) model was fine-tuned with the training dataset, and the model with the best performance in the validation dataset was selected. The performance and required time for classification in the test dataset were compared between the best-performing model and the two readers. Results: Categories 1/2/3/4 included 309/367/270/196, 30/42/40/19, and 75/124/78/55 patients for the training, validation, and test datasets, respectively. The model demonstrated an accuracy of 0.979 in the test dataset, which was significantly better than that of the readers (0.922-0.940) (P ≤ 0.012). The model classified reports 49.8- to 53.5-fold faster than the readers. Conclusion: The fine-tuned large language model classified CT-guided interventional radiology reports into four categories with high accuracy and within a remarkably short time.
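
The abstract gives no implementation details; the sketch below shows one conventional way to fine-tune a BERT-style encoder for the four technique categories with Hugging Face Transformers and PyTorch. The checkpoint name, batch size, and learning rate are illustrative assumptions, and a tokenizer/checkpoint suited to the reports' language would be required.

```python
# A minimal sketch of fine-tuning a BERT-style classifier for four technique
# categories (drainage, soft-tissue biopsy, lung biopsy, bone biopsy).
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINT = "bert-base-multilingual-cased"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=4)

def collate(batch):
    texts, labels = zip(*batch)  # batch: list of (report_text, category_index)
    enc = tokenizer(list(texts), padding=True, truncation=True,
                    max_length=512, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

def finetune(train_pairs, epochs=3, lr=2e-5):
    loader = DataLoader(train_pairs, batch_size=16, shuffle=True, collate_fn=collate)
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            out = model(**batch)   # cross-entropy loss computed internally from labels
            out.loss.backward()
            optim.step()
            optim.zero_grad()
    return model
```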

Multimodal deep learning for predicting neoadjuvant treatment outcomes in breast cancer: a systematic review.

Krasniqi E, Filomeno L, Arcuri T, Ferretti G, Gasparro S, Fulvi A, Roselli A, D'Onofrio L, Pizzuti L, Barba M, Maugeri-Saccà M, Botti C, Graziano F, Puccica I, Cappelli S, Pelle F, Cavicchi F, Villanucci A, Paris I, Calabrò F, Rea S, Costantini M, Perracchio L, Sanguineti G, Takanen S, Marucci L, Greco L, Kayal R, Moscetti L, Marchesini E, Calonaci N, Blandino G, Caravagna G, Vici P

pubmed logopapersJun 23 2025
Pathological complete response (pCR) to neoadjuvant systemic therapy (NAST) is an established prognostic marker in breast cancer (BC). Multimodal deep learning (DL), integrating diverse data sources (radiology, pathology, omics, clinical), holds promise for improving pCR prediction accuracy. This systematic review synthesizes evidence on multimodal DL for pCR prediction and compares its performance against unimodal DL. Following PRISMA, we searched PubMed, Embase, and Web of Science (January 2015-April 2025) for studies applying DL to predict pCR in BC patients receiving NAST, using data from radiology, digital pathology (DP), multi-omics, and/or clinical records, and reporting AUC. Data on study design, DL architectures, and performance (AUC) were extracted. A narrative synthesis was conducted due to heterogeneity. Fifty-one studies, mostly retrospective (90.2%, median cohort 281), were included. Magnetic resonance imaging and DP were common primary modalities. Multimodal approaches were used in 52.9% of studies, often combining imaging with clinical data. Convolutional neural networks were the dominant architecture (88.2%). Longitudinal imaging improved prediction over baseline-only (median AUC 0.91 vs. 0.82). Overall, the median AUC across studies was 0.88, with 35.3% achieving AUC ≥ 0.90. Multimodal models showed a modest but consistent improvement over unimodal approaches (median AUC 0.88 vs. 0.83). Omics and clinical text were rarely primary DL inputs. DL models demonstrate promising accuracy for pCR prediction, especially when integrating multiple modalities and longitudinal imaging. However, significant methodological heterogeneity, reliance on retrospective data, and limited external validation hinder clinical translation. Future research should prioritize prospective validation, integration of underutilized data (multi-omics, clinical), and explainable AI to advance DL predictors to the clinical setting.

Development and validation of a SOTA-based system for biliopancreatic segmentation and station recognition system in EUS.

Zhang J, Zhang J, Chen H, Tian F, Zhang Y, Zhou Y, Jiang Z

pubmed logopapersJun 23 2025
Endoscopic ultrasound (EUS) is a vital tool for diagnosing biliopancreatic disease, offering detailed imaging to identify key abnormalities. Its interpretation demands expertise, which limits its accessibility for less trained practitioners. Thus, the creation of tools or systems to assist in interpreting EUS images is crucial for improving diagnostic accuracy and efficiency. The aim was to develop an AI-assisted EUS system for accurate pancreatic and biliopancreatic duct segmentation and to evaluate its impact on endoscopists' ability to identify biliopancreatic diseases during segmentation and anatomical localization. The EUS-AI system was designed to perform station positioning and anatomical structure segmentation. A total of 45,737 EUS images from 1852 patients were used for model training. Among them, 2881 images were used for internal testing, and 2747 images from 208 patients were used for external validation. Additionally, 340 images formed a man-machine competition test set. During the research process, various newer state-of-the-art (SOTA) deep learning algorithms were also compared. In the station recognition task, the Mean Teacher algorithm achieved the highest accuracy compared to the ResNet-50 and YOLOv8-CLS algorithms, with an average of 95.60% (92.07%-99.12%) in the internal test set and 92.72% (88.30%-97.15%) in the external test set. For segmentation, the U-Net v2 algorithm was optimal compared to the UNet++ and YOLOv8 algorithms. Ultimately, the EUS-AI system was constructed using the optimal models from the two tasks, and a man-machine competition experiment was conducted. The results demonstrated that the EUS-AI system significantly outperformed mid-level endoscopists in both position recognition (p < 0.001) and pancreas and biliopancreatic duct segmentation (p < 0.001 and p = 0.004, respectively). The EUS-AI system is expected to significantly shorten the learning curve for pancreatic EUS examination and enhance procedural standardization.
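
The Mean Teacher algorithm named above is a general semi-supervised recipe; the sketch below shows its core update (a supervised loss plus an EMA-teacher consistency loss), under the assumption that station recognition is trained on a mix of labelled and unlabelled EUS frames. It is not the authors' training code, and the decay and loss weights are illustrative.

```python
# A minimal Mean Teacher sketch for station classification.
import copy
import torch
import torch.nn.functional as F

def make_teacher(student):
    """The teacher is an EMA copy of the student and is never trained directly."""
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

def ema_update(teacher, student, decay=0.99):
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

def mean_teacher_step(student, teacher, optim, x_lab, y_lab, x_unlab, cons_weight=1.0):
    """One update: supervised cross-entropy on labelled frames plus a consistency
    loss pulling the student toward the EMA teacher on unlabelled frames."""
    sup_loss = F.cross_entropy(student(x_lab), y_lab)
    with torch.no_grad():
        teacher_prob = F.softmax(teacher(x_unlab), dim=1)
    student_prob = F.softmax(student(x_unlab), dim=1)
    cons_loss = F.mse_loss(student_prob, teacher_prob)
    loss = sup_loss + cons_weight * cons_loss
    optim.zero_grad()
    loss.backward()
    optim.step()
    ema_update(teacher, student)
    return loss.item()
```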

Chest X-ray Foundation Model with Global and Local Representations Integration.

Yang Z, Xu X, Zhang J, Wang G, Kalra MK, Yan P

pubmed logopapersJun 23 2025
Chest X-ray (CXR) is the most frequently ordered imaging test, supporting diverse clinical tasks from thoracic disease detection to postoperative monitoring. However, task-specific classification models are limited in scope, require costly labeled data, and lack generalizability to out-of-distribution datasets. To address these challenges, we introduce CheXFound, a self-supervised vision foundation model that learns robust CXR representations and generalizes effectively across a wide range of downstream tasks. We pretrained CheXFound on a curated CXR-987K dataset, comprising approximately 987K unique CXRs from 12 publicly available sources. We propose a Global and Local Representations Integration (GLoRI) head for downstream adaptation, which incorporates fine- and coarse-grained disease-specific local features with global image features for enhanced performance in multilabel classification. Our experimental results showed that CheXFound outperformed state-of-the-art models in classifying 40 disease findings across different prevalence levels on the CXR-LT 24 dataset and exhibited superior label efficiency on downstream tasks with limited training data. Additionally, CheXFound achieved significant improvements on downstream tasks with out-of-distribution datasets, including opportunistic cardiovascular disease risk estimation, mortality prediction, malpositioned tube detection, and anatomical structure segmentation. These results demonstrate CheXFound's strong generalization capabilities, which will enable diverse downstream adaptations with improved label efficiency in future applications. The project source code is publicly available at https://github.com/RPIDIAL/CheXFound.
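
The GLoRI head is described only at a high level; the sketch below shows one plausible way to combine disease-specific local features (learnable queries attending over patch tokens) with a global image feature for multilabel classification. The layer sizes, number of heads, and fusion by concatenation are assumptions, not the paper's specification.

```python
# A sketch of a head fusing global and local representations for multilabel
# classification, in the spirit of the GLoRI head described above.
import torch
import torch.nn as nn

class GlobalLocalHead(nn.Module):
    def __init__(self, embed_dim, num_findings):
        super().__init__()
        # one learnable query per disease finding attends over patch tokens
        self.queries = nn.Parameter(torch.randn(num_findings, embed_dim) * 0.02)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)
        self.classifier = nn.Linear(2 * embed_dim, 1)  # global + local per finding

    def forward(self, patch_tokens, global_token):
        # patch_tokens: (B, N, D) from the frozen foundation backbone
        # global_token: (B, D) pooled/CLS image representation
        B = patch_tokens.size(0)
        q = self.queries.unsqueeze(0).expand(B, -1, -1)       # (B, F, D)
        local, _ = self.attn(q, patch_tokens, patch_tokens)   # (B, F, D)
        glob = global_token.unsqueeze(1).expand_as(local)     # (B, F, D)
        logits = self.classifier(torch.cat([local, glob], dim=-1)).squeeze(-1)
        return logits  # (B, F) multilabel logits, for use with BCEWithLogitsLoss
```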

GPT-4o and Specialized AI in Breast Ultrasound Imaging: A Comparative Study on Accuracy, Agreement, Limitations, and Diagnostic Potential.

Sanli DET, Sanli AN, Buyukdereli Atadag Y, Kurt A, Esmerer E

pubmed logopapersJun 23 2025
This study aimed to evaluate the ability of ChatGPT and Breast Ultrasound Helper, a special ChatGPT-based subprogram trained on ultrasound image analysis, to analyze and differentiate benign and malignant breast lesions on ultrasound images. Ultrasound images of histopathologically confirmed breast cancer and fibroadenoma patients were read by GPT-4o (the latest ChatGPT version) and Breast Ultrasound Helper (BUH), a tool from the "Explore" section of ChatGPT. Both were prompted in English using ACR BI-RADS Breast Ultrasound Lexicon criteria: lesion shape, orientation, margin, internal echo pattern, echogenicity, posterior acoustic features, microcalcifications or hyperechoic foci, perilesional hyperechoic rim, edema or architectural distortion, lesion size, and BI-RADS category. Two experienced radiologists evaluated the images and the responses of the programs in consensus. The outputs, BI-RADS category agreement, and benign/malignant discrimination were statistically compared. A total of 232 ultrasound images were analyzed, of which 133 (57.3%) were malignant and 99 (42.7%) benign. In the comparative analysis, BUH showed superior performance overall, with higher kappa values and statistically significant results across multiple features (P < .001). However, the overall level of agreement with the radiologists' consensus for all features was similar for BUH (κ: 0.387-0.755) and GPT-4o (κ: 0.317-0.803). On the other hand, BI-RADS category agreement was slightly higher for GPT-4o than for BUH (69.4% versus 65.9%), but BUH was slightly more successful in distinguishing benign from malignant lesions (65.9% versus 67.7%). Although both AI tools show moderate-to-good performance in ultrasound image analysis, their limited agreement with radiologists' evaluations and BI-RADS categorization suggests that their clinical application in breast ultrasound interpretation remains premature and unreliable.
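
For context, the snippet below shows how a BI-RADS-lexicon prompt and an ultrasound image can be sent to GPT-4o through the OpenAI Python SDK. The prompt wording, image encoding, and response handling are illustrative; the study's exact prompts and the Breast Ultrasound Helper tool are not reproduced here.

```python
# Illustrative multimodal prompt to GPT-4o via the OpenAI API (not the study's code).
import base64
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

PROMPT = (
    "Describe this breast ultrasound lesion using the ACR BI-RADS lexicon: "
    "shape, orientation, margin, internal echo pattern, echogenicity, posterior "
    "acoustic features, microcalcifications or hyperechoic foci, perilesional rim, "
    "edema or architectural distortion, lesion size, and a final BI-RADS category."
)

def describe_lesion(image_path: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": PROMPT},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content
```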

Deep learning-quantified body composition from positron emission tomography/computed tomography and cardiovascular outcomes: a multicentre study.

Miller RJH, Yi J, Shanbhag A, Marcinkiewicz A, Patel KK, Lemley M, Ramirez G, Geers J, Chareonthaitawee P, Wopperer S, Berman DS, Di Carli M, Dey D, Slomka PJ

pubmed logopapersJun 23 2025
Positron emission tomography (PET)/computed tomography (CT) myocardial perfusion imaging (MPI) is a vital diagnostic tool, especially in patients with cardiometabolic syndrome. Low-dose CT scans are routinely performed with PET for attenuation correction and potentially contain valuable data about body tissue composition. Deep learning and image processing were combined to automatically quantify skeletal muscle (SM), bone and adipose tissue from these scans and then evaluate their associations with death or myocardial infarction (MI). In PET MPI from three sites, deep learning quantified SM, bone, epicardial adipose tissue (EAT), subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), and intermuscular adipose tissue (IMAT). Sex-specific thresholds for abnormal values were established. Associations with death or MI were evaluated using unadjusted and multivariable models adjusted for clinical and imaging factors. This study included 10 085 patients, with median age 68 (interquartile range 59-76) and 5767 (57%) male. Body tissue segmentations were completed in 102 ± 4 s. Higher VAT density was associated with an increased risk of death or MI in both unadjusted [hazard ratio (HR) 1.40, 95% confidence interval (CI) 1.37-1.43] and adjusted (HR 1.24, 95% CI 1.19-1.28) analyses, with similar findings for IMAT, SAT, and EAT. Patients with elevated VAT density and reduced myocardial flow reserve had a significantly increased risk of death or MI (adjusted HR 2.49, 95% CI 2.23-2.77). Volumetric body tissue composition can be obtained rapidly and automatically from standard cardiac PET/CT. This new information provides a detailed, quantitative assessment of sarcopenia and cardiometabolic health for physicians.
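
The hazard ratios quoted above come from survival modelling; the sketch below shows how such estimates can be produced with a Cox proportional hazards model in lifelines. The column names and covariate set are placeholders, not the study's actual adjustment variables.

```python
# A sketch of estimating hazard ratios (e.g. VAT density vs. death or MI) with a
# Cox proportional hazards model; column names are illustrative placeholders.
import pandas as pd
from lifelines import CoxPHFitter

def fit_cox(df: pd.DataFrame) -> CoxPHFitter:
    """df needs a follow-up time column, an event indicator (death or MI),
    and standardized body-composition / clinical covariates."""
    covariates = ["vat_density", "imat_density", "sat_density", "eat_density",
                  "age", "sex", "myocardial_flow_reserve"]
    cph = CoxPHFitter()
    cph.fit(df[["followup_years", "death_or_mi"] + covariates],
            duration_col="followup_years", event_col="death_or_mi")
    return cph

# Example usage (assuming a cohort DataFrame with the columns above):
# cph = fit_cox(cohort)
# print(cph.summary[["exp(coef)", "exp(coef) lower 95%", "exp(coef) upper 95%"]])
```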

MRI Radiomics and Automated Habitat Analysis Enhance Machine Learning Prediction of Bone Metastasis and High-Grade Gleason Scores in Prostate Cancer.

Yang Y, Zheng B, Zou B, Liu R, Yang R, Chen Q, Guo Y, Yu S, Chen B

pubmed logopapersJun 23 2025
To explore the value of machine learning models based on MRI radiomics and automated habitat analysis in predicting bone metastasis and high-grade pathological Gleason scores in prostate cancer. This retrospective study enrolled 214 patients with pathologically diagnosed prostate cancer from May 2013 to January 2025, including 93 cases with bone metastasis and 159 cases with high-grade Gleason scores. Clinical, pathological, and MRI data were collected. An nnUNet model automatically segmented the prostate in MRI scans, and K-means clustering identified habitat subregions within the whole prostate in T2-FS images. Senior radiologists manually segmented regions of interest (ROIs) in prostate lesions. Radiomics features were extracted from the habitat subregions and lesion ROIs, and these features, combined with clinical features, were used to build multiple machine learning classifiers to predict bone metastasis and high-grade Gleason scores. Finally, the models underwent interpretability analysis based on feature importance. The nnUNet model achieved a mean Dice coefficient of 0.970 for segmentation. Habitat analysis using 2 clusters yielded the highest average silhouette coefficient (0.57). Machine learning models based on a combination of lesion radiomics, habitat radiomics, and clinical features achieved the best performance in both prediction tasks. The Extra Trees Classifier achieved the highest AUC (0.900) for predicting bone metastasis, while the CatBoost Classifier performed best (AUC 0.895) for predicting high-grade Gleason scores. The interpretability analysis of the optimal models showed that the PSA clinical feature was crucial for predictions, while both habitat radiomics and lesion radiomics also played important roles. The study proposed an automated prostate habitat analysis for prostate cancer, enabling a comprehensive analysis of tumor heterogeneity. The machine learning models developed achieved excellent performance in predicting the risk of bone metastasis and high-grade Gleason scores in prostate cancer. This approach overcomes the limitations of manual feature extraction and the inadequate analysis of heterogeneity often encountered in traditional radiomics, thereby improving model performance.
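
The habitat workflow described above can be approximated as follows: K-means clustering of in-mask voxel intensities into subregions, followed by a tree-ensemble classifier over combined radiomics and clinical features. Radiomics extraction itself (e.g. with pyradiomics) is omitted, and the function names and parameters are illustrative rather than the authors' pipeline.

```python
# A sketch of habitat subregion clustering and downstream classification.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

def habitat_labels(t2_volume: np.ndarray, prostate_mask: np.ndarray, k: int = 2):
    """Cluster in-mask voxel intensities into k habitat subregions
    (k=2 follows the silhouette result quoted in the abstract)."""
    voxels = t2_volume[prostate_mask > 0].reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(voxels)
    labels = np.zeros_like(prostate_mask, dtype=np.int16)
    labels[prostate_mask > 0] = km.labels_ + 1   # 0 stays background
    return labels

def evaluate_classifier(features: np.ndarray, y: np.ndarray) -> float:
    """features = lesion radiomics + habitat radiomics + clinical (e.g. PSA)."""
    clf = ExtraTreesClassifier(n_estimators=500, random_state=0)
    return cross_val_score(clf, features, y, cv=5, scoring="roc_auc").mean()
```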

MOSCARD -- Causal Reasoning and De-confounding for Multimodal Opportunistic Screening of Cardiovascular Adverse Events

Jialu Pi, Juan Maria Farina, Rimita Lahiri, Jiwoong Jeong, Archana Gurudu, Hyung-Bok Park, Chieh-Ju Chao, Chadi Ayoub, Reza Arsanjani, Imon Banerjee

arxiv logopreprintJun 23 2025
Major Adverse Cardiovascular Events (MACE) remain the leading cause of mortality globally, as reported in the Global Disease Burden Study 2021. Opportunistic screening leverages data collected during routine health check-ups, and multimodal data can play a key role in identifying at-risk individuals. Chest X-rays (CXR) provide insights into chronic conditions contributing to MACE, while the 12-lead electrocardiogram (ECG) directly assesses cardiac electrical activity and structural abnormalities. Integrating CXR and ECG could offer a more comprehensive risk assessment than conventional models, which rely on clinical scores, computed tomography (CT) measurements, or biomarkers and may be limited by sampling bias and single-modality constraints. We propose MOSCARD, a novel predictive modeling framework that couples multimodal causal reasoning with co-attention to align the two modalities and simultaneously mitigate bias and confounders in opportunistic risk estimation. The primary technical contributions are (i) multimodal alignment of CXR with ECG guidance; (ii) integration of causal reasoning; and (iii) a dual back-propagation graph for de-confounding. Evaluated on internal data, emergency department (ED) shift data, and the external MIMIC dataset, our model outperformed single-modality and state-of-the-art foundation models (AUC: 0.75, 0.83, and 0.71, respectively). The proposed cost-effective opportunistic screening enables early intervention, improving patient outcomes and reducing disparities.
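
The co-attention alignment mentioned in the abstract can be sketched as symmetric cross-attention between CXR patch tokens and ECG tokens; the module below is an illustration of that idea only, with assumed dimensions, and it omits the paper's causal-reasoning and de-confounding branches.

```python
# A minimal co-attention fusion sketch for CXR and ECG embeddings.
import torch
import torch.nn as nn

class CoAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.cxr_to_ecg = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ecg_to_cxr = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(2 * dim, 1)   # MACE risk logit

    def forward(self, cxr_tokens, ecg_tokens):
        # cxr_tokens: (B, Nc, D) image patch embeddings; ecg_tokens: (B, Ne, D)
        cxr_att, _ = self.cxr_to_ecg(cxr_tokens, ecg_tokens, ecg_tokens)
        ecg_att, _ = self.ecg_to_cxr(ecg_tokens, cxr_tokens, cxr_tokens)
        fused = torch.cat([cxr_att.mean(dim=1), ecg_att.mean(dim=1)], dim=-1)
        return self.head(fused).squeeze(-1)   # (B,) use with BCEWithLogitsLoss
```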

Comparative Analysis of Multimodal Large Language Models GPT-4o and o1 vs Clinicians in Clinical Case Challenge Questions

Jung, J., Kim, H., Bae, S., Park, J. Y.

medrxiv logopreprintJun 23 2025
Background: Generative Pre-trained Transformer 4 (GPT-4) has demonstrated strong performance in standardized medical examinations but has limitations in real-world clinical settings. The newly released multimodal GPT-4o model, which integrates text and image inputs to enhance diagnostic capabilities, and the multimodal o1 model, which incorporates advanced reasoning, may address these limitations. Objective: This study aimed to compare the performance of GPT-4o and o1 against clinicians in real-world clinical case challenges. Methods: This retrospective, cross-sectional study used Medscape case challenge questions from May 2011 to June 2024 (n = 1,426). Each case included text and images of patient history, physical examination findings, diagnostic test results, and imaging studies. Clinicians were required to choose one answer from among multiple options, with the most frequent response defined as the clinicians' decision. Data-based decisions were made using GPT models (3.5 Turbo, 4 Turbo, 4 Omni, and o1) to interpret the text and images, followed by a process to provide a formatted answer. We compared the performances of the clinicians and GPT models using mixed-effects logistic regression analysis. Results: Of the 1,426 questions, clinicians achieved an overall accuracy of 85.0%, whereas GPT-4o and o1 demonstrated higher accuracies of 88.4% and 94.3% (mean difference 3.4%; P = .005 and mean difference 9.3%; P < .001), respectively. In the multimodal performance analysis, which included cases involving images (n = 917), GPT-4o achieved an accuracy of 88.3%, and o1 achieved 93.9%, both significantly outperforming clinicians (mean difference 4.2%; P = .005 and mean difference 9.8%; P < .001). o1 showed the highest accuracy across all question categories, achieving 92.6% in diagnosis (mean difference 14.5%; P < .001), 97.0% in disease characteristics (mean difference 7.2%; P < .001), 92.6% in examination (mean difference 7.3%; P = .002), and 94.8% in treatment (mean difference 4.3%; P = .005), consistently outperforming clinicians. In terms of medical specialty, o1 achieved 93.6% accuracy in internal medicine (mean difference 10.3%; P < .001), 96.6% in major surgery (mean difference 9.2%; P = .030), 97.3% in psychiatry (mean difference 10.6%; P = .030), and 95.4% in minor specialties (mean difference 10.0%; P < .001), significantly surpassing clinicians. Across five trials, GPT-4o and o1 provided the correct answer 5/5 times in 86.2% and 90.7% of the cases, respectively. Conclusions: The GPT-4o and o1 models achieved higher accuracy than clinicians in clinical case challenge questions, particularly in disease diagnosis, and could serve as valuable tools to assist healthcare professionals in clinical settings.
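
The comparison described above hinges on a mixed-effects logistic regression; the sketch below shows one way such a model could be specified in Python, treating each case as a random effect. The formula, column names, and the use of statsmodels' variational-Bayes mixed GLM are assumptions about implementation, not details from the paper.

```python
# A sketch of a mixed-effects logistic regression comparing answer correctness
# between responder types, with the case as a random intercept.
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

def compare_accuracy(df: pd.DataFrame):
    """df rows: one answer per (case, responder) with columns
    'correct' (0/1), 'responder' ('clinician'/'gpt4o'/'o1'), 'case_id'."""
    model = BinomialBayesMixedGLM.from_formula(
        "correct ~ C(responder, Treatment('clinician'))",  # fixed effect of responder
        {"case": "0 + C(case_id)"},                        # random intercept per case
        df,
    )
    result = model.fit_vb()
    return result.summary()
```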