
Temporal Representation Learning for Real-Time Ultrasound Analysis

Yves Stebler, Thomas M. Sutter, Ece Ozkan, Julia E. Vogt

arXiv preprint · Sep 1, 2025
Ultrasound (US) imaging is a critical tool in medical diagnostics, offering real-time visualization of physiological processes. One of its major advantages is its ability to capture temporal dynamics, which is essential for assessing motion patterns in applications such as cardiac monitoring, fetal development, and vascular imaging. Despite its importance, current deep learning models often overlook the temporal continuity of ultrasound sequences, analyzing frames independently and missing key temporal dependencies. To address this gap, we propose a method for learning effective temporal representations from ultrasound videos, with a focus on echocardiography-based ejection fraction (EF) estimation. EF prediction serves as an ideal case study to demonstrate the necessity of temporal learning, as it requires capturing the rhythmic contraction and relaxation of the heart. Our approach leverages temporally consistent masking and contrastive learning to enforce temporal coherence across video frames, enhancing the model's ability to represent motion patterns. Evaluated on the EchoNet-Dynamic dataset, our method achieves a substantial improvement in EF prediction accuracy, highlighting the importance of temporally-aware representation learning for real-time ultrasound analysis.
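
The temporal-coherence idea described above can be illustrated with a short sketch: the same random spatial mask is applied to every frame of a clip, and an InfoNCE-style contrastive loss pulls two views of the same clip together. This is a minimal PyTorch sketch assuming a clip-level encoder; the patch size, masking ratio, and temperature are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def temporally_consistent_mask(clip: torch.Tensor, ratio: float = 0.5) -> torch.Tensor:
    """Zero out the same random spatial patches in every frame of a (B, T, C, H, W) clip.
    Assumes H and W are divisible by the patch size."""
    b, t, c, h, w = clip.shape
    patch = 16
    gh, gw = h // patch, w // patch
    keep = (torch.rand(b, 1, 1, gh, gw, device=clip.device) > ratio).float()
    keep = keep.repeat_interleave(patch, dim=-2).repeat_interleave(patch, dim=-1)
    return clip * keep  # identical mask along the T axis -> temporally consistent

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Contrastive loss pulling two embeddings of the same clip together."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                       # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```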

Prior-Guided Residual Diffusion: Calibrated and Efficient Medical Image Segmentation

Fuyou Mao, Beining Wu, Yanfeng Jiang, Han Xue, Yan Tang, Hao Zhang

arXiv preprint · Sep 1, 2025
Ambiguity in medical image segmentation calls for models that capture full conditional distributions rather than a single point estimate. We present Prior-Guided Residual Diffusion (PGRD), a diffusion-based framework that learns voxel-wise distributions while maintaining strong calibration and practical sampling efficiency. PGRD embeds discrete labels as one-hot targets in a continuous space to align segmentation with diffusion modeling. A coarse prior predictor provides step-wise guidance; the diffusion network then learns the residual to the prior, accelerating convergence and improving calibration. A deep diffusion supervision scheme further stabilizes training by supervising intermediate time steps. Evaluated on representative MRI and CT datasets, PGRD achieves higher Dice scores and lower NLL/ECE values than Bayesian, ensemble, Probabilistic U-Net, and vanilla diffusion baselines, while requiring fewer sampling steps to reach strong performance.
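
A rough sketch of the residual-to-prior objective described above, assuming a standard DDPM-style forward process and a user-supplied `denoiser` network; the function and variable names below are illustrative placeholders, not the authors' code.

```python
import torch
import torch.nn.functional as F

def pgrd_loss(denoiser, labels, prior_logits, alphas_bar):
    """labels: (B, H, W) integer class map; prior_logits: (B, K, H, W) coarse prior;
    alphas_bar: 1-D tensor of cumulative noise-schedule products."""
    k = prior_logits.shape[1]
    x0 = F.one_hot(labels, k).permute(0, 3, 1, 2).float()   # embed discrete labels in a continuous space
    prior = prior_logits.softmax(dim=1)
    residual = x0 - prior                                    # the network learns the residual to the prior
    t = torch.randint(0, alphas_bar.numel(), (x0.size(0),), device=x0.device)
    a = alphas_bar[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(residual)
    x_t = a.sqrt() * residual + (1 - a).sqrt() * noise       # diffuse the residual
    pred = denoiser(x_t, t, prior)                           # prior provides step-wise guidance
    return F.mse_loss(pred, residual)
```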

M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

Che Liu, Zheng Jiang, Chengyu Fang, Heng Guo, Yan-Jie Zhou, Jiaqi Qu, Le Lu, Minfeng Xu

arXiv preprint · Sep 1, 2025
Medical image retrieval is essential for clinical decision-making and translational research, relying on discriminative visual representations. Yet, current methods remain fragmented, relying on separate architectures and training strategies for 2D, 3D, and video-based medical data. This modality-specific design hampers scalability and inhibits the development of unified representations. To enable unified learning, we curate a large-scale hybrid-modality dataset comprising 867,653 medical imaging samples, including 2D X-rays and ultrasounds, RGB endoscopy videos, and 3D CT scans. Leveraging this dataset, we train M3Ret, a unified visual encoder without any modality-specific customization. It successfully learns transferable representations using both generative (MAE) and contrastive (SimDINO) self-supervised learning (SSL) paradigms. Our approach sets a new state-of-the-art in zero-shot image-to-image retrieval across all individual modalities, surpassing strong baselines such as DINOv3 and the text-supervised BMC-CLIP. More remarkably, strong cross-modal alignment emerges without paired data, and the model generalizes to unseen MRI tasks, despite never observing MRI during pretraining, demonstrating the generalizability of purely visual self-supervision to unseen modalities. Comprehensive analyses further validate the scalability of our framework across model and data sizes. These findings deliver a promising signal to the medical imaging community, positioning M3Ret as a step toward foundation models for visual SSL in multimodal medical image understanding.
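
Zero-shot image-to-image retrieval with a frozen visual encoder reduces to embedding both the query and the gallery, normalizing, and ranking by cosine similarity. A minimal sketch; the `encoder` stands for any self-supervised model (MAE- or SimDINO-style) and nothing here is specific to M3Ret's released code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def retrieve(encoder, query: torch.Tensor, gallery: torch.Tensor, top_k: int = 5):
    """query: (1, C, H, W); gallery: (N, C, H, W). Returns indices of the top-k gallery matches."""
    q = F.normalize(encoder(query), dim=-1)          # (1, D) query embedding
    g = F.normalize(encoder(gallery), dim=-1)        # (N, D) gallery embeddings
    sims = (q @ g.t()).squeeze(0)                    # cosine similarities
    return sims.topk(top_k).indices
```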

FocalTransNet: A Hybrid Focal-Enhanced Transformer Network for Medical Image Segmentation.

Liao M, Yang R, Zhao Y, Liang W, Yuan J

PubMed paper · Sep 1, 2025
CNNs have demonstrated superior performance in medical image segmentation. To overcome the limitation of relying only on local receptive fields, previous work has attempted to integrate Transformers into convolutional network components such as encoders, decoders, or skip connections. However, these methods can only establish long-distance dependencies for some specific patterns and usually neglect the loss of fine-grained details during downsampling in multi-scale feature extraction. To address these issues, we present a novel hybrid Transformer network called FocalTransNet. Specifically, we construct a focal-enhanced (FE) Transformer module by introducing dense cross-connections into a CNN-Transformer dual-path structure and deploy the FE Transformer throughout the entire encoder. Different from existing hybrid networks that employ embedding or stacking strategies, the proposed model allows for a comprehensive extraction and deep fusion of both local and global features at different scales. In addition, we propose a symmetric patch merging (SPM) module for downsampling, which can retain fine-grained details by establishing a specific information compensation mechanism. We evaluated the proposed method on four different medical image segmentation benchmarks. The proposed method outperforms previous state-of-the-art convolutional networks, Transformers, and hybrid networks. The code for FocalTransNet is publicly available at https://github.com/nemanjajoe/FocalTransNet.
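
A rough PyTorch sketch of the dual-path, cross-connected idea: a convolutional branch and a self-attention branch each process the input and additionally receive the other branch's output before fusion. This is an illustration in the spirit of the FE Transformer described above, not the published FocalTransNet code; layer sizes and the fusion rule are assumptions.

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    """One dual-path stage with additive cross-connections between a CNN and an attention branch."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.conv2 = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.GELU())
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * dim, dim, 1)

    def _attend(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        t = self.norm(x.flatten(2).transpose(1, 2))        # (B, HW, C) token sequence
        out, _ = self.attn(t, t, t)
        return out.transpose(1, 2).reshape(b, c, h, w)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        local_a = self.conv1(x)                 # local (CNN) features
        global_a = self._attend(x)              # global (attention) features
        local_b = self.conv2(x + global_a)      # cross-connection: CNN branch sees the global path
        global_b = self._attend(x + local_a)    # cross-connection: attention branch sees the local path
        return self.fuse(torch.cat([local_b, global_b], dim=1)) + x
```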

Automated coronary analysis in ultrahigh-spatial resolution photon-counting detector CT angiography: Clinical validation and intra-individual comparison with energy-integrating detector CT.

Kravchenko D, Hagar MT, Varga-Szemes A, Schoepf UJ, Schoebinger M, O'Doherty J, Gülsün MA, Laghi A, Laux GS, Vecsey-Nagy M, Emrich T, Tremamunno G

PubMed paper · Sep 1, 2025
To evaluate a deep-learning algorithm for automated coronary artery analysis on ultrahigh-resolution photon-counting detector coronary computed tomography (CT) angiography and to compare its performance with that of expert readers, using invasive coronary angiography as the reference. Thirty-two patients (mean age 68.6 years; 81% male) underwent both energy-integrating detector and ultrahigh-resolution photon-counting detector CT within 30 days. Expert readers scored each image using the Coronary Artery Disease-Reporting and Data System classification, and the scores were compared against invasive angiography. After a three-month wash-out, one reader reanalyzed the photon-counting detector CT images assisted by the algorithm. Sensitivity, specificity, accuracy, inter-reader agreement, and reading times were recorded for each method. On 401 arterial segments, inter-reader agreement improved from substantial (κ = 0.75) on energy-integrating detector CT to near-perfect (κ = 0.86) on photon-counting detector CT. The algorithm alone achieved 85% sensitivity, 91% specificity, and 90% accuracy on energy-integrating detector CT, and 85%, 96%, and 95% on photon-counting detector CT. Compared with invasive angiography on photon-counting detector CT, manual and automated reads had similar sensitivity (67%), but manual assessment slightly outperformed the algorithm in specificity (85% vs. 79%) and accuracy (84% vs. 78%). When the reader was assisted by the algorithm, specificity rose to 97% (p < 0.001), accuracy to 95%, and reading time decreased by 54% (p < 0.001). This deep-learning algorithm demonstrates high agreement with experts and improved diagnostic performance on photon-counting detector CT. Expert review augmented by the algorithm further increases specificity and markedly reduces interpretation time.
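
The per-segment performance figures above reduce to confusion-matrix arithmetic. A minimal sketch, assuming binary per-segment labels (1 = obstructive lesion on invasive angiography) and binary algorithm calls; no counts from the study are reproduced here.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, PPV, NPV, and accuracy from paired binary reads."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, ppv, npv, accuracy
```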

SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection

Yao Wang, Dong Yang, Zhi Qiao, Wenjian Huang, Liuzhi Yang, Zhen Qian

arXiv preprint · Sep 1, 2025
Abnormality detection in medical imaging is a critical task requiring both high efficiency and accuracy to support effective diagnosis. While convolutional neural networks (CNNs) and Transformer-based models are widely used, both face intrinsic challenges: CNNs have limited receptive fields, restricting their ability to capture broad contextual information, and Transformers encounter prohibitive computational costs when processing high-resolution medical images. Mamba, a recent innovation in natural language processing, has gained attention for its ability to process long sequences with linear complexity, offering a promising alternative. Building on this foundation, we present SpectMamba, the first Mamba-based architecture designed for medical image detection. A key component of SpectMamba is the Hybrid Spatial-Frequency Attention (HSFA) block, which separately learns high- and low-frequency features. This approach effectively mitigates the loss of high-frequency information caused by frequency bias and correlates frequency-domain features with spatial features, thereby enhancing the model's ability to capture global context. To further improve long-range dependencies, we propose the Visual State-Space Module (VSSM) and introduce a novel Hilbert Curve Scanning technique to strengthen spatial correlations and local dependencies, further optimizing the Mamba framework. Comprehensive experiments show that SpectMamba achieves state-of-the-art performance while being both effective and efficient across various medical image detection tasks.
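
Hilbert Curve Scanning, i.e., flattening a 2D feature map into a 1D token sequence along a locality-preserving curve before it enters the state-space model, can be sketched as follows. This uses the standard index-to-coordinate Hilbert conversion and assumes a square map whose side is a power of two; it illustrates the scanning order only, not the authors' implementation.

```python
import torch

def d2xy(n: int, d: int):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid (n a power of two)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                 # rotate/flip the quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_flatten(feat: torch.Tensor) -> torch.Tensor:
    """Reorder a (B, C, n, n) feature map into a (B, n*n, C) sequence along the Hilbert curve."""
    b, c, n, _ = feat.shape
    order = [d2xy(n, d) for d in range(n * n)]
    idx = torch.tensor([x * n + y for x, y in order])
    tokens = feat.flatten(2).transpose(1, 2)         # (B, n*n, C) in raster order
    return tokens[:, idx]                            # locality-preserving scan order
```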

Added prognostic value of histogram features from preoperative multi-modal diffusion MRI in predicting Ki-67 proliferation for adult-type diffuse gliomas.

Huang Y, He S, Hu H, Ma H, Huang Z, Zeng S, Mazu L, Zhou W, Zhao C, Zhu N, Wu J, Liu Q, Yang Z, Wang W, Shen G, Zhang N, Chu J

PubMed paper · Sep 1, 2025
Ki-67 labelling index (LI), a critical marker of tumor proliferation, is vital for grading adult-type diffuse gliomas and predicting patient survival. However, its accurate assessment currently relies on invasive biopsy or surgical resection, making it challenging to non-invasively predict Ki-67 LI and subsequent prognosis. Therefore, this study aimed to investigate whether histogram analysis of multi-parametric diffusion model metrics, specifically diffusion tensor imaging (DTI), diffusion kurtosis imaging (DKI), and neurite orientation dispersion and density imaging (NODDI), could help predict Ki-67 LI in adult-type diffuse gliomas and further predict patient survival. A total of 123 patients with diffuse gliomas who underwent preoperative bipolar spin-echo diffusion magnetic resonance imaging (MRI) were included. Diffusion metrics (DTI, DKI, and NODDI) and their histogram features were extracted and used to develop a nomogram model in the training set (n=86), and the performance was verified in the test set (n=37). The area under the receiver operating characteristic curve (AUC) of the nomogram model was calculated. The outcome cohort, comprising all 123 patients, was used to evaluate the predictive value of the diffusion nomogram model for overall survival (OS). Cox proportional hazards regression was performed to predict OS. Among the 123 patients, 87 exhibited high Ki-67 LI (Ki-67 LI >5%). The patients had a mean age of 46.08±13.24 years, and 39 were female. Tumor grading showed 46 cases of grade 2, 21 cases of grade 3, and 56 cases of grade 4. The nomogram model included eight histogram features from diffusion MRI and showed good performance for predicting Ki-67 LI, with AUCs of 0.92 [95% confidence interval (CI): 0.85-0.98, sensitivity =0.85, specificity =0.84] and 0.84 (95% CI: 0.64-0.98, sensitivity =0.77, specificity =0.73) in the training and test sets, respectively. The nomogram incorporating these variables also showed good discrimination for Ki-67 LI prediction and glioma grading. A low nomogram model score relative to the median value in the outcome cohort was independently associated with OS (P<0.01). Accurate prediction of Ki-67 LI in adult-type diffuse glioma patients was achieved using a multi-modal diffusion MRI histogram radiomics model, which also reliably predicted survival. ClinicalTrials.gov Identifier: NCT06572592.
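
First-order histogram features of the kind used here can be computed directly from a diffusion metric map restricted to the tumor mask. A minimal NumPy/SciPy sketch; the feature list is illustrative and is not the exact set of eight features selected for the published nomogram.

```python
import numpy as np
from scipy import stats

def histogram_features(metric_map: np.ndarray, mask: np.ndarray) -> dict:
    """metric_map: e.g. an FA, MK, or NDI volume; mask: boolean tumor ROI of the same shape."""
    vals = metric_map[mask.astype(bool)]
    counts, _ = np.histogram(vals, bins=64)
    p = counts / counts.sum()
    p = p[p > 0]
    return {
        "mean": float(vals.mean()),
        "variance": float(vals.var()),
        "skewness": float(stats.skew(vals)),
        "kurtosis": float(stats.kurtosis(vals)),
        "p10": float(np.percentile(vals, 10)),
        "p90": float(np.percentile(vals, 90)),
        "entropy": float(-(p * np.log2(p)).sum()),   # histogram entropy over 64 bins
    }
```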

Prediction of lymphovascular invasion in invasive breast cancer via intratumoral and peritumoral multiparametric magnetic resonance imaging machine learning-based radiomics with Shapley additive explanations interpretability analysis.

Chen S, Zhong Z, Chen Y, Tang W, Fan Y, Sui Y, Hu W, Pan L, Liu S, Kong Q, Guo Y, Liu W

PubMed paper · Sep 1, 2025
The use of multiparametric magnetic resonance imaging (MRI) in predicting lymphovascular invasion (LVI) in breast cancer has been well-documented in the literature. However, the majority of the related studies have focused primarily on intratumoral characteristics, overlooking the potential contribution of peritumoral features. The aim of this study was to evaluate the effectiveness of multiparametric MRI in predicting LVI by analyzing both intratumoral and peritumoral radiomics features and to assess the added value of incorporating both regions in LVI prediction. A total of 366 patients underwent preoperative breast MRI at two centers and were divided into training (n=208), validation (n=70), and test (n=88) sets. Imaging features were extracted from intratumoral and peritumoral T2-weighted imaging, diffusion-weighted imaging, and dynamic contrast-enhanced MRI. Five logistic regression models were developed for predicting LVI status: the tumor area (TA) model, peritumoral area (PA) model, tumor-plus-peritumoral area (TPA) model, clinical model, and combined model. The combined model was created by incorporating the highest radiomics score and clinical factors. Predictive efficacy was evaluated via the receiver operating characteristic (ROC) curve and area under the curve (AUC). The Shapley additive explanations (SHAP) method was used to rank the features and explain the final model. The performance of the TPA model was superior to that of the TA and PA models. A combined model was further developed via multivariable logistic regression, incorporating the TPA radiomics score (radscore), MRI-assessed axillary lymph node (ALN) status, and peritumoral edema (PE). The combined model demonstrated good calibration and discrimination across the training, validation, and test datasets, with AUCs of 0.888 [95% confidence interval (CI): 0.841-0.934], 0.856 (95% CI: 0.769-0.943), and 0.853 (95% CI: 0.760-0.946), respectively. Furthermore, we conducted SHAP analysis to evaluate the contributions of the TPA radscore, MRI-ALN status, and PE to LVI status prediction. The combined model, incorporating clinical factors and the intratumoral and peritumoral radscore, effectively predicts LVI and may aid in tailored treatment planning.
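
A minimal sketch of the combined-model pipeline described above: a logistic regression over the TPA radscore, MRI-assessed ALN status, and peritumoral edema, evaluated by AUC and explained with SHAP. The feature values below are synthetic placeholders, and `shap.LinearExplainer` is the assumed entry point into the shap package, not code from the study.

```python
import numpy as np
import shap
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(size=200),             # TPA radscore (synthetic)
    rng.integers(0, 2, size=200),     # MRI-assessed ALN status (synthetic)
    rng.integers(0, 2, size=200),     # peritumoral edema (synthetic)
])
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)  # synthetic LVI labels

model = LogisticRegression().fit(X[:150], y[:150])
print("test AUC:", roc_auc_score(y[150:], model.predict_proba(X[150:])[:, 1]))

explainer = shap.LinearExplainer(model, X[:150])      # per-feature contribution to the logit
shap_values = explainer.shap_values(X[150:])
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0))
```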

Improved image quality and diagnostic performance of coronary computed tomography angiography-derived fractional flow reserve with super-resolution deep learning reconstruction.

Zou LM, Xu C, Xu M, Xu KT, Wang M, Wang Y, Wang YN

PubMed paper · Sep 1, 2025
The super-resolution deep learning reconstruction (SR-DLR) algorithm has emerged as a promising image reconstruction technique for improving the image quality of coronary computed tomography angiography (CCTA) and ensuring accurate CCTA-derived fractional flow reserve (CT-FFR) assessment even in problematic scenarios (e.g., the presence of heavily calcified plaque and stent implantation). Therefore, the purposes of this study were to evaluate the image quality of CCTA obtained with SR-DLR in comparison with conventional reconstruction methods and to investigate the diagnostic performance of different reconstruction approaches based on CT-FFR. Fifty patients who underwent CCTA and subsequent invasive coronary angiography (ICA) were retrospectively included. All images were reconstructed with hybrid iterative reconstruction (HIR), model-based iterative reconstruction (MBIR), conventional deep learning reconstruction (C-DLR), and SR-DLR algorithms. Objective parameters and subjective scores were compared. Among the patients, 22 (comprising 45 lesions) had invasive FFR results as a reference, and the diagnostic performance of the different reconstruction approaches based on CT-FFR was compared. SR-DLR achieved the lowest image noise, highest signal-to-noise ratio (SNR), and best edge sharpness (all P values <0.05), as well as the best subjective scores from both reviewers (all P values <0.001). With FFR serving as the reference, specificity and positive predictive value (PPV) were improved compared with HIR and C-DLR (72% vs. 36-44% and 73% vs. 53-58%, respectively); moreover, SR-DLR improved sensitivity and negative predictive value (NPV) compared with MBIR (95% vs. 70% and 95% vs. 68%, respectively; all P values <0.05). The overall diagnostic accuracy and area under the curve (AUC) for SR-DLR were significantly higher than those of the HIR, MBIR, and C-DLR algorithms (82% vs. 60-67% and 0.84 vs. 0.61-0.70, respectively; all P values <0.05). SR-DLR had the best image quality on both objective and subjective evaluation. The diagnostic performance of CT-FFR was improved by SR-DLR, enabling more accurate assessment of flow-limiting lesions.
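
Objective image-quality metrics of the kind reported above are typically measured on regions of interest: noise as the standard deviation of attenuation in a homogeneous background ROI, SNR as mean vessel attenuation over noise, and CNR as the vessel-to-background contrast over noise. A minimal NumPy sketch with assumed ROI masks.

```python
import numpy as np

def snr_cnr(image: np.ndarray, vessel_roi: np.ndarray, background_roi: np.ndarray):
    """ROIs are boolean masks, e.g. the coronary/aortic lumen and adjacent homogeneous tissue."""
    vessel = image[vessel_roi.astype(bool)]
    background = image[background_roi.astype(bool)]
    noise = background.std()                            # image noise
    snr = vessel.mean() / noise                         # signal-to-noise ratio
    cnr = (vessel.mean() - background.mean()) / noise   # contrast-to-noise ratio
    return noise, snr, cnr
```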

Cross-channel feature transfer 3D U-Net for automatic segmentation of the perilymph and endolymph fluid spaces in hydrops MRI.

Yoo TW, Yeo CD, Lee EJ, Oh IS

PubMed paper · Sep 1, 2025
The identification of endolymphatic hydrops (EH) using magnetic resonance imaging (MRI) is crucial for understanding inner ear disorders such as Meniere's disease and sudden low-frequency hearing loss. The EH ratio is calculated as the ratio of the endolymphatic fluid space to the perilymphatic fluid space. We propose a novel cross-channel feature transfer (CCFT) 3D U-Net for fully automated segmentation of the perilymphatic and endolymphatic fluid spaces in hydrops MRI. The model exhibits state-of-the-art performance in segmenting the endolymphatic fluid space by transferring magnetic resonance cisternography (MRC) features to HYDROPS-Mi2 (HYbriD of Reversed image Of Positive endolymph signal and native image of positive perilymph Signal multiplied with the heavily T2-weighted MR cisternography). Experimental results using the CCFT module showed that the segmentation performance of the perilymphatic space was 0.9459 for the Dice similarity coefficient (DSC) and 0.8975 for the intersection over union (IOU), and that of the endolymphatic space was 0.8053 for the DSC and 0.6778 for the IOU.
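
The reported DSC/IOU values and the EH ratio all reduce to voxel counting on binary masks. A minimal NumPy sketch; following the definition above, the EH ratio is taken as endolymphatic over perilymphatic fluid volume.

```python
import numpy as np

def dice_iou(pred: np.ndarray, gt: np.ndarray):
    """Dice similarity coefficient and intersection over union for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dsc = 2 * inter / (pred.sum() + gt.sum())
    iou = inter / np.logical_or(pred, gt).sum()
    return dsc, iou

def eh_ratio(endolymph_mask: np.ndarray, perilymph_mask: np.ndarray) -> float:
    """Ratio of the endolymphatic to the perilymphatic fluid space, by voxel count."""
    endo = endolymph_mask.astype(bool).sum()
    peri = perilymph_mask.astype(bool).sum()
    return endo / peri
```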
