Latest Papers on Radiology AI. Tags: Segmentation

Novel BDefRCNLSTM: an efficient ensemble deep learning approaches for enhanced brain tumor detection and categorization with segmentation.

Janapati M, Akthar S

•papers•Sep 11 2025

Brain tumour detection and classification are critical for improving patient prognosis and treatment planning. However, manual identification from magnetic resonance imaging (MRI) scans is time-consuming, error-prone, and reliant on expert interpretation. The increasing complexity of tumour characteristics necessitates automated solutions to enhance accuracy and efficiency. This study introduces a novel ensemble deep learning model, boosted deformable and residual convolutional network with bi-directional convolutional long short-term memory (BDefRCNLSTM), for the classification and segmentation of brain tumours. The proposed framework integrates entropy-based local binary pattern (ELBP) for extracting spatial semantic features and employs the enhanced sooty tern optimisation (ESTO) algorithm for optimal feature selection. Additionally, an improved X-Net model is utilised for precise segmentation of tumour regions. The model is trained and evaluated on Figshare, Brain MRI, and Kaggle datasets using multiple performance metrics. Experimental results demonstrate that the proposed BDefRCNLSTM model achieves over 99% accuracy in both classification and segmentation, outperforming existing state-of-the-art approaches. The findings establish the proposed approach as a clinically viable solution for automated brain tumour diagnosis. The integration of optimised feature selection and advanced segmentation techniques improves diagnostic accuracy, potentially assisting radiologists in making faster and more reliable decisions.

MRI Segmentation Neurological Methodology In Silico Academic Lab Benchmark SOTA

Ultrasam: a foundation model for ultrasound using large open-access segmentation datasets.

Meyer A, Murali A, Zarin F, Mutter D, Padoy N

•papers•Sep 11 2025

Automated ultrasound (US) image analysis remains a longstanding challenge due to anatomical complexity and the scarcity of annotated data. Although large-scale pretraining has improved data efficiency in many visual domains, its impact in US is limited by a pronounced domain shift from other imaging modalities and high variability across clinical applications, such as chest, ovarian, and endoscopic imaging. To address this, we propose UltraSam, a SAM-style model trained on a heterogeneous collection of publicly available segmentation datasets, originally developed in isolation. UltraSam is trained under the prompt-conditioned segmentation paradigm, which eliminates the need for unified labels and enables generalization to a broad range of downstream tasks. We compile US-43d, a large-scale collection of 43 open-access US datasets comprising over 282,000 images with segmentation masks covering 58 anatomical structures. We explore adaptation and fine-tuning strategies for SAM and systematically evaluate transferability across downstream tasks, comparing against state-of-the-art pretraining methods. We further propose prompted classification, a new use case where object-specific prompts and image features are jointly decoded to improve classification performance. In experiments on three diverse public US datasets, UltraSam outperforms existing SAM variants on prompt-based segmentation and surpasses self-supervised US foundation models on downstream (prompted) classification and instance segmentation tasks. UltraSam demonstrates that SAM-style training on diverse, sparsely annotated US data enables effective generalization across tasks. By unlocking the value of fragmented public datasets, our approach lays the foundation for scalable, real-world US representation learning. We release our code and pretrained models at https://github.com/CAMMA-public/UltraSam and invite the community to further this effort by continuing to contribute high-quality datasets.
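To illustrate why prompt-conditioned training needs no unified label set, the following minimal sketch derives a box prompt and a point prompt from a single binary mask, so that any dataset with masks can supply (prompt, mask) pairs regardless of its class names. The abstract does not say how prompts are generated; the derivation rule and the function names `box_prompt` and `point_prompt` are assumptions made for illustration and are not taken from the UltraSam code.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask stored as rows of 0/1 values


def _foreground(mask: Mask) -> List[Tuple[int, int]]:
    """Collect (x, y) coordinates of all foreground pixels."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask has no foreground pixels")
    return coords


def box_prompt(mask: Mask) -> Tuple[int, int, int, int]:
    """Tight bounding box (x_min, y_min, x_max, y_max) around the mask."""
    coords = _foreground(mask)
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def point_prompt(mask: Mask) -> Tuple[int, int]:
    """Foreground pixel closest to the mask centroid, as (x, y)."""
    coords = _foreground(mask)
    cx = sum(x for x, _ in coords) / len(coords)
    cy = sum(y for _, y in coords) / len(coords)
    return min(coords, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


if __name__ == "__main__":
    example = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    print(box_prompt(example), point_prompt(example))
```

Snapping the point to the nearest foreground pixel, rather than using the raw centroid, keeps the prompt inside non-convex structures.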

Ultrasound Segmentation Methodology In Silico Academic Lab Open Dataset Open Code Benchmark SOTA

A full-scale attention-augmented CNN-transformer model for segmentation of oropharyngeal mucosa organs-at-risk in radiotherapy.

He L, Sun J, Lu S, Li J, Wang X, Yan Z, Guan J

•papers•Sep 11 2025

Radiation-induced oropharyngeal mucositis (ROM) is a common and severe side effect of radiotherapy in nasopharyngeal cancer patients, leading to significant clinical complications such as malnutrition, infections, and treatment interruptions. Accurate delineation of the oropharyngeal mucosa (OPM) as an organ-at-risk (OAR) is crucial to minimizing radiation exposure and preventing ROM. This study aims to develop and validate an advanced automatic segmentation model, attention-augmented Swin U-Net transformer (AA-Swin UNETR), for accurate delineation of OPM to improve radiotherapy planning and reduce the incidence of ROM. We proposed a hybrid CNN-transformer model, AA-Swin UNETR, based on the Swin UNETR framework, which integrates hierarchical feature extraction with full-scale attention mechanisms. The model includes a Swin Transformer-based encoder and a CNN-based decoder with residual blocks, connected via a full-scale feature connection scheme. The full-scale attention mechanism enables the model to capture long-range dependencies and multi-level features effectively, enhancing the segmentation accuracy. The model was trained on a dataset of 202 CT scans from Nanfang Hospital, using expert manual delineations as the gold standard. We evaluated the performance of AA-Swin UNETR against state-of-the-art (SOTA) segmentation models, including Swin UNETR, nnUNet, and 3D UX-Net, using geometric and dosimetric evaluation parameters. The geometric metrics include Dice similarity coefficient (DSC), surface DSC (sDSC), volume similarity (VS), Hausdorff distance (HD), precision, and recall. The dosimetric metrics include changes of D0.1 cc and Dmean between results derived from manually delineated OPM and auto-segmentation models. The AA-Swin UNETR model achieved the highest mean DSC of 87.72 ± 1.98%, significantly outperforming Swin UNETR (83.53 ± 2.59%), nnUNet (85.48 ± 2.68%), and 3D UX-Net (80.04 ± 3.76%). The model also showed superior mean sDSC (98.44 ± 1.08%), mean VS (97.86 ± 1.43%), mean precision (87.60 ± 3.06%) and mean recall (89.22 ± 2.70%), with a competitive mean HD of 9.03 ± 2.79 mm. For dosimetric evaluation, the proposed model generates the smallest mean [Formula: see text] (0.46 ± 4.92 cGy) and mean [Formula: see text] (6.26 ± 24.90 cGy) relative to manual delineation compared with other auto-segmentation results (mean [Formula: see text] of Swin UNETR = -0.56 ± 7.28 cGy, nnUNet = 0.99 ± 4.73 cGy, 3D UX-Net = -0.65 ± 8.05 cGy; mean [Formula: see text] of Swin UNETR = 7.46 ± 43.37, nnUNet = 21.76 ± 37.86 and 3D UX-Net = 44.61 ± 62.33). In this paper, we proposed a hybrid transformer-CNN deep learning model, AA-Swin UNETR, for automatic segmentation of OPM as an OAR structure in radiotherapy planning. Evaluations with geometric and dosimetric parameters demonstrated that AA-Swin UNETR can generate delineations close to a manual reference, both in terms of geometry and dose-volume metrics. The proposed model outperformed existing SOTA models on both sets of evaluation metrics and demonstrated its capability of accurately segmenting complex anatomical structures of the OPM, providing a reliable tool for enhancing radiotherapy planning.
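Several of the geometric metrics named above reduce to simple voxel counts. The sketch below computes DSC, precision, recall, and volume similarity from two flattened binary masks, using the commonly cited definitions (DSC = 2TP / (|P| + |R|), VS = 1 - ||P| - |R|| / (|P| + |R|)). The abstract does not give the exact formulas the authors used, so these definitions and the function name `overlap_metrics` are assumptions; surface DSC and Hausdorff distance need boundary extraction and are omitted.

```python
from typing import Dict, Sequence


def overlap_metrics(pred: Sequence[int], ref: Sequence[int]) -> Dict[str, float]:
    """Voxel-overlap metrics for two flattened binary masks of equal length."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, r in zip(pred, ref) if p and r)
    fp = sum(1 for p, r in zip(pred, ref) if p and not r)
    fn = sum(1 for p, r in zip(pred, ref) if not p and r)

    pred_size = tp + fp  # voxels in the predicted mask
    ref_size = tp + fn   # voxels in the reference mask
    if pred_size == 0 or ref_size == 0:
        raise ValueError("both masks must contain foreground voxels")

    total = pred_size + ref_size
    return {
        "dsc": 2 * tp / total,
        "precision": tp / pred_size,
        "recall": tp / ref_size,
        "vs": 1 - abs(pred_size - ref_size) / total,
    }


if __name__ == "__main__":
    prediction = [1, 1, 1, 0, 0, 0]
    reference = [0, 1, 1, 1, 1, 0]
    for name, value in overlap_metrics(prediction, reference).items():
        print(f"{name}: {value:.4f}")
```

Volume similarity only compares mask sizes, so it can be high even when overlap is poor, which is why it is reported alongside DSC rather than instead of it.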

CT Segmentation Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Resource-Efficient Glioma Segmentation on Sub-Saharan MRI

Freedmore Sidume, Oumayma Soula, Joseph Muthui Wacira, YunFei Zhu, Abbas Rabiu Muhammad, Abderrazek Zeraii, Oluwaseun Kalejaye, Hajer Ibrahim, Olfa Gaddour, Brain Halubanza, Dong Zhang, Udunna C Anazodo, Confidence Raymond

•preprint•Sep 11 2025

Gliomas are the most prevalent type of primary brain tumors, and their accurate segmentation from MRI is critical for diagnosis, treatment planning, and longitudinal monitoring. However, the scarcity of high-quality annotated imaging data in Sub-Saharan Africa (SSA) poses a significant challenge for deploying advanced segmentation models in clinical workflows. This study introduces a robust and computationally efficient deep learning framework tailored for resource-constrained settings. We leveraged a 3D Attention UNet architecture augmented with residual blocks and enhanced through transfer learning from pre-trained weights on the BraTS 2021 dataset. Our model was evaluated on 95 MRI cases from the BraTS-Africa dataset, a benchmark for glioma segmentation in SSA MRI data. Despite the limited data quality and quantity, our approach achieved Dice scores of 0.76 for the Enhancing Tumor (ET), 0.80 for Necrotic and Non-Enhancing Tumor Core (NETC), and 0.85 for Surrounding Non-Enhancing FLAIR Hyperintensity (SNFH). These results demonstrate the generalizability of the proposed model and its potential to support clinical decision making in low-resource settings. The compact architecture, approximately 90 MB, and sub-minute per-volume inference time on consumer-grade hardware further underscore its practicality for deployment in SSA health systems. This work contributes toward closing the gap in equitable AI for global health by empowering underserved regions with high-performing and accessible medical imaging solutions.

MRI Segmentation Neurological Methodology In Silico Benchmark SOTA

Medverse: A Universal Model for Full-Resolution 3D Medical Image Segmentation, Transformation and Enhancement

Jiesi Hu, Jianfeng Cao, Yanwu Yang, Chenfei Ye, Yixuan Zhang, Hanyang Peng, Ting Ma

•preprint•Sep 11 2025

In-context learning (ICL) offers a promising paradigm for universal medical image analysis, enabling models to perform diverse image processing tasks without retraining. However, current ICL models for medical imaging remain limited in two critical aspects: they cannot simultaneously achieve high-fidelity predictions and global anatomical understanding, and there is no unified model trained across diverse medical imaging tasks (e.g., segmentation and enhancement) and anatomical regions. As a result, the full potential of ICL in medical imaging remains underexplored. Thus, we present Medverse, a universal ICL model for 3D medical imaging, trained on 22 datasets covering diverse tasks in universal image segmentation, transformation, and enhancement across multiple organs, imaging modalities, and clinical centers. Medverse employs a next-scale autoregressive in-context learning framework that progressively refines predictions from coarse to fine, generating consistent, full-resolution volumetric outputs and enabling multi-scale anatomical awareness. We further propose a blockwise cross-attention module that facilitates long-range interactions between context and target inputs while preserving computational efficiency through spatial sparsity. Medverse is extensively evaluated on a broad collection of held-out datasets covering previously unseen clinical centers, organs, species, and imaging modalities. Results demonstrate that Medverse substantially outperforms existing ICL baselines and establishes a novel paradigm for in-context learning. Code and model weights will be made publicly available. Our model is publicly available at https://github.com/jiesihu/Medverse.

Mixed Modality Segmentation Whole Body Methodology In Silico Academic Lab Open Code Benchmark SOTA

Enhancing 3D Medical Image Understanding with Pretraining Aided by 2D Multimodal Large Language Models

Qiuhui Chen, Xuancheng Yao, Huping Ye, Yi Hong

•preprint•Sep 11 2025

Understanding 3D medical image volumes is critical in the medical field, yet existing 3D medical convolution and transformer-based self-supervised learning (SSL) methods often lack deep semantic comprehension. Recent advancements in multimodal large language models (MLLMs) provide a promising approach to enhance image understanding through text descriptions. To leverage these 2D MLLMs for improved 3D medical image understanding, we propose Med3DInsight, a novel pretraining framework that integrates 3D image encoders with 2D MLLMs via a specially designed plane-slice-aware transformer module. Additionally, our model employs a partial optimal transport based alignment, demonstrating greater tolerance to potential noise in LLM-generated content. Med3DInsight introduces a new paradigm for scalable multimodal 3D medical representation learning without requiring human annotations. Extensive experiments demonstrate our state-of-the-art performance on two downstream tasks, i.e., segmentation and classification, across various public datasets with CT and MRI modalities, outperforming current SSL methods. Med3DInsight can be seamlessly integrated into existing 3D medical image understanding networks, potentially enhancing their performance. Our source code, generated datasets, and pre-trained models will be available at https://github.com/Qybc/Med3DInsight.

Mixed Modality Segmentation Methodology In Silico Academic Lab Open Code Benchmark SOTA

Training With Local Data Remains Important for Deep Learning MRI Prostate Cancer Detection.

Carere SG, Jewell J, Nasute Fauerbach PV, Emerson DB, Finelli A, Ghai S, Haider MA

•papers•Sep 11 2025

Domain shift has been shown to have a major detrimental effect on AI model performance; however, prior studies on domain shift for MRI prostate cancer segmentation have been limited to small or heterogeneous cohorts. Our objective was to assess whether prostate cancer segmentation models trained on local MRI data continue to outperform those trained on external data with cohorts exceeding 1000. We simulated a multi-institutional consortium using the public PICAI dataset (PICAI-TRAIN: 1241 exams, PICAI-TEST: 259) and a local dataset (LOCAL-TRAIN: 1400 exams, LOCAL-TEST: 308). IRB approval was obtained and consent waived. We compared nnUNet-v2 models trained on the combined data (CENTRAL-TRAIN) and separately on PICAI-TRAIN and LOCAL-TRAIN. Accuracy was evaluated using the open source PICAI Score on LOCAL-TEST. Significance was tested using bootstrapping. Just 22% (309/1400) of LOCAL-TRAIN exams would be sufficient to match the performance of a model trained on PICAI-TRAIN. The CENTRAL-TRAIN performance was similar to LOCAL-TRAIN performance, with PICAI Scores [95% CI] of 65 [58-71] and 66 [60-72], respectively. Both of these models exceeded the model trained on PICAI-TRAIN alone, which had a score of 58 [51-64] (P < .002). Reducing training set size did not alter these relative trends. Domain shift limits MRI prostate cancer segmentation performance even when training with over 1000 exams from 3 external institutions. Use of local data is paramount at these scales.
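The bracketed 95% intervals above come from bootstrapping. As a minimal sketch of the general technique, the code below computes a percentile bootstrap confidence interval for an arbitrary statistic by resampling test cases with replacement. The abstract does not describe the authors' procedure, so the resample count, the percentile method, and the names `bootstrap_ci` and `mean` are assumptions; the per-case mean is only a stand-in statistic, and a real analysis would recompute the PICAI Score on each resample.

```python
import random
from typing import Callable, List, Sequence, Tuple


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean, used here as a stand-in statistic."""
    return sum(values) / len(values)


def bootstrap_ci(
    values: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap (1 - alpha) confidence interval for `statistic`."""
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed makes the interval reproducible
    n = len(values)

    stats: List[float] = []
    for _ in range(n_resamples):
        # Draw n cases with replacement and recompute the statistic.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        stats.append(statistic(resample))
    stats.sort()

    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


if __name__ == "__main__":
    scores = [float(v) for v in range(1, 21)]  # hypothetical per-case scores
    print(bootstrap_ci(scores, mean))
```

To compare two models on the same test set, the same function can be applied to the per-case score differences; an interval that excludes zero suggests a difference at the chosen level.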

MRI Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA Open Dataset

U-ConvNext: A Robust Approach to Glioma Segmentation in Intraoperative Ultrasound.

Vahdani AM, Rahmani M, Pour-Rashidi A, Ahmadian A, Farnia P

•papers•Sep 11 2025

Intraoperative tumor imaging is critical to achieving maximal safe resection during neurosurgery, especially for low-grade glioma resection. Given the convenience of ultrasound as an intraoperative imaging modality, but also the limitations of the ultrasound modality and the time-consuming process of manual tumor segmentation, we propose a learning-based model for the accurate segmentation of low-grade gliomas in ultrasound images. We developed a novel U-net-based architecture adopting the block architecture of the ConvNext V2 model, titled U-ConvNext, which also incorporates various architectural improvements including global response normalization, fine-tuned kernel sizes, and inception layers. We also adopted the CutMix data augmentation technique for semantic segmentation, aiming for enhanced texture detection. Conformal segmentation, a novel approach to conformal prediction for binary semantic segmentation, was also developed for uncertainty quantification, providing calibrated measures of model uncertainty in a visual format. The proposed models were trained and evaluated on three subsets of images in the RESECT dataset and achieved hold-out test Dice scores of 84.63%, 74.52%, and 90.82% on the "before," "during," and "after" subsets, respectively, which indicates increases of ~13-31% compared to the state of the art. Furthermore, external evaluation on the ReMIND dataset indicated a robust performance (Dice score of 79.17% [95% CI: 77.82-81.62]) and only a moderate decline of < 3% in expected calibration error. Our approach integrates various innovations in model design, model training, and uncertainty quantification, achieving improved results on the segmentation of low-grade glioma in ultrasound images during neurosurgery.
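Expected calibration error (ECE), mentioned above, can be sketched for binary pixel-wise predictions as follows: pixels are binned by the confidence of the predicted class, and the gap between mean confidence and accuracy is averaged over bins, weighted by bin size. The abstract does not state how ECE was computed; the equal-width binning, the 0.5 decision threshold, and the function name `expected_calibration_error` are assumptions of this sketch.

```python
from typing import Sequence


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """ECE for binary predictions given foreground probabilities and 0/1 labels."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equally long")

    counts = [0] * n_bins        # number of pixels per confidence bin
    conf_sums = [0.0] * n_bins   # summed confidence per bin
    correct = [0] * n_bins       # correctly classified pixels per bin

    for p, y in zip(probs, labels):
        pred = 1 if p >= 0.5 else 0
        conf = p if pred == 1 else 1.0 - p  # confidence of the predicted class
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        correct[idx] += int(pred == y)

    total = len(probs)
    ece = 0.0
    for c, s, k in zip(counts, conf_sums, correct):
        if c:
            # Weighted gap between accuracy and mean confidence in this bin.
            ece += (c / total) * abs(k / c - s / c)
    return ece


if __name__ == "__main__":
    print(expected_calibration_error([0.9, 0.9], [0, 0]))
```

A model that predicts 0.8 and is right on 80% of those pixels scores an ECE near zero, while confident wrong predictions drive the value up.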

Ultrasound Segmentation Neurological Methodology In Silico Academic Lab Benchmark SOTA

Enhancing Oral Health Diagnostics With Hyperspectral Imaging and Computer Vision: Clinical Dataset Study.

Römer P, Ponciano JJ, Kloster K, Siegberg F, Plaß B, Vinayahalingam S, Al-Nawas B, Kämmerer PW, Klauer T, Thiem D

•papers•Sep 11 2025

Diseases of the oral cavity, including oral squamous cell carcinoma, pose major challenges to health care worldwide due to their late diagnosis and complicated differentiation of oral tissues. The combination of endoscopic hyperspectral imaging (HSI) and deep learning (DL) models offers a promising approach to the demand for modern, noninvasive tissue diagnostics. This study presents a large-scale in vivo dataset designed to support DL-based segmentation and classification of healthy oral tissues. This study aimed to develop a comprehensive, annotated endoscopic HSI dataset of the oral cavity and to demonstrate automated, reliable differentiation of intraoral tissue structures by integrating endoscopic HSI with advanced machine learning methods. A total of 226 participants (166 women [73.5%], 60 men [26.5%], aged 24-87 years) were examined using an endoscopic HSI system, capturing spectral data in the range of 500 to 1000 nm. Oral structures in red, green, and blue and HSI scans were annotated using RectLabel Pro (by Ryo Kawamura). DeepLabv3 (Google Research) with a ResNet-50 backbone was adapted for endoscopic HSI segmentation. The model was trained for 50 epochs on 70% of the dataset, with 30% for evaluation. Performance metrics (precision, recall, and F1-score) confirmed its efficacy in distinguishing oral tissue types. DeepLabv3 (ResNet-101) and U-Net (EfficientNet-B0/ResNet-50) achieved the highest overall F1-scores of 0.857 and 0.84, respectively, particularly excelling in segmenting the mucosa (0.915), retractor (0.94), tooth (0.90), and palate (0.90). Variability analysis confirmed high spectral diversity across tissue classes, supporting the dataset's complexity and authenticity for realistic clinical conditions. The presented dataset addresses a key gap in oral health imaging by developing and validating robust DL algorithms for endoscopic HSI data. It enables accurate classification of oral tissue and paves the way for future applications in individualized noninvasive pathological tissue analysis, early cancer detection, and intraoperative diagnostics of oral diseases.

OCT Segmentation Dataset Release In Silico Academic Lab Open Dataset

A Gabor-enhanced deep learning approach with dual-attention for 3D MRI brain tumor segmentation.

Chamseddine E, Tlig L, Chaari L, Sayadi M

•papers•Sep 11 2025

Robust 3D brain tumor MRI segmentation is important for diagnosis and treatment. However, tumor heterogeneity, irregular shape, and complicated texture make it challenging. Deep learning has transformed medical image analysis by extracting features directly from the data, greatly enhancing the accuracy of segmentation. The functionality of deep models can be complemented by adding modules like texture-sensitive customized convolution layers and attention mechanisms. These components allow the model to focus its attention on pertinent locations and boundary definition problems. In this paper, we propose a texture-aware deep learning method that improves the U-Net structure by adding a trainable Gabor convolution layer at the input to capture rich textural features. Such features are fused in parallel with standard convolutional outputs to better represent tumors. The model also utilizes dual attention modules: Squeeze-and-Excitation blocks in the encoder for dynamically adjusting channel-wise features, and Attention Gates for boosting skip connections by suppressing trivial areas and weighting tumor areas. The behavior of each module is explored through explainable artificial intelligence methods to ensure interpretability. To address class imbalance, a weighted combined loss function is applied. The model achieves Dice coefficients of 91.62%, 89.92%, and 88.86% for whole tumor, tumor core, and enhancing tumor, respectively, on the BraTS2021 dataset. Large-scale quantitative and qualitative evaluations on BraTS2021, validated on BraTS benchmarks, demonstrate the accuracy and robustness of the proposed model. The proposed approach outperforms the benchmark U-Net and other state-of-the-art segmentation methods, offering a robust and interpretable solution for clinical use.
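For readers unfamiliar with Gabor filters, the sketch below builds the real part of a 2D Gabor kernel, a Gaussian envelope multiplied by a cosine carrier along orientation theta. In a trainable Gabor layer, parameters such as sigma, theta, and the wavelength would be learned rather than fixed. The abstract does not give the authors' parameterization (and their layer operates on 3D volumes), so this standard 2D formula, its default values, and the name `gabor_kernel` are assumptions for illustration only.

```python
import math
from typing import List


def gabor_kernel(
    size: int,
    sigma: float,
    theta: float,
    lambd: float,
    gamma: float = 1.0,
    psi: float = 0.0,
) -> List[List[float]]:
    """Real part of a 2D Gabor kernel on a size x size grid (size must be odd)."""
    if size % 2 == 0 or size < 1:
        raise ValueError("size must be a positive odd integer")

    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel: List[List[float]] = []
    for y in range(-half, half + 1):
        row: List[float] = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier wave runs along theta.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


if __name__ == "__main__":
    for row in gabor_kernel(size=5, sigma=2.0, theta=0.0, lambd=4.0):
        print(" ".join(f"{v:+.3f}" for v in row))
```

Varying theta across a bank of such kernels makes the layer sensitive to texture at different orientations, which is the property the abstract relies on for capturing tumor texture.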

MRI Segmentation Neurological Methodology In Silico Benchmark SOTA
