Page 8 of 133 · 1,322 results

Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning

Guoxin Wang, Jun Zhao, Xinyi Liu, Yanbo Liu, Xuyang Cao, Chao Li, Zhuoyun Liu, Qintian Sun, Fangru Zhou, Haoqiang Xing, Zhenhong Yang

arXiv preprint · Sep 23, 2025
Medical imaging provides critical evidence for clinical diagnosis, treatment planning, and surgical decisions, yet most existing imaging models are narrowly focused and require multiple specialized networks, limiting their generalization. Although large-scale language and multimodal models exhibit strong reasoning and multi-task capabilities, real-world clinical applications demand precise visual grounding, multimodal integration, and chain-of-thought reasoning. We introduce Citrus-V, a multimodal medical foundation model that combines image analysis with textual reasoning. The model integrates detection, segmentation, and multimodal chain-of-thought reasoning, enabling pixel-level lesion localization, structured report generation, and physician-like diagnostic inference in a single framework. We propose a novel multimodal training approach and release a curated open-source data suite covering reasoning, detection, segmentation, and document understanding tasks. Evaluations demonstrate that Citrus-V outperforms existing open-source medical models and expert-level imaging systems across multiple benchmarks, delivering a unified pipeline from visual grounding to clinical reasoning and supporting precise lesion quantification, automated reporting, and reliable second opinions.

Enhancing AI-based decision support system with automatic brain tumor segmentation for EGFR mutation classification.

Gökmen N, Kocadağlı O, Cevik S, Aktan C, Eghbali R, Liu C

PubMed · Sep 23, 2025
Glioblastoma (GBM) carries a poor prognosis; epidermal growth factor receptor (EGFR) mutations further shorten survival. We propose a fully automated MRI-based decision-support system (DSS) that segments GBM and classifies EGFR status, reducing reliance on invasive biopsy. The segmentation module (UNet SI) fuses multiresolution, entropy-ranked shearlet features with CNN features, preserving fine detail through identity long-skip connections, yielding a lightweight 1.9M-parameter network. Tumor masks are fed to an Inception-ResNet-v2 classifier via a 512-D bottleneck. The pipeline was five-fold cross-validated on 98 contrast-enhanced T1-weighted scans (Memorial Hospital; Ethics 24.12.2021/008) and externally validated on BraTS 2019. On the Memorial cohort, UNet SI achieved Dice 0.873, Jaccard 0.853, SSIM 0.992, and HD95 24.19 mm. EGFR classification reached accuracy 0.960, precision 1.000, recall 0.871, and AUC 0.94, surpassing published state-of-the-art results. Inference time is ≤ 0.18 s per slice on a 4 GB GPU. By combining shearlet-enhanced segmentation with streamlined classification, the DSS delivers superior EGFR prediction and is suitable for integration into routine clinical workflows.
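As a rough illustration of the entropy-ranking step this abstract mentions, the sketch below scores feature channels by Shannon entropy and keeps the top k; the shearlet transform itself is omitted, and the (C, H, W) feature stack, bin count, and k are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical entropy-based channel ranking; the shearlet filter bank that
# would produce `features` in the paper is not reproduced here.
import numpy as np

def rank_by_entropy(features: np.ndarray, k: int = 8, bins: int = 64):
    """Keep the k channels of a (C, H, W) feature stack with highest Shannon entropy."""
    entropies = []
    for channel in features:
        hist, _ = np.histogram(channel, bins=bins)
        p = hist / max(hist.sum(), 1)
        p = p[p > 0]
        entropies.append(-(p * np.log2(p)).sum())
    top = np.argsort(entropies)[::-1][:k]   # indices of the most informative channels
    return features[top], top

# Example with a random stand-in for a 32-channel multiresolution decomposition.
selected, idx = rank_by_entropy(np.random.rand(32, 128, 128))
```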

Volume Fusion-based Self-Supervised Pretraining for 3D Medical Image Segmentation.

Wang G, Fu J, Wu J, Luo X, Zhou Y, Liu X, Li K, Lin J, Shen B, Zhang S

PubMed · Sep 22, 2025
The performance of deep learning models for medical image segmentation is often limited in scenarios where training data or annotations are scarce. Self-Supervised Learning (SSL) is an appealing solution to this dilemma due to its ability to learn features from large amounts of unannotated images. Existing SSL methods have focused on pretraining either an encoder for global feature representation or an encoder-decoder structure for image restoration, where the gap between pretext and downstream tasks limits the usefulness of pretrained decoders in downstream segmentation. In this work, we propose a novel SSL strategy named Volume Fusion (VolF) for pretraining 3D segmentation models. It minimizes the gap between pretext and downstream tasks by introducing a pseudo-segmentation pretext task: two sub-volumes are fused by a discretized block-wise fusion coefficient map, and the model takes the fused result as input and predicts the category of the fusion coefficient for each voxel, which can be trained with standard supervised segmentation loss functions without manual annotations. Experiments with an abdominal CT dataset for pretraining and both in-domain and out-of-domain downstream datasets showed that VolF yielded a large performance gain over training from scratch, with faster convergence, and outperformed several state-of-the-art SSL methods. In addition, it generalizes across different network structures, and the learned features transfer well to different body parts and modalities.
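The pretext task lends itself to a compact sketch. The following is a minimal, hedged rendering of the fusion step; the block size, number of coefficient levels, and patch shape are illustrative assumptions, not the paper's settings.

```python
# Sketch of the Volume Fusion (VolF) pretext task: fuse two sub-volumes with
# a discretized block-wise coefficient map and use the per-voxel coefficient
# category as a free pseudo-segmentation label.
import numpy as np

def volume_fusion_sample(vol_a, vol_b, num_levels=4, block=16):
    d, h, w = vol_a.shape                     # shapes assumed divisible by `block`
    grid = np.random.randint(0, num_levels, size=(d // block, h // block, w // block))
    labels = np.kron(grid, np.ones((block, block, block), dtype=int))  # voxel-level categories
    alpha = labels / (num_levels - 1)         # map categories to fusion weights in [0, 1]
    fused = (1.0 - alpha) * vol_a + alpha * vol_b
    return fused.astype(np.float32), labels   # input and target for a segmentation loss

# Two random 64^3 patches stand in for real unannotated CT sub-volumes.
a, b = np.random.rand(64, 64, 64), np.random.rand(64, 64, 64)
x, y = volume_fusion_sample(a, b)
```

Because the target is a dense categorical map, any off-the-shelf segmentation loss (e.g., cross-entropy plus Dice) can supervise pretraining without manual labels.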

Beyond Diagnosis: Evaluating Multimodal LLMs for Pathology Localization in Chest Radiographs

Advait Gosai, Arun Kavishwar, Stephanie L. McNamara, Soujanya Samineni, Renato Umeton, Alexander Chowdhury, William Lotter

arXiv preprint · Sep 22, 2025
Recent work has shown promising performance of frontier large language models (LLMs) and their multimodal counterparts in medical quizzes and diagnostic tasks, highlighting their potential for broad clinical utility given their accessible, general-purpose nature. However, beyond diagnosis, a fundamental aspect of medical image interpretation is the ability to localize pathological findings. Evaluating localization not only has clinical and educational relevance but also provides insight into a model's spatial understanding of anatomy and disease. Here, we systematically assess two general-purpose multimodal LLMs (MLLMs), GPT-4 and GPT-5, and a domain-specific model (MedGemma) in their ability to localize pathologies on chest radiographs, using a prompting pipeline that overlays a spatial grid and elicits coordinate-based predictions. Averaged across nine pathologies in the CheXlocalize dataset, GPT-5 exhibited a localization accuracy of 49.7%, followed by GPT-4 (39.1%) and MedGemma (17.7%), all lower than a task-specific CNN baseline (59.9%) and a radiologist benchmark (80.1%). Despite modest overall performance, error analysis revealed that GPT-5's predictions fell largely in anatomically plausible regions, just not always precisely localized. GPT-4 performed well on pathologies with fixed anatomical locations, but struggled with spatially variable findings and produced anatomically implausible predictions more frequently. MedGemma demonstrated the lowest performance on all pathologies, showing limited capacity to generalize to this novel task. Our findings highlight both the promise and the limitations of current MLLMs in medical imaging and underscore the importance of integrating them with task-specific tools for reliable use.
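The grid-overlay prompting idea can be sketched simply; the 8×8 grid, red styling, prompt wording, and file name below are assumptions for illustration, not the paper's exact pipeline.

```python
# Overlay a labeled spatial grid so an MLLM can answer in cell coordinates.
from PIL import Image, ImageDraw

def overlay_grid(img: Image.Image, n: int = 8) -> Image.Image:
    img = img.convert("RGB").copy()
    draw = ImageDraw.Draw(img)
    w, h = img.size
    for i in range(1, n):                                   # grid lines
        draw.line([(i * w // n, 0), (i * w // n, h)], fill="red")
        draw.line([(0, i * h // n), (w, i * h // n)], fill="red")
    for r in range(n):                                      # row,col label in each cell
        for c in range(n):
            draw.text((c * w // n + 2, r * h // n + 2), f"{r},{c}", fill="red")
    return img

prompt = ("This chest radiograph has an 8x8 grid labeled row,column. "
          "Reply only with the grid cell(s) that contain the finding: pneumothorax.")
# gridded = overlay_grid(Image.open("cxr.png"))  # hypothetical file; send with `prompt` to the MLLM
```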

A CNN Autoencoder for Learning Latent Disc Geometry from Segmented Lumbar Spine MRI.

Perrone M, Moore DM, Ukeba D, Martin JT

PubMed · Sep 22, 2025
Low back pain is the world's leading cause of disability, and pathology of the lumbar intervertebral discs is frequently considered a driver of pain. The geometric characteristics of intervertebral discs offer valuable insights into their mechanical behavior and pathological conditions. In this study, we present a convolutional neural network (CNN) autoencoder to extract latent features from segmented disc MRI. Additionally, we interpret these latent features and demonstrate their utility in identifying disc pathology, providing a complementary perspective to standard geometric measures. We examined 195 sagittal T1-weighted MRI scans of the lumbar spine from a publicly available multi-institutional dataset. The proposed pipeline includes five main steps: (1) segmenting MRI, (2) training the CNN autoencoder and extracting latent geometric features, (3) measuring standard geometric features, (4) predicting disc narrowing with latent and/or standard geometric features, and (5) determining the relationship between latent and standard geometric features. Our segmentation model achieved an intersection over union (IoU) of 0.82 (95% CI 0.80-0.84) and a Dice similarity coefficient (DSC) of 0.90 (95% CI 0.89-0.91). The minimum bottleneck size for which the CNN autoencoder converged was 4 × 1 after 350 epochs (IoU 0.9984, 95% CI 0.9979-0.9989). Combining latent and geometric features improved predictions of disc narrowing compared to using either feature set alone. Latent geometric features encoded disc shape and angular orientation. This study presents a CNN autoencoder that extracts latent features from segmented lumbar disc MRI, enhancing disc-narrowing prediction and feature interpretability. Future work will integrate disc voxel intensity to analyze composition.
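A minimal PyTorch sketch of such an autoencoder is below; the 4-dimensional latent matches the abstract's 4 × 1 bottleneck, but the 64 × 64 mask resolution and layer sizes are illustrative assumptions.

```python
# Convolutional autoencoder over binary disc masks with a 4-D latent bottleneck.
import torch
import torch.nn as nn

class DiscAutoencoder(nn.Module):
    def __init__(self, latent_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(                      # 1x64x64 -> latent_dim
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 8 * 8, latent_dim),
        )
        self.decoder = nn.Sequential(                      # latent_dim -> 1x64x64
            nn.Linear(latent_dim, 32 * 8 * 8), nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)        # latent geometric features, reusable downstream
        return self.decoder(z), z

recon, z = DiscAutoencoder()(torch.rand(2, 1, 64, 64))     # batch of segmented disc masks
```

The latent vector z could then feed a disc-narrowing classifier alongside standard geometric measures, in the spirit of the pipeline's step (4).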

A multi-class segmentation model of deep learning on contrast-enhanced computed tomography to segment and differentiate lipid-poor adrenal nodules: a dual-center study.

Bai X, Wu Z, Lu L, Zhang H, Zheng H, Zhang Y, Liu X, Zhang Z, Zhang G, Zhang D, Jin Z, Sun H

PubMed · Sep 22, 2025
To develop a deep-learning model for segmenting and classifying adrenal nodules as either lipid-poor adenoma (LPA) or nodular hyperplasia (NH) on contrast-enhanced computed tomography (CECT) images. This retrospective dual-center study included 164 patients (median age 51.0 years; 93 females) with pathologically confirmed LPA or NH. The model was trained on 128 patients from the internal center and validated on 36 external cases. Radiologists annotated adrenal glands and nodules on 1-mm portal-venous-phase CT images. We proposed Mamba-USeg, a novel state-space model (SSM)-based multi-class segmentation method that performs simultaneous segmentation and classification. Performance was evaluated using the mean Dice similarity coefficient (mDSC) for segmentation and sensitivity/specificity for classification, with comparisons against MultiResUNet and CPFNet. In per-slice segmentation, the model yielded an mDSC of 0.855 for the adrenal gland; for nodule segmentation, it achieved mDSCs of 0.869 (LPA) and 0.863 (NH), significantly outperforming the two previous models, MultiResUNet (LPA, p < 0.001; NH, p = 0.014) and CPFNet (LPA, p = 0.003; NH, p = 0.023). Per-slice classification demonstrated sensitivity of 95.3% (95% confidence interval [CI] 91.3-96.6%) and specificity of 92.7% (95% CI 91.9-93.6%) for LPA, and sensitivity of 94.2% (95% CI 89.7-97.7%) and specificity of 91.5% (95% CI 90.4-92.4%) for NH. Patient-level classification accuracy on the external cohort was 91.7% (95% CI 76.8-98.9%). The proposed multi-class segmentation model can accurately segment and differentiate between LPA and NH on CECT images, outperforming existing methods. Question Accurate differentiation between LPA and NH on imaging remains clinically challenging yet critically important for guiding appropriate treatment. Findings Mamba-USeg, a multi-class segmentation model using pixel-level analysis and majority voting, can accurately segment and classify adrenal nodules as LPA or NH. Clinical relevance The proposed model simultaneously segments and classifies adrenal nodules, outperforming previous models in accuracy; it supports clinical decision-making and thereby reduces unnecessary surgeries in adrenal hyperplasia patients.
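The patient-level decision described above (pixel-level predictions aggregated by majority voting) can be sketched as follows; the class-index convention and slice-wise voting scheme are assumptions for illustration, not the paper's exact rule.

```python
# Derive a nodule class from a multi-class segmentation volume by majority vote.
import numpy as np

LPA, NH = 1, 2   # assumed label values in the predicted mask (0 = background)

def classify_nodule(pred_volume: np.ndarray) -> str:
    """pred_volume: (slices, H, W) integer mask from the segmentation model."""
    votes = []
    for sl in pred_volume:                                # per-slice decision first
        lpa, nh = int((sl == LPA).sum()), int((sl == NH).sum())
        if lpa or nh:
            votes.append(LPA if lpa >= nh else NH)
    if not votes:
        return "no nodule detected"
    return "LPA" if votes.count(LPA) >= votes.count(NH) else "NH"
```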

Artificial Intelligence-Assisted Treatment Planning in an Interdisciplinary Rehabilitation in the Esthetic Zone.

Fonseca FJPO, Matias BBR, Pacheco P, Muraoka CSAS, Silva EVF, Sesma N

PubMed · Sep 22, 2025
This case report elucidates the application of an integrated digital workflow in which diagnosis, planning, and execution were enhanced by artificial intelligence (AI), enabling an accurate interdisciplinary esthetic-functional rehabilitation. With AI-powered software, the sequence from orthodontic treatment to final rehabilitation achieved high predictability, addressing the patient's chief complaints. A patient presented with a missing maxillary left central incisor (tooth 11) and dissatisfaction with a removable partial denture. Clinical examination revealed a gummy smile, a deviated midline, and a mesiodistal space disproportionate to the midline. Initial documentation included photographs, intraoral scanning, and cone-beam computed tomography of the maxilla. These data were integrated into digital planning software to create an interdisciplinary plan. The workflow included prosthetically guided orthodontic treatment with aligners, a motivational mockup, guided implant surgery, peri-implant soft-tissue management, and final prosthetic rehabilitation using a CAD/CAM approach. This digital workflow enhanced communication among the multidisciplinary team and with the patient, ensuring highly predictable esthetic and functional outcomes. Comprehensive digital workflows improve diagnostic accuracy, streamline planning with AI, and facilitate patient understanding. This approach increases patient satisfaction, supports interdisciplinary collaboration, and promotes treatment adherence.

Automated Labeling of Intracranial Arteries with Uncertainty Quantification Using Deep Learning

Javier Bisbal, Patrick Winter, Sebastian Jofre, Aaron Ponce, Sameer A. Ansari, Ramez Abdalla, Michael Markl, Oliver Welin Odeback, Sergio Uribe, Cristian Tejos, Julio Sotelo, Susanne Schnell, David Marlevi

arXiv preprint · Sep 22, 2025
Accurate anatomical labeling of intracranial arteries is essential for cerebrovascular diagnosis and hemodynamic analysis but remains time-consuming and subject to interoperator variability. We present a deep learning-based framework for automated artery labeling from 3D Time-of-Flight Magnetic Resonance Angiography (3D ToF-MRA) segmentations (n=35), incorporating uncertainty quantification to enhance interpretability and reliability. We evaluated three convolutional neural network architectures: (1) a UNet with residual encoder blocks, reflecting commonly used baselines in vascular labeling; (2) CS-Net, an attention-augmented UNet incorporating channel and spatial attention mechanisms for enhanced curvilinear structure recognition; and (3) nnUNet, a self-configuring framework that automates preprocessing, training, and architectural adaptation based on dataset characteristics. Among these, nnUNet achieved the highest labeling performance (average Dice score: 0.922; average surface distance: 0.387 mm), with improved robustness in anatomically complex vessels. To assess predictive confidence, we implemented test-time augmentation (TTA) and introduced a novel coordinate-guided strategy to reduce interpolation errors during augmented inference. The resulting uncertainty maps reliably indicated regions of anatomical ambiguity, pathological variation, or manual labeling inconsistency. We further validated clinical utility by comparing flow velocities derived from automated and manual labels in co-registered 4D Flow MRI datasets, observing close agreement with no statistically significant differences. Our framework offers a scalable, accurate, and uncertainty-aware solution for automated cerebrovascular labeling, supporting downstream hemodynamic analysis and facilitating clinical integration.
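Test-time augmentation uncertainty of this kind is straightforward to sketch; the flip-based augmentations and entropy heuristic below are generic stand-ins, and the paper's coordinate-guided strategy for reducing interpolation error is not reproduced.

```python
# TTA uncertainty: average softmax over flipped copies, score voxels by entropy.
import torch

def tta_uncertainty(model, vol: torch.Tensor, spatial_dims=(2, 3, 4)):
    """vol: (1, C, D, H, W) input volume. Returns mean softmax and per-voxel entropy."""
    probs = [torch.softmax(model(vol), dim=1)]
    for d in spatial_dims:                                # flip, predict, un-flip
        out = model(torch.flip(vol, dims=(d,)))
        probs.append(torch.flip(torch.softmax(out, dim=1), dims=(d,)))
    mean = torch.stack(probs).mean(dim=0)
    entropy = -(mean * torch.log(mean.clamp_min(1e-8))).sum(dim=1)   # (1, D, H, W)
    return mean, entropy                                  # high entropy = ambiguous labeling
```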

MRI-based habitat analysis for pathologic response prediction after neoadjuvant chemoradiotherapy in rectal cancer: a multicenter study.

Chen Q, Zhang Q, Li Z, Zhang S, Xia Y, Wang H, Lu Y, Zheng A, Shao C, Shen F

PubMed · Sep 22, 2025
To investigate MRI-based habitat analysis for its value in predicting pathologic response following neoadjuvant chemoradiotherapy (nCRT) in rectal cancer (RC) patients. 1021 RC patients from three hospitals were divided into a training and test set (n = 319), an internal validation set (n = 317), and external validation sets 1 (n = 158) and 2 (n = 227). Deep learning was used to automatically segment the entire lesion on high-resolution MRI. Simple linear iterative clustering (SLIC) was used to divide each tumor into subregions, from which radiomics features were extracted. The optimal number of clusters reflecting the diversity of the tumor ecosystem was determined. Finally, four models were developed: clinical, intratumoral heterogeneity (ITH)-based, radiomics, and fusion models, and their performance was evaluated. The impact of nCRT on disease-free survival (DFS) was further analyzed. The DeLong test revealed that the fusion model (AUCs of 0.867, 0.851, 0.852, and 0.818 in the four cohorts, respectively), the radiomics model (0.831, 0.694, 0.753, and 0.705), and the ITH model (0.790, 0.786, 0.759, and 0.722) were all superior to the clinical model (0.790, 0.605, 0.735, and 0.704). However, no significant differences were detected between the fusion and ITH models. Patients stratified using the fusion model showed significant differences in DFS between the good and poor response groups (all p < 0.05 in the four sets). The fusion model combining clinical factors, radiomics features, and ITH features may help predict pathologic response in RC patients receiving nCRT. Question Identifying rectal cancer (RC) patients likely to benefit from neoadjuvant chemoradiotherapy (nCRT) before treatment is crucial. Findings The fusion model shows the best performance in predicting response after neoadjuvant chemoradiotherapy. Clinical relevance The fusion model integrates clinical characteristics, radiomics features, and intratumoral heterogeneity (ITH) features and can be applied to predict response to nCRT in RC patients, offering potential benefits for personalized treatment strategies.
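A bare-bones version of the habitat step is sketched below: SLIC subregions within the lesion mask, simple intensity statistics as stand-ins for the radiomics features, and k-means in place of the paper's cluster-number selection. All parameter values are illustrative assumptions.

```python
# Habitat analysis sketch: SLIC subregions -> per-subregion features -> clustering.
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def tumor_habitats(image: np.ndarray, mask: np.ndarray, n_segments=30, n_clusters=3):
    """image, mask: a 2-D slice and its binary lesion mask."""
    sub = slic(image, n_segments=n_segments, mask=mask.astype(bool),
               channel_axis=None, start_label=1)          # subregions inside the lesion
    feats, ids = [], []
    for lbl in np.unique(sub[sub > 0]):
        vals = image[sub == lbl]
        feats.append([vals.mean(), vals.std(), np.percentile(vals, 90)])
        ids.append(int(lbl))
    habitats = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(np.array(feats))
    return sub, dict(zip(ids, habitats))                  # subregion label -> habitat id
```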

Training the next generation of physicians for artificial intelligence-assisted clinical neuroradiology: ASNR MICCAI Brain Tumor Segmentation (BraTS) 2025 Lighthouse Challenge education platform

Raisa Amiruddin, Nikolay Y. Yordanov, Nazanin Maleki, Pascal Fehringer, Athanasios Gkampenis, Anastasia Janas, Kiril Krantchev, Ahmed Moawad, Fabian Umeh, Salma Abosabie, Sara Abosabie, Albara Alotaibi, Mohamed Ghonim, Mohanad Ghonim, Sedra Abou Ali Mhana, Nathan Page, Marko Jakovljevic, Yasaman Sharifi, Prisha Bhatia, Amirreza Manteghinejad, Melisa Guelen, Michael Veronesi, Virginia Hill, Tiffany So, Mark Krycia, Bojan Petrovic, Fatima Memon, Justin Cramer, Elizabeth Schrickel, Vilma Kosovic, Lorenna Vidal, Gerard Thompson, Ichiro Ikuta, Basimah Albalooshy, Ali Nabavizadeh, Nourel Hoda Tahon, Karuna Shekdar, Aashim Bhatia, Claudia Kirsch, Gennaro D'Anna, Philipp Lohmann, Amal Saleh Nour, Andriy Myronenko, Adam Goldman-Yassen, Janet R. Reid, Sanjay Aneja, Spyridon Bakas, Mariam Aboian

arXiv preprint · Sep 21, 2025
High-quality reference-standard image data creation by neuroradiology experts for automated clinical tools can be a powerful vehicle for neuroradiology & artificial intelligence education. We developed a multimodal educational approach for students and trainees during the MICCAI Brain Tumor Segmentation Lighthouse Challenge 2025, a landmark initiative to develop accurate brain tumor segmentation algorithms. Fifty-six medical students & radiology trainees volunteered to annotate brain tumor MR images for the BraTS challenges of 2023 & 2024, guided by faculty-led didactics on neuropathology MRI. Among the 56 annotators, 14 select volunteers were then paired with neuroradiology faculty for guided one-on-one annotation sessions for BraTS 2025. Lectures on neuroanatomy, pathology & AI, journal clubs & data scientist-led workshops were organized online. Annotators & audience members completed surveys on their perceived knowledge before & after the annotations & lectures, respectively. Fourteen coordinators, each paired with a neuroradiologist, completed the data annotation process, averaging 1322.9 ± 760.7 hours per dataset per pair and 1200 segmentations in total. On a scale of 1-10, annotation coordinators reported a significant increase in familiarity with image segmentation software pre- to post-annotation, from an initial average of 6 ± 2.9 to a final average of 8.9 ± 1.1, and a significant increase in familiarity with brain tumor features, from an initial average of 6.2 ± 2.4 to a final average of 8.1 ± 1.2. We demonstrate an innovative offering for providing neuroradiology & AI education through an image segmentation challenge to enhance understanding of algorithm development, reinforce the concept of a data reference standard, and diversify opportunities for AI-driven image analysis among future physicians.