
MedReasoner: Reinforcement Learning Drives Reasoning Grounding from Clinical Thought to Pixel-Level Precision

Zhonghao Yan, Muxi Diao, Yuxuan Yang, Jiayuan Xu, Kaizhou Zhang, Ruoyan Jing, Lele Yang, Yanxi Liu, Kongming Liang, Zhanyu Ma

arxiv logopreprintAug 11 2025
Accurately grounding regions of interest (ROIs) is critical for diagnosis and treatment planning in medical imaging. While multimodal large language models (MLLMs) combine visual perception with natural language, current medical-grounding pipelines still rely on supervised fine-tuning with explicit spatial hints, making them ill-equipped to handle the implicit queries common in clinical practice. This work makes three core contributions. We first define Unified Medical Reasoning Grounding (UMRG), a novel vision-language task that demands clinical reasoning and pixel-level grounding. Second, we release U-MRG-14K, a dataset of 14K samples featuring pixel-level masks alongside implicit clinical queries and reasoning traces, spanning 10 modalities, 15 super-categories, and 108 specific categories. Finally, we introduce MedReasoner, a modular framework that distinctly separates reasoning from segmentation: an MLLM reasoner is optimized with reinforcement learning, while a frozen segmentation expert converts spatial prompts into masks, with alignment achieved through format and accuracy rewards. MedReasoner achieves state-of-the-art performance on U-MRG-14K and demonstrates strong generalization to unseen clinical queries, underscoring the significant promise of reinforcement learning for interpretable medical grounding.
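The reward design sketched above, a format term plus an accuracy term aligning a text-only reasoner with a frozen segmentation expert, could look roughly like the following. The `<think>/<answer>` template, the JSON `box` field, and the simple additive weighting are illustrative assumptions for this sketch, not the paper's exact specification:

```python
import json
import re

def format_reward(output: str) -> float:
    """1.0 if the reasoner's output follows the expected
    <think>...</think><answer>{...}</answer> template, else 0.0.
    (The template is a hypothetical stand-in for the paper's format reward.)"""
    pattern = r"^<think>.+</think>\s*<answer>\{.+\}</answer>$"
    return 1.0 if re.match(pattern, output.strip(), re.DOTALL) else 0.0

def box_iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def reward(output: str, gt_box) -> float:
    """Scalar RL reward = format term + accuracy (IoU) term."""
    r_fmt = format_reward(output)
    if r_fmt == 0.0:
        return 0.0  # malformed output earns nothing
    m = re.search(r"<answer>(\{.+\})</answer>", output, re.DOTALL)
    pred = json.loads(m.group(1))["box"]
    return r_fmt + box_iou(pred, gt_box)
```

In this framing the spatial prompt (here a box) is what the frozen segmentation expert would consume to produce the final mask; only the reasoner receives gradient signal.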

LR-COBRAS: A logic reasoning-driven interactive medical image data annotation algorithm.

Zhou N, Cao J

pubmed logopapersAug 11 2025
The volume of image data generated in the medical field is continuously increasing. Manual annotation is both costly and prone to human error. Additionally, deep learning-based medical image algorithms rely on large, accurately annotated training datasets, which are expensive to produce and whose scarcity often leads to training instability. This study introduces LR-COBRAS, an interactive computer-aided data annotation algorithm designed for medical experts. LR-COBRAS aims to help healthcare professionals achieve more precise annotation outcomes through interactive processes, thereby optimizing medical image annotation tasks. A logic reasoning module enriches the must-link and cannot-link constraints gathered during interaction: by applying rules such as symmetry, transitivity, and consistency, it automatically derives implied constraint relationships, reducing the number of user interactions required and improving clustering accuracy, which allows LR-COBRAS to effectively balance automation with clinical relevance. Experimental results on the MedMNIST+ and ChestX-ray8 datasets demonstrate that LR-COBRAS significantly outperforms existing methods in clustering accuracy and efficiency while imposing a lighter interaction burden, showcasing superior robustness and applicability. This algorithm provides a novel solution for intelligent medical image analysis. The source code for our implementation is available at https://github.com/cjw-bbxc/MILR-COBRAS.
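The constraint-derivation step the abstract describes, propagating must-link and cannot-link pairs via symmetry, transitivity, and consistency, can be sketched as a small fixpoint computation. This is a generic sketch of that kind of logic module, not LR-COBRAS's actual implementation:

```python
def close_constraints(must, cannot):
    """Derive implied must-link / cannot-link pairs from user feedback.
    Rules: symmetry (pairs stored as frozensets), transitivity of
    must-link, ML(a,b) & CL(b,c) => CL(a,c), and a consistency check
    that no pair is both. A generic sketch of a logic-reasoning module."""
    ml = {frozenset(p) for p in must}
    cl = {frozenset(p) for p in cannot}
    changed = True
    while changed:
        changed = False
        # transitivity of must-link
        for a, b in [tuple(p) for p in ml]:
            for c, d in [tuple(p) for p in ml]:
                shared = {a, b} & {c, d}
                if len(shared) == 1:
                    new = frozenset(({a, b} | {c, d}) - shared)
                    if new not in ml:
                        ml.add(new)
                        changed = True
        # must-link propagates cannot-link
        for a, b in [tuple(p) for p in ml]:
            for pair in list(cl):
                for x, partner in ((a, b), (b, a)):
                    if x in pair:
                        other = next(iter(pair - {x}))
                        new = frozenset({partner, other})
                        if len(new) == 2 and new not in cl:
                            cl.add(new)
                            changed = True
        if ml & cl:
            raise ValueError("inconsistent constraints")
    return ml, cl
```

Each derived pair is one user query the interactive clustering loop no longer needs to ask, which is how such a module reduces interaction frequency.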

The eyelid and pupil dynamics underlying stress levels in awake mice.

Zeng, H.

biorxiv logopreprintAug 10 2025
Stress is a natural response of the body to perceived threats, and it can have both positive and negative effects on brain hemodynamics. Stress-induced changes in pupil and eyelid size/shape have been used as a biomarker in several fMRI studies. However, knowledge remains limited regarding the behavior of pupil and eyelid dynamics, particularly in animal models. In the present study, pupil and eyelid dynamics were carefully investigated and characterized in a newly developed awake rodent fMRI protocol. Leveraging deep learning techniques, mouse pupil and eyelid diameters were extracted and analyzed during different training and imaging phases of the project. Our findings demonstrate a consistent downward trend in pupil and eyelid dynamics under a meticulously designed training protocol, suggesting that pupil and eyelid behavior can serve as a reliable indicator of stress levels and motion artifacts in awake fMRI studies. The recording platform not only facilitates awake animal MRI studies but also holds potential for numerous other research areas, owing to its non-invasive nature and straightforward implementation.

SynMatch: Rethinking Consistency in Medical Image Segmentation with Sparse Annotations

Zhiqiang Shen, Peng Cao, Xiaoli Liu, Jinzhu Yang, Osmar R. Zaiane

arxiv logopreprintAug 10 2025
Label scarcity remains a major challenge in deep learning-based medical image segmentation. Recent studies use strong-weak pseudo supervision to leverage unlabeled data. However, performance is often hindered by inconsistencies between pseudo labels and their corresponding unlabeled images. In this work, we propose SynMatch, a novel framework that sidesteps the need for improving pseudo labels by synthesizing images to match them instead. Specifically, SynMatch synthesizes images using texture and shape features extracted from the same segmentation model that generates the corresponding pseudo labels for unlabeled images. This design enables the generation of highly consistent synthesized-image-pseudo-label pairs without requiring any training parameters for image synthesis. We extensively evaluate SynMatch across diverse medical image segmentation tasks under semi-supervised learning (SSL), weakly-supervised learning (WSL), and barely-supervised learning (BSL) settings with increasingly limited annotations. The results demonstrate that SynMatch achieves superior performance, especially in the most challenging BSL setting. For example, it outperforms the recent strong-weak pseudo supervision-based method by 29.71% and 10.05% on the polyp segmentation task with 5% and 10% scribble annotations, respectively. The code will be released at https://github.com/Senyh/SynMatch.
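The core idea, building a consistent image-label pair by synthesizing the image from the pseudo label rather than refining the label, can be illustrated with a toy version where each class region is painted from simple per-class intensity statistics. The `class_stats` summary is a hypothetical stand-in for the texture/shape features SynMatch extracts from the segmentation model:

```python
import random

def synthesize_image(pseudo_label, class_stats, seed=0):
    """Paint a synthetic image that matches a pseudo label by
    construction: each pixel is drawn from the (mean, std) intensity
    statistics of its pseudo class. A toy stand-in for SynMatch's
    feature-based synthesis; no trainable parameters are involved."""
    rng = random.Random(seed)
    return [
        [rng.gauss(*class_stats[c]) for c in row]
        for row in pseudo_label
    ]

# background (class 0) is dark, lesion (class 1) is bright -- illustrative stats
pseudo = [[0, 0, 1],
          [0, 1, 1]]
stats = {0: (0.1, 0.02), 1: (0.8, 0.05)}
img = synthesize_image(pseudo, stats)
```

Because the image is generated from the label, the pair is consistent regardless of how noisy the pseudo label was relative to the original unlabeled image.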

LWT-ARTERY-LABEL: A Lightweight Framework for Automated Coronary Artery Identification

Shisheng Zhang, Ramtin Gharleghi, Sonit Singh, Daniel Moses, Dona Adikari, Arcot Sowmya, Susann Beier

arxiv logopreprintAug 9 2025
Coronary artery disease (CAD) remains the leading cause of death globally, with computed tomography coronary angiography (CTCA) serving as a key diagnostic tool. However, coronary arterial analysis using CTCA, such as identifying artery-specific features from computational modelling, is labour-intensive and time-consuming. Automated anatomical labelling of coronary arteries offers a potential solution, yet the inherent anatomical variability of coronary trees presents a significant challenge. Traditional knowledge-based labelling methods fall short in leveraging data-driven insights, while recent deep-learning approaches often demand substantial computational resources and overlook critical clinical knowledge. To address these limitations, we propose a lightweight method that integrates anatomical knowledge with rule-based topology constraints for effective coronary artery labelling. Our approach achieves state-of-the-art performance on benchmark datasets, providing a promising alternative for automated coronary artery labelling.
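A rule-based topology labeller of the kind described can be sketched as a walk over the extracted coronary tree, assigning names from simple anatomical rules. The branch names, the `direction` feature (a unit vector per segment), and the specific rules below are illustrative assumptions, not the paper's actual rule set:

```python
def label_coronary_tree(children, direction):
    """Label a left coronary tree with simple topology rules:
    the root segment is the left main (LM); of its children, the most
    anterior-heading branch is the LAD and the rest the LCx; children
    of the LAD are diagonals (D1, D2, ...). 'direction' maps each
    segment to an (x, y) unit vector, y pointing anteriorly."""
    labels = {"root": "LM"}
    kids = children["root"]
    lad = max(kids, key=lambda k: direction[k][1])  # most anterior child
    for k in kids:
        labels[k] = "LAD" if k == lad else "LCx"
    for i, k in enumerate(children.get(lad, []), 1):  # diagonals off the LAD
        labels[k] = f"D{i}"
    return labels

# tiny illustrative tree: s1 heads anteriorly, s2 laterally
children = {"root": ["s1", "s2"], "s1": ["s3"]}
direction = {"s1": (0.1, 0.9), "s2": (0.8, 0.2)}
labels = label_coronary_tree(children, direction)
```

The appeal of this style of method is exactly what the abstract claims: a handful of interpretable rules and geometric features, rather than a large network, carry the anatomical knowledge.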

Spatio-Temporal Conditional Diffusion Models for Forecasting Future Multiple Sclerosis Lesion Masks Conditioned on Treatments

Gian Mario Favero, Ge Ya Luo, Nima Fathi, Justin Szeto, Douglas L. Arnold, Brennan Nichyporuk, Chris Pal, Tal Arbel

arxiv logopreprintAug 9 2025
Image-based personalized medicine has the potential to transform healthcare, particularly for diseases that exhibit heterogeneous progression such as Multiple Sclerosis (MS). In this work, we introduce the first treatment-aware spatio-temporal diffusion model that is able to generate future masks demonstrating lesion evolution in MS. Our voxel-space approach incorporates multi-modal patient data, including MRI and treatment information, to forecast new and enlarging T2 (NET2) lesion masks at a future time point. Extensive experiments on a multi-centre dataset of 2131 patient 3D MRIs from randomized clinical trials for relapsing-remitting MS demonstrate that our generative model is able to accurately predict NET2 lesion masks for patients across six different treatments. Moreover, we demonstrate our model has the potential for real-world clinical applications through downstream tasks such as future lesion count and location estimation, binary lesion activity classification, and generating counterfactual future NET2 masks for several treatments with different efficacies. This work highlights the potential of causal, image-based generative models as powerful tools for advancing data-driven prognostics in MS.
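The conditional diffusion setup can be sketched at the level of its forward process and conditioning layout: the future lesion mask is progressively noised, and the denoiser sees the noisy mask alongside the baseline MRI and a treatment code. The cosine schedule and the concatenation-style conditioning below are common choices used here as assumptions; the paper's exact architecture and schedule may differ:

```python
import math
import random

def cosine_alpha_bar(t, T):
    """Cumulative signal fraction ᾱ_t under a cosine noise schedule."""
    f = lambda s: math.cos((s / T + 0.008) / 1.008 * math.pi / 2) ** 2
    return f(t) / f(0)

def noisy_mask(x0, t, T, rng):
    """Forward diffusion q(x_t | x_0): blend the future lesion mask x0
    (flattened to floats in [0, 1]) with Gaussian noise at step t."""
    ab = cosine_alpha_bar(t, T)
    return [math.sqrt(ab) * v + math.sqrt(1 - ab) * rng.gauss(0, 1)
            for v in x0]

def conditioned_input(x_t, baseline_mri, treatment_id, n_treatments):
    """What the denoiser would see at each step: the noisy future mask
    concatenated with the baseline image and a one-hot treatment code
    (hypothetical conditioning layout)."""
    one_hot = [1.0 if i == treatment_id else 0.0 for i in range(n_treatments)]
    return x_t + baseline_mri + one_hot
```

Sampling with different treatment codes from the same baseline scan is what yields the counterfactual future masks the abstract mentions.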

"AI tumor delineation for all breathing phases in early-stage NSCLC".

DelaO-Arevalo LR, Sijtsema NM, van Dijk LV, Langendijk JA, Wijsman R, van Ooijen PMA

pubmed logopapersAug 9 2025
Accurate delineation of the Gross Tumor Volume (GTV) and the Internal Target Volume (ITV) in early-stage lung tumors is crucial in Stereotactic Body Radiation Therapy (SBRT). Traditionally, the ITVs, which account for breathing motion, are generated by manually contouring GTVs across all breathing phases (BPs), a time-consuming process. This research aims to streamline this workflow by developing a deep learning algorithm to automatically delineate GTVs in all four-dimensional computed tomography (4D-CT) BPs for early-stage Non-Small Cell Lung Cancer (NSCLC) patients. A dataset of 214 early-stage NSCLC patients treated with SBRT was used. Each patient had a 4D-CT scan containing ten reconstructed BPs. The data were divided into a training set (75%) and a testing set (25%). Three models, SwinUNetR, Dynamic UNet (DynUNet), and a hybrid combining both (Swin+Dyn), were trained and evaluated using the Dice Similarity Coefficient (DSC), the 3 mm Surface Dice Similarity Coefficient (SDSC), and the 95th percentile Hausdorff distance (HD95). The best-performing model was used to delineate GTVs in all test-set BPs, and ITVs were created using two methods: all 10 phases, or the maximum inspiration/expiration phases only. The resulting ITVs were compared to the ground-truth ITVs. The Swin+Dyn model achieved the highest performance, with a test-set SDSC of 0.79 ± 0.14 for GTV 50%. For the ITVs, the SDSC was 0.79 ± 0.16 using all 10 BPs and 0.77 ± 0.14 using 2 BPs. At the voxel level, the Swin+Dyn network achieved a sensitivity of 0.75 ± 0.14 and precision of 0.84 ± 0.10 for the 2-breathing-phase ITV, and a sensitivity of 0.79 ± 0.12 and precision of 0.80 ± 0.11 for the 10-breathing-phase ITV. The Swin+Dyn algorithm, trained on the maximum-expiration CT scan, effectively delineated gross tumor volumes in all breathing phases, and the resulting ITVs showed good agreement with the ground truth (surface DSC = 0.79 ± 0.16 using all 10 BPs; 0.77 ± 0.14 using 2 BPs).
The proposed approach could reduce delineation time and inter-performer variability in the tumor contouring process for NSCLC SBRT workflows.
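The ITV-construction step described above reduces to a voxel-wise union of the per-phase GTV masks, whether ten phases or only the two extreme ones are used. A minimal sketch on binary masks:

```python
def build_itv(phase_masks):
    """ITV as the voxel-wise union of GTV masks across breathing
    phases: a voxel belongs to the ITV if the tumour occupies it in
    any phase. Masks are equal-shape nested lists of 0/1."""
    itv = phase_masks[0]
    for mask in phase_masks[1:]:
        itv = [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(itv, mask)]
    return itv

# toy 2-phase example: the tumour shifts between inhale and exhale
inhale = [[0, 1], [0, 0]]
exhale = [[0, 0], [1, 0]]
itv = build_itv([inhale, exhale])
```

The 10-phase and 2-phase variants compared in the study differ only in how many such masks enter the union; the slightly lower SDSC for the 2-phase ITV reflects motion not captured at the two extreme phases.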

SST-DUNet: Smart Swin Transformer and Dense UNet for automated preclinical fMRI skull stripping.

Soltanpour S, Utama R, Chang A, Nasseef MT, Madularu D, Kulkarni P, Ferris CF, Joslin C

pubmed logopapersAug 9 2025
Skull stripping is a common preprocessing step in Magnetic Resonance Imaging (MRI) pipelines and is often performed manually. Automating this process is challenging for preclinical data due to variations in brain geometry, resolution, and tissue contrast. Existing methods for MRI skull stripping often struggle with the low resolution and varying slice sizes found in preclinical functional MRI (fMRI) data. This study proposes a novel method that integrates a Dense UNet-based architecture with a feature extractor based on the Smart Swin Transformer (SST), called SST-DUNet. The Smart Shifted Window Multi-Head Self-Attention (SSW-MSA) module in SST replaces the mask-based module in the Swin Transformer (ST), enabling the learning of distinct channel-wise features while focusing on relevant dependencies within brain structures. This modification allows the model to better handle the complexities of fMRI skull stripping, such as low resolution and variable slice sizes. To address class imbalance in preclinical data, a combined loss function using Focal and Dice loss is applied. The model was trained on rat fMRI images and evaluated across three in-house datasets, achieving Dice similarity scores of 98.65%, 97.86%, and 98.04%. We compared our method with conventional and deep learning-based approaches, demonstrating its superiority over state-of-the-art methods. The fMRI results using SST-DUNet closely align with those from manual skull stripping for both seed-based and independent component analyses, indicating that SST-DUNet can effectively substitute manual brain extraction in rat fMRI analysis.
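The combined Focal + Dice loss used to counter class imbalance can be sketched for the binary (brain vs. background) case as follows; the equal 0.5 weighting and the focal hyperparameters are common defaults assumed here, not values from the paper:

```python
import math

def focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy voxels by (1 - p_t)^gamma,
    so the rare foreground class dominates the gradient."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1 - p
        a_t = alpha if t == 1 else 1 - alpha
        total += -a_t * (1 - p_t) ** gamma * math.log(max(p_t, 1e-8))
    return total / len(probs)

def dice_loss(probs, targets, eps=1.0):
    """Soft Dice loss on flattened probabilities; eps smooths the ratio."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1 - (2 * inter + eps) / (sum(probs) + sum(targets) + eps)

def combined_loss(probs, targets, w=0.5):
    """Weighted sum of Focal and Dice terms (weighting is an assumption)."""
    return w * focal_loss(probs, targets) + (1 - w) * dice_loss(probs, targets)
```

The Dice term optimizes region overlap directly while the focal term keeps per-voxel gradients informative, which is why the pairing is popular for imbalanced segmentation.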

Supporting intraoperative margin assessment using deep learning for automatic tumour segmentation in breast lumpectomy micro-PET-CT.

Maris L, Göker M, De Man K, Van den Broeck B, Van Hoecke S, Van de Vijver K, Vanhove C, Keereman V

pubmed logopapersAug 9 2025
Complete tumour removal is vital in curative breast cancer (BCa) surgery to prevent recurrence. Recently, [18F]FDG micro-PET-CT of lumpectomy specimens has shown promise for intraoperative margin assessment (IMA). To aid interpretation, we trained a 2D Residual U-Net to delineate invasive carcinoma of no special type in micro-PET-CT lumpectomy images. We collected 53 BCa lamella images from 19 patients with true histopathology-defined tumour segmentations. Group five-fold cross-validation yielded a Dice similarity coefficient of 0.71 ± 0.20 for segmentation. Afterwards, an ensemble model was generated to segment tumours and predict margin status. Comparing predicted and true histopathological margin status in a separate set of 31 micro-PET-CT lumpectomy images from 31 patients achieved an F1 score of 84%, closely matching the mean performance of seven physicians who manually interpreted the same images. This model represents an important step towards a decision-support system that enhances micro-PET-CT-based IMA in BCa, facilitating its clinical adoption.
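Deriving a margin call from a specimen segmentation typically amounts to checking whether predicted tumour approaches the specimen edge. The edge-distance rule and the 2 mm threshold below are illustrative assumptions; the study's ensemble derives margin status from its own segmentation output:

```python
def margin_status(tumour_mask, margin_mm=2.0, voxel_mm=1.0):
    """Flag a positive margin when any predicted tumour voxel lies
    within margin_mm of the specimen border (here approximated by the
    image border of a cropped specimen mask). Hypothetical decision
    rule for illustration."""
    rows, cols = len(tumour_mask), len(tumour_mask[0])
    band = int(margin_mm / voxel_mm)  # width of the edge band in voxels
    for r in range(rows):
        for c in range(cols):
            if tumour_mask[r][c]:
                edge_dist = min(r, c, rows - 1 - r, cols - 1 - c)
                if edge_dist < band:
                    return "positive"
    return "negative"
```

Comparing such predicted calls against histopathological margin status over a test set is what yields the F1 score reported in the abstract.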

Self-supervised disc and cup segmentation via non-local deformable convolution and adaptive transformer.

Zhao W, Wang Y

pubmed logopapersAug 9 2025
Optic disc and cup segmentation is a crucial subfield of computer vision, playing a pivotal role in automated pathological image analysis. It enables precise, efficient, and automated diagnosis of ocular conditions, significantly aiding clinicians in real-world medical applications. However, due to the scarcity of medical segmentation data and insufficient integration of global contextual information, segmentation accuracy remains suboptimal. This issue becomes particularly pronounced in optic disc and cup cases with complex anatomical structures and ambiguous boundaries. To address these limitations, this paper introduces a self-supervised training strategy integrated with a newly designed network architecture to improve segmentation accuracy. Specifically, we first propose a non-local dual deformable convolutional block, which aims to capture irregular image patterns (i.e., boundaries). Second, we modify the traditional vision transformer and design an adaptive K-Nearest Neighbors (KNN) transformation block to extract global semantic context from images. Finally, an initialization strategy based on self-supervised training is proposed to reduce the network's reliance on labeled data. Comprehensive experimental evaluations demonstrate the effectiveness of the proposed method, which outperforms previous networks and achieves state-of-the-art performance, with IoU scores of 0.9577 for the optic disc and 0.8399 for the optic cup on the REFUGE dataset.
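The IoU metric used to report the disc and cup results above is computed per structure from binary masks; a minimal sketch:

```python
def iou(pred, target):
    """Intersection-over-union for binary masks (nested 0/1 lists),
    computed separately for the disc mask and the cup mask."""
    inter = sum(p & t for rp, rt in zip(pred, target) for p, t in zip(rp, rt))
    union = sum(p | t for rp, rt in zip(pred, target) for p, t in zip(rp, rt))
    return inter / union if union else 1.0
```

Because the cup sits inside the disc and is far smaller, cup IoU is typically the lower of the two, consistent with the 0.9577 vs. 0.8399 gap reported here.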
