
HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

Shijie Wang, Yilun Zhang, Zeyu Lai, Dexing Kong

arXiv preprint · Jun 9, 2025
Multimodal large language models (MLLMs) have shown great potential in general domains but perform poorly in some specific domains due to a lack of domain-specific data, such as image-text or video-text data. In some specific domains, abundant image and text data are scattered across sources but lack standardized organization. In the field of medical ultrasound, there are ultrasonic diagnostic books, ultrasonic clinical guidelines, ultrasonic diagnostic reports, and so on. However, these materials are often stored as PDFs, images, and other formats that cannot be directly used for MLLM training. This paper proposes a novel image-text reasoning supervised fine-tuning data generation pipeline that creates domain-specific quadruplets (image, question, thinking trace, and answer) from such materials. A medical ultrasound dataset, ReMUD, is established, containing over 45,000 reasoning and non-reasoning supervised fine-tuning Question Answering (QA) and Visual Question Answering (VQA) examples. The ReMUD-7B model, fine-tuned on Qwen2.5-VL-7B-Instruct, outperforms general-domain MLLMs in the medical ultrasound field. To facilitate research, the ReMUD dataset, data generation codebase, and ReMUD-7B parameters will be released at https://github.com/ShiDaizi/ReMUD, addressing the data shortage issue in specific-domain MLLMs.
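The quadruplet format described in the abstract maps naturally onto a small record type. A minimal sketch, with field names and example values that are illustrative rather than the released ReMUD schema:

```python
from dataclasses import dataclass

@dataclass
class UltrasoundQuadruplet:
    """One supervised fine-tuning example: an image plus a QA pair with a reasoning trace."""
    image_path: str      # ultrasound frame extracted from a book, guideline, or report
    question: str        # clinical question about the image
    thinking_trace: str  # intermediate reasoning the model is trained to emit
    answer: str          # final answer

example = UltrasoundQuadruplet(
    image_path="thyroid_nodule_001.png",
    question="What suspicious features are visible in this nodule?",
    thinking_trace="The nodule is hypoechoic with irregular margins, suggesting...",
    answer="Hypoechoic solid nodule with irregular margins.",
)
```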

Addressing Limited Generalizability in Artificial Intelligence-Based Brain Aneurysm Detection for Computed Tomography Angiography: Development of an Externally Validated Artificial Intelligence Screening Platform.

Pettersson SD, Filo J, Liaw P, Skrzypkowska P, Klepinowski T, Szmuda T, Fodor TB, Ramirez-Velandia F, Zieliński P, Chang YM, Taussky P, Ogilvy CS

PubMed · Jun 9, 2025
Brain aneurysm detection models, both in the literature and in industry, continue to lack generalizability during external validation, limiting clinical adoption. This challenge is largely due to extensive exclusion criteria during training data selection. The authors developed the first model to achieve generalizability using novel methodological approaches. Computed tomography angiography (CTA) scans from 2004 to 2023 at the study institution were used for model training, including untreated unruptured intracranial aneurysms without extensive cerebrovascular disease. External validation used digital subtraction angiography-verified CTAs from an international center, while prospective validation occurred at the internal institution over 9 months. A public web platform was created for further model validation. A total of 2194 CTA scans were used for this study. One thousand five hundred eighty-seven patients and 1920 aneurysms with a mean size of 5.3 ± 3.7 mm were included in the training cohort. The mean age of the patients was 69.7 ± 14.9 years, and 1203 (75.8%) were female. The model achieved a training Dice score of 0.88 and a validation Dice score of 0.76. Prospective internal validation on 304 scans yielded a lesion-level (LL) sensitivity of 82.5% (95% CI: 75.5-87.9) and specificity of 89.6% (95% CI: 84.5-93.2). External validation on 303 scans demonstrated comparable LL sensitivity and specificity of 83.5% (95% CI: 75.1-89.4) and 92.9% (95% CI: 88.8-95.6), respectively. Radiologist LL sensitivity at the external center was 84.5% (95% CI: 76.2-90.2), and 87.5% of the aneurysms missed by radiologists were detected by the model. The authors developed the first publicly testable artificial intelligence model for aneurysm detection on CTA scans, demonstrating generalizability and state-of-the-art performance in external validation. The model addresses key limitations of previous efforts and enables broader validation through a web-based platform.

SMART MRS: A Simulated MEGA-PRESS ARTifacts toolbox for GABA-edited MRS.

Bugler H, Shamaei A, Souza R, Harris AD

PubMed · Jun 8, 2025
To create a Python-based toolbox to simulate commonly occurring artifacts in single-voxel gamma-aminobutyric acid (GABA)-edited MRS data. The toolbox was designed to maximize user flexibility and contains artifact, applied, input/output (I/O), and support functions. The artifact functions can produce spurious echoes, eddy currents, nuisance peaks, line broadening, baseline contamination, linear frequency drifts, and frequency and phase shift artifacts. Applied functions combine or apply specific parameter values to produce recognizable effects such as lipid peak and motion contamination. I/O and support functions provide additional functionality to accommodate different kinds of input data (MATLAB FID-A .mat files, NIfTI-MRS files), which vary by domain (time vs. frequency), MRS data type (e.g., edited vs. non-edited), and scale. To highlight the utility of the toolbox, an experiment is shown in which a frequency and phase correction machine learning model is trained on corrupted simulated data and validated on in vivo data. Data simulated with the toolbox are complementary for research applications, as demonstrated by training a frequency and phase correction deep learning model that is applied to in vivo data containing artifacts. Visual assessment also confirms that the simulated artifacts resemble those found in in vivo data. Our easy-to-install Python artifact simulation toolbox, SMART_MRS, is useful for enhancing the diversity and quality of existing simulated edited-MRS data and is complementary to existing MRS simulation software.
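To make the artifact functions concrete, the sketch below applies a frequency shift to a toy time-domain FID. It mirrors the underlying physics (a spectral shift is a complex phase ramp in time) but does not reproduce the actual SMART_MRS API, whose function names and signatures may differ:

```python
import numpy as np

def apply_frequency_shift(fid, shift_hz, dwell_time):
    """Shift a time-domain FID by shift_hz (illustrative; not the SMART_MRS API).

    A frequency offset in the spectral domain corresponds to a complex
    phase ramp in the time domain: s'(t) = s(t) * exp(2j*pi*shift*t).
    """
    t = np.arange(fid.size) * dwell_time  # time axis in seconds
    return fid * np.exp(2j * np.pi * shift_hz * t)

# Toy FID: a single decaying resonance sampled at 2 kHz
dwell = 1 / 2000.0
t = np.arange(2048) * dwell
fid = np.exp(-t / 0.1).astype(complex)
shifted = apply_frequency_shift(fid, shift_hz=5.0, dwell_time=dwell)
```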

RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints

Tan-Hanh Pham, Chris Ngo

arXiv preprint · Jun 7, 2025
The growing integration of vision-language models (VLMs) in medical applications offers promising support for diagnostic reasoning. However, current medical VLMs often face limitations in generalization, transparency, and computational efficiency, barriers that hinder deployment in real-world, resource-constrained settings. To address these challenges, we propose a Reasoning-Aware Reinforcement Learning framework, RARL, that enhances the reasoning capabilities of medical VLMs while remaining efficient and adaptable to low-resource environments. Our approach fine-tunes a lightweight base model, Qwen2-VL-2B-Instruct, using Low-Rank Adaptation (LoRA) and custom reward functions that jointly consider diagnostic accuracy and reasoning quality. Training is performed on a single NVIDIA A100-PCIE-40GB GPU, demonstrating the feasibility of deploying such models in constrained environments. We evaluate the model using an LLM-as-judge framework that scores both correctness and explanation quality. Experimental results show that RARL significantly improves VLM performance in medical image analysis and clinical reasoning, outperforming supervised fine-tuning on reasoning-focused tasks by approximately 7.78%, while requiring fewer computational resources. Additionally, we demonstrate the generalization capabilities of our approach on unseen datasets, achieving around 27% improved performance compared to supervised fine-tuning and about 4% over traditional RL fine-tuning. Our experiments also illustrate that diversity prompting during training and reasoning prompting during inference are crucial for enhancing VLM performance. Our findings highlight the potential of reasoning-guided learning and reasoning prompting to steer medical VLMs toward more transparent, accurate, and resource-efficient clinical decision-making. Code and data are publicly available.
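The reward design is the central idea: accuracy and reasoning quality are scored jointly. A hedged sketch of what such a composite reward could look like; the weights, the exact-match accuracy term, and the keyword heuristic standing in for the paper's LLM-as-judge scoring are all assumptions:

```python
def composite_reward(pred_answer: str, gold_answer: str, reasoning: str,
                     w_acc: float = 0.7, w_reason: float = 0.3) -> float:
    """Toy reward jointly scoring diagnostic accuracy and reasoning quality.

    The heuristics below are placeholders; RARL's actual reward functions
    and an LLM-as-judge scorer would be more sophisticated.
    """
    accuracy = float(pred_answer.strip().lower() == gold_answer.strip().lower())
    # Crude proxy for reasoning quality: a non-trivial, stepwise explanation.
    stepwise = any(tok in reasoning.lower() for tok in ("because", "therefore", "first"))
    reasoning_score = float(stepwise and len(reasoning.split()) >= 20)
    return w_acc * accuracy + w_reason * reasoning_score
```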

De-identification of medical imaging data: a comprehensive tool for ensuring patient privacy.

Rempe M, Heine L, Seibold C, Hörst F, Kleesiek J

PubMed · Jun 7, 2025
Medical imaging data employed in research frequently comprises sensitive Protected Health Information (PHI) and Personally Identifiable Information (PII), which is subject to rigorous legal frameworks such as the General Data Protection Regulation (GDPR) or the Health Insurance Portability and Accountability Act (HIPAA). Consequently, these types of data must be de-identified prior to utilization, which presents a significant challenge for many researchers. Given the vast array of medical imaging data, it is necessary to employ a variety of de-identification techniques. To facilitate the de-identification process for medical imaging data, we have developed an open-source tool that can be used to de-identify Digital Imaging and Communications in Medicine (DICOM) magnetic resonance images, computed tomography images, whole slide images, and magnetic resonance TWIX raw data. Furthermore, the implementation of a neural network enables the removal of text within the images. The proposed tool reaches results comparable to current state-of-the-art algorithms at reduced computational time (up to 265× faster). The tool also manages to fully de-identify image data of various types, such as Neuroimaging Informatics Technology Initiative (NIfTI) or Whole Slide Image (WSI) DICOMs. The proposed tool automates an elaborate de-identification pipeline for multiple types of inputs, reducing the need for additional tools used for de-identification of imaging data.
Question: How can researchers effectively de-identify sensitive medical imaging data while complying with legal frameworks to protect patient health information?
Findings: We developed an open-source tool that automates the de-identification of various medical imaging formats, enhancing the efficiency of de-identification processes.
Clinical relevance: This tool addresses the critical need for robust and user-friendly de-identification solutions in medical imaging, facilitating data exchange in research while safeguarding patient privacy.
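The metadata side of DICOM de-identification is straightforward to sketch with pydicom. The snippet below blanks a small, illustrative subset of PHI attributes; it is not the authors' tool, which additionally handles NIfTI, WSI, TWIX raw data, and neural-network removal of burned-in text:

```python
import pydicom

# Illustrative subset; a compliant de-identification profile covers far more attributes.
PHI_ATTRIBUTES = ["PatientName", "PatientID", "PatientBirthDate",
                  "PatientAddress", "ReferringPhysicianName"]

def deidentify_dicom(path_in: str, path_out: str) -> None:
    """Blank common PHI attributes in a single DICOM file."""
    ds = pydicom.dcmread(path_in)
    for attr in PHI_ATTRIBUTES:
        if attr in ds:
            setattr(ds, attr, "")   # replace the value with an empty string
    ds.remove_private_tags()        # drop vendor-specific private elements
    ds.save_as(path_out)
```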

Hypothalamus and intracranial volume segmentation at the group level by use of a Gradio-CNN framework.

Vernikouskaya I, Rasche V, Kassubek J, Müller HP

PubMed · Jun 6, 2025
This study aimed to develop and evaluate a graphical user interface (GUI) for the automated segmentation of the hypothalamus and intracranial volume (ICV) in brain MRI scans. The interface was designed to facilitate efficient and accurate segmentation for research applications, with a focus on accessibility and ease of use for end-users. We developed a web-based GUI using the Gradio library, integrating deep learning-based segmentation models trained on annotated brain MRI scans. The model utilizes a U-Net architecture to delineate the hypothalamus and ICV. The GUI allows users to upload high-resolution MRI scans, visualize the segmentation results, calculate hypothalamic volume and ICV, and manually correct individual segmentation results. To ensure widespread accessibility, we deployed the interface using ngrok, allowing users to access the tool via a shared link. As an example of the universality of the approach, the tool was applied to a group of 90 patients with Parkinson's disease (PD) and 39 controls. The GUI demonstrated high usability and efficiency in segmenting the hypothalamus and the ICV, with no significant difference in normalized hypothalamic volume observed between PD patients and controls, consistent with previously published findings. The average processing time per patient volume was 18 s for the hypothalamus and 44 s for the ICV segmentation on a 6 GB NVIDIA GeForce GTX 1060 GPU. The ngrok-based deployment allowed for seamless access across different devices and operating systems, with an average connection time of less than 5 s. The developed GUI provides a powerful and accessible tool for applications in neuroimaging. The combination of the intuitive interface, accurate deep learning-based segmentation, and easy deployment via ngrok addresses the need for user-friendly tools in brain MRI analysis. This approach has the potential to streamline workflows in neuroimaging research.
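Wiring a segmentation function into a shareable Gradio app takes only a few lines. The sketch below uses a placeholder inference function (the paper's trained U-Net and NIfTI handling are not reproduced here), and `share=True` produces a public tunnel link similar in spirit to the ngrok deployment described:

```python
import gradio as gr

def segment_scan(nifti_file):
    """Placeholder for the U-Net inference step described in the abstract."""
    # A real implementation would load the NIfTI volume, run the trained
    # model, and return an overlay image plus the computed volumes.
    hypothalamus_ml, icv_ml = 0.0, 0.0  # hypothetical outputs
    return f"Hypothalamic volume: {hypothalamus_ml:.2f} mL; ICV: {icv_ml:.0f} mL"

demo = gr.Interface(
    fn=segment_scan,
    inputs=gr.File(label="T1-weighted MRI (NIfTI)"),
    outputs=gr.Textbox(label="Segmentation summary"),
    title="Hypothalamus / ICV segmentation",
)
demo.launch(share=True)  # public link, comparable to the ngrok setup
```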

A Fully Automatic Pipeline of Identification, Segmentation, and Subtyping of Aortic Dissection from CT Angiography.

Zhuang C, Wu Y, Qi Q, Zhao S, Sun Y, Hou J, Qian W, Yang B, Qi S

PubMed · Jun 6, 2025
Aortic dissection (AD) is a rare condition with a high mortality rate, necessitating accurate and rapid diagnosis. This study develops an automated deep learning pipeline for identifying, segmenting, and Stanford-subtyping AD using computed tomography angiography (CTA) images. The pipeline consists of four interconnected modules: aorta segmentation, AD identification, true lumen (TL) and false lumen (FL) segmentation, and Stanford subtyping. In the aorta segmentation module, a 3D full-resolution nnU-Net is trained. In the AD identification module, the segmented aorta's boundary is extracted using morphological operations and projected from multiple views; AD identification is then performed on the multi-view projection data. For AD cases, a 3D nnU-Net is further trained for TL/FL segmentation based on the segmented aorta. Finally, a network is trained for Stanford subtyping using multi-view maximum density projections of the segmented TL/FL. A total of 386 CTA scans were collected for training, validation, and testing of the pipeline. For AD identification, the method achieved an accuracy of 0.979. The TL/FL segmentation for Type-A and Type-B AD achieved average Dice coefficients of 0.968 for TL and 0.971 for FL. For Stanford subtyping, the multi-view method achieved an accuracy of 0.990. The automated pipeline enables rapid and accurate identification, segmentation, and Stanford subtyping of AD from CTA images, potentially accelerating diagnosis and treatment. The segmented aorta and TL/FL can also serve as references for physicians. The code, models, and pipeline are publicly available at https://github.com/zhuangCJ/A-pipeline-of-AD.git.
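The multi-view projection step can be approximated with plain NumPy maximum intensity projections along each axis; a rough stand-in for the pipeline's projection of the segmented aorta, with view choices that are illustrative:

```python
import numpy as np

def multiview_mips(volume: np.ndarray) -> dict:
    """Maximum intensity projections of a 3D volume along its three axes."""
    return {
        "axial": volume.max(axis=0),
        "coronal": volume.max(axis=1),
        "sagittal": volume.max(axis=2),
    }

vol = np.random.rand(64, 128, 128)  # toy CTA-like volume
views = multiview_mips(vol)
print({name: proj.shape for name, proj in views.items()})
```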

Query Nearby: Offset-Adjusted Mask2Former enhances small-organ segmentation

Xin Zhang, Dongdong Meng, Sheng Li

arXiv preprint · Jun 6, 2025
Medical segmentation plays an important role in clinical applications such as radiation therapy and surgical guidance, but acquiring clinically acceptable results is difficult. In recent years, progress has been made with transformer-like models, such as those combining attention mechanisms with CNNs. In particular, transformer-based segmentation models can extract global information more effectively, compensating for the drawbacks of CNN modules that focus on local features. However, utilizing transformer architectures is not easy, because training transformer-based models can be resource-demanding. Moreover, due to the distinct characteristics of the medical field, especially for mid-sized and small organs with compact regions, their results often seem unsatisfactory. For example, using ViT to segment medical images directly gives a DSC of less than 50%, far lower than the clinically acceptable score of 80%. In this paper, we use Mask2Former with deformable attention to reduce computation and propose offset adjustment strategies that encourage sampling points to fall within the same organ during attention weight computation, thereby better integrating compact foreground information. Additionally, we utilize the 4th feature map in Mask2Former to provide a coarse location of organs and employ an FCN-based auxiliary head with Dice loss to help Mask2Former train more quickly. We show that our model achieves state-of-the-art (SOTA) performance on the HaNSeg and SegRap2023 datasets, especially on mid-sized and small organs. Our code is available at https://github.com/earis/Offsetadjustment_Background-location_Decoder_Mask2former.
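One way to read the offset adjustment idea is as a penalty on deformable-attention sampling points that land outside the organ foreground; the sketch below is a guess at that spirit, not the paper's exact strategy:

```python
import torch

def offset_outside_penalty(ref_points, offsets, fg_mask):
    """Fraction of deformable-attention sampling points outside a foreground mask.

    ref_points: (N, 2) reference locations, normalized to [0, 1] x [0, 1]
    offsets:    (N, K, 2) predicted offsets for K sampling points each
    fg_mask:    (H, W) binary organ mask
    """
    H, W = fg_mask.shape
    pts = (ref_points.unsqueeze(1) + offsets).clamp(0, 1)  # (N, K, 2)
    ix = (pts[..., 0] * (W - 1)).long()                    # column indices
    iy = (pts[..., 1] * (H - 1)).long()                    # row indices
    inside = fg_mask[iy, ix].float()                       # (N, K) membership
    return (1.0 - inside).mean()

mask = torch.zeros(64, 64)
mask[16:48, 16:48] = 1.0                 # toy square "organ"
refs = torch.rand(10, 2)                 # 10 query reference points
offs = 0.1 * torch.randn(10, 4, 2)       # 4 sampling offsets per query
print(offset_outside_penalty(refs, offs, mask))
```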

GNNs surpass transformers in tumor medical image segmentation.

Xiao H, Yang G, Li Z, Yi C

PubMed · Jun 5, 2025
To assess the suitability of Transformer-based architectures for medical image segmentation and investigate the potential advantages of Graph Neural Networks (GNNs) in this domain. We analyze the limitations of the Transformer, which models medical images as sequences of image patches, limiting its flexibility in capturing complex and irregular tumor structures. To address this, we propose U-GNN, a pure GNN-based U-shaped architecture designed for medical image segmentation. U-GNN retains the U-Net-inspired inductive bias while leveraging GNNs' topological modeling capabilities. The architecture consists of Vision GNN blocks stacked into a U-shaped structure. Additionally, we introduce the concept of multi-order similarity and propose a zero-computation-cost approach to incorporating higher-order similarity in graph construction. Each Vision GNN block segments the image into patch nodes, constructs multi-order similarity graphs, and aggregates node features via multi-order node information aggregation. Experimental evaluations on multi-organ and cardiac segmentation datasets demonstrate that U-GNN significantly outperforms existing CNN- and Transformer-based models, achieving a 6% improvement in Dice Similarity Coefficient (DSC) and an 18% reduction in Hausdorff Distance (HD) compared to state-of-the-art methods. The source code will be released upon paper acceptance.
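The "zero-computation-cost" higher-order similarity can be pictured as reusing the first-order k-NN graph: second-order neighbors fall out of a product of the adjacency with itself, with no extra feature-distance computations. A hedged NumPy sketch (the paper's actual construction may differ):

```python
import numpy as np

def knn_adjacency(x: np.ndarray, k: int) -> np.ndarray:
    """Symmetric boolean k-NN adjacency over patch-node features x of shape (N, D)."""
    d = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    nn = np.argsort(d, axis=1)[:, 1:k + 1]  # skip self at index 0
    A = np.zeros((len(x), len(x)), dtype=bool)
    A[np.repeat(np.arange(len(x)), k), nn.ravel()] = True
    return A | A.T

x = np.random.rand(16, 8)                    # 16 patch nodes, 8-dim features
A1 = knn_adjacency(x, k=3)                   # first-order similarity graph
reach2 = (A1.astype(int) @ A1.astype(int)) > 0
A2 = reach2 & ~A1 & ~np.eye(len(x), dtype=bool)  # strictly second-order neighbors
```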

SAM-aware Test-time Adaptation for Universal Medical Image Segmentation

Jianghao Wu, Yicheng Wu, Yutong Xie, Wenjia Bai, You Zhang, Feilong Tang, Yulong Li, Yasmeen George, Imran Razzak

arXiv preprint · Jun 5, 2025
Universal medical image segmentation using the Segment Anything Model (SAM) remains challenging due to its limited adaptability to medical domains. Existing adaptations, such as MedSAM, enhance SAM's performance in medical imaging but at the cost of reduced generalization to unseen data. Therefore, in this paper, we propose SAM-aware Test-Time Adaptation (SAM-TTA), a fundamentally different pipeline that preserves the generalization of SAM while improving its segmentation performance in medical imaging via a test-time framework. SAM-TTA tackles two key challenges: (1) input-level discrepancies caused by differences in image acquisition between natural and medical images and (2) semantic-level discrepancies due to fundamental differences in object definition between natural and medical domains (e.g., clear boundaries vs. ambiguous structures). Specifically, our SAM-TTA framework comprises (1) Self-adaptive Bezier Curve-based Transformation (SBCT), which adaptively converts single-channel medical images into three-channel SAM-compatible inputs while maintaining structural integrity, to mitigate the input gap between medical and natural images, and (2) Dual-scale Uncertainty-driven Mean Teacher adaptation (DUMT), which employs consistency learning to align SAM's internal representations to medical semantics, enabling efficient adaptation without auxiliary supervision or expensive retraining. Extensive experiments on five public datasets demonstrate that our SAM-TTA outperforms existing TTA approaches and even surpasses fully fine-tuned models such as MedSAM in certain scenarios, establishing a new paradigm for universal medical image segmentation. Code can be found at https://github.com/JianghaoWu/SAM-TTA.
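The Bezier-based channel conversion can be sketched as a monotonic intensity lookup built from a cubic Bezier curve, with remapped copies stacked into three channels. The control points, the channel layout, and the fixed (non-adaptive) curve are all assumptions here, since SBCT adapts the transformation per image:

```python
import numpy as np
from scipy.special import comb

def bezier_curve(ctrl, n=256):
    """Sample a Bernstein-polynomial Bezier curve through the control points."""
    ctrl = np.asarray(ctrl, dtype=float)
    deg = len(ctrl) - 1
    t = np.linspace(0.0, 1.0, n)
    bern = np.stack([comb(deg, i) * t**i * (1 - t)**(deg - i)
                     for i in range(deg + 1)], axis=1)
    return bern @ ctrl  # (n, 2) samples of (x(t), y(t))

def to_three_channels(img, ctrl=((0, 0), (0.3, 0.5), (0.7, 0.5), (1, 1))):
    """Map normalized intensities through a Bezier lookup and stack 3 channels."""
    curve = bezier_curve(ctrl)               # x(t) is monotonic for these points
    grid = np.linspace(0.0, 1.0, 256)
    lut = np.interp(grid, curve[:, 0], curve[:, 1])
    mapped = np.interp(img, grid, lut)
    return np.stack([img, mapped, mapped], axis=-1)  # illustrative channel choice

slice_2d = np.random.rand(64, 64)  # normalized single-channel slice
rgb = to_three_channels(slice_2d)  # (64, 64, 3) SAM-compatible input
```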