Sort by:
Page 1 of 11102 results
Next

MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation

Peiting Tian, Xi Chen, Haixia Bi, Fan Li

arxiv logopreprintJun 30 2025
Medical image segmentation plays a crucial role in clinical diagnosis and treatment planning, where accurate boundary delineation is essential for precise lesion localization, organ identification, and quantitative assessment. In recent years, deep learning-based methods have significantly advanced segmentation accuracy. However, two major challenges remain. First, the performance of these methods heavily relies on large-scale annotated datasets, which are often difficult to obtain in medical scenarios due to privacy concerns and high annotation costs. Second, clinically challenging scenarios, such as low contrast in certain imaging modalities and blurry lesion boundaries caused by malignancy, still pose obstacles to precise segmentation. To address these challenges, we propose MedSAM-CA, an architecture-level fine-tuning approach that mitigates reliance on extensive manual annotations by adapting the pretrained foundation model, Medical Segment Anything (MedSAM). MedSAM-CA introduces two key components: the Convolutional Attention-Enhanced Boundary Refinement Network (CBR-Net) and the Attention-Enhanced Feature Fusion Block (Atte-FFB). CBR-Net operates in parallel with the MedSAM encoder to recover boundary information potentially overlooked by long-range attention mechanisms, leveraging hierarchical convolutional processing. Atte-FFB, embedded in the MedSAM decoder, fuses multi-level fine-grained features from skip connections in CBR-Net with global representations upsampled within the decoder to enhance boundary delineation accuracy. Experiments on publicly available datasets covering dermoscopy, CT, and MRI imaging modalities validate the effectiveness of MedSAM-CA. On dermoscopy dataset, MedSAM-CA achieves 94.43% Dice with only 2% of full training data, reaching 97.25% of full-data training performance, demonstrating strong effectiveness in low-resource clinical settings.

GUSL: A Novel and Efficient Machine Learning Model for Prostate Segmentation on MRI

Jiaxin Yang, Vasileios Magoulianitis, Catherine Aurelia Christie Alexander, Jintang Xue, Masatomo Kaneko, Giovanni Cacciamani, Andre Abreu, Vinay Duddalwar, C. -C. Jay Kuo, Inderbir S. Gill, Chrysostomos Nikias

arxiv logopreprintJun 30 2025
Prostate and zonal segmentation is a crucial step for clinical diagnosis of prostate cancer (PCa). Computer-aided diagnosis tools for prostate segmentation are based on the deep learning (DL) paradigm. However, deep neural networks are perceived as "black-box" solutions by physicians, thus making them less practical for deployment in the clinical setting. In this paper, we introduce a feed-forward machine learning model, named Green U-shaped Learning (GUSL), suitable for medical image segmentation without backpropagation. GUSL introduces a multi-layer regression scheme for coarse-to-fine segmentation. Its feature extraction is based on a linear model, which enables seamless interpretability during feature extraction. Also, GUSL introduces a mechanism for attention on the prostate boundaries, which is an error-prone region, by employing regression to refine the predictions through residue correction. In addition, a two-step pipeline approach is used to mitigate the class imbalance, an issue inherent in medical imaging problems. After conducting experiments on two publicly available datasets and one private dataset, in both prostate gland and zonal segmentation tasks, GUSL achieves state-of-the-art performance among other DL-based models. Notably, GUSL features a very energy-efficient pipeline, since it has a model size several times smaller and less complexity than the rest of the solutions. In all datasets, GUSL achieved a Dice Similarity Coefficient (DSC) performance greater than $0.9$ for gland segmentation. Considering also its lightweight model size and transparency in feature extraction, it offers a competitive and practical package for medical imaging applications.

Scout-Dose-TCM: Direct and Prospective Scout-Based Estimation of Personalized Organ Doses from Tube Current Modulated CT Exams

Maria Jose Medrano, Sen Wang, Liyan Sun, Abdullah-Al-Zubaer Imran, Jennie Cao, Grant Stevens, Justin Ruey Tse, Adam S. Wang

arxiv logopreprintJun 30 2025
This study proposes Scout-Dose-TCM for direct, prospective estimation of organ-level doses under tube current modulation (TCM) and compares its performance to two established methods. We analyzed contrast-enhanced chest-abdomen-pelvis CT scans from 130 adults (120 kVp, TCM). Reference doses for six organs (lungs, kidneys, liver, pancreas, bladder, spleen) were calculated using MC-GPU and TotalSegmentator. Based on these, we trained Scout-Dose-TCM, a deep learning model that predicts organ doses corresponding to discrete cosine transform (DCT) basis functions, enabling real-time estimates for any TCM profile. The model combines a feature learning module that extracts contextual information from lateral and frontal scouts and scan range with a dose learning module that output DCT-based dose estimates. A customized loss function incorporated the DCT formulation during training. For comparison, we implemented size-specific dose estimation per AAPM TG 204 (Global CTDIvol) and its organ-level TCM-adapted version (Organ CTDIvol). A 5-fold cross-validation assessed generalizability by comparing mean absolute percentage dose errors and r-squared correlations with benchmark doses. Average absolute percentage errors were 13% (Global CTDIvol), 9% (Organ CTDIvol), and 7% (Scout-Dose-TCM), with bladder showing the largest discrepancies (15%, 13%, and 9%). Statistical tests confirmed Scout-Dose-TCM significantly reduced errors vs. Global CTDIvol across most organs and improved over Organ CTDIvol for the liver, bladder, and pancreas. It also achieved higher r-squared values, indicating stronger agreement with Monte Carlo benchmarks. Scout-Dose-TCM outperformed Global CTDIvol and was comparable to or better than Organ CTDIvol, without requiring organ segmentations at inference, demonstrating its promise as a tool for prospective organ-level dose estimation in CT.

Artificial Intelligence-assisted Pixel-level Lung (APL) Scoring for Fast and Accurate Quantification in Ultra-short Echo-time MRI

Bowen Xin, Rohan Hickey, Tamara Blake, Jin Jin, Claire E Wainwright, Thomas Benkert, Alto Stemmer, Peter Sly, David Coman, Jason Dowling

arxiv logopreprintJun 30 2025
Lung magnetic resonance imaging (MRI) with ultrashort echo-time (UTE) represents a recent breakthrough in lung structure imaging, providing image resolution and quality comparable to computed tomography (CT). Due to the absence of ionising radiation, MRI is often preferred over CT in paediatric diseases such as cystic fibrosis (CF), one of the most common genetic disorders in Caucasians. To assess structural lung damage in CF imaging, CT scoring systems provide valuable quantitative insights for disease diagnosis and progression. However, few quantitative scoring systems are available in structural lung MRI (e.g., UTE-MRI). To provide fast and accurate quantification in lung MRI, we investigated the feasibility of novel Artificial intelligence-assisted Pixel-level Lung (APL) scoring for CF. APL scoring consists of 5 stages, including 1) image loading, 2) AI lung segmentation, 3) lung-bounded slice sampling, 4) pixel-level annotation, and 5) quantification and reporting. The results shows that our APL scoring took 8.2 minutes per subject, which was more than twice as fast as the previous grid-level scoring. Additionally, our pixel-level scoring was statistically more accurate (p=0.021), while strongly correlating with grid-level scoring (R=0.973, p=5.85e-9). This tool has great potential to streamline the workflow of UTE lung MRI in clinical settings, and be extended to other structural lung MRI sequences (e.g., BLADE MRI), and for other lung diseases (e.g., bronchopulmonary dysplasia).

MedRegion-CT: Region-Focused Multimodal LLM for Comprehensive 3D CT Report Generation

Sunggu Kyung, Jinyoung Seo, Hyunseok Lim, Dongyeong Kim, Hyungbin Park, Jimin Sung, Jihyun Kim, Wooyoung Jo, Yoojin Nam, Namkug Kim

arxiv logopreprintJun 29 2025
The recent release of RadGenome-Chest CT has significantly advanced CT-based report generation. However, existing methods primarily focus on global features, making it challenging to capture region-specific details, which may cause certain abnormalities to go unnoticed. To address this, we propose MedRegion-CT, a region-focused Multi-Modal Large Language Model (MLLM) framework, featuring three key innovations. First, we introduce Region Representative ($R^2$) Token Pooling, which utilizes a 2D-wise pretrained vision model to efficiently extract 3D CT features. This approach generates global tokens representing overall slice features and region tokens highlighting target areas, enabling the MLLM to process comprehensive information effectively. Second, a universal segmentation model generates pseudo-masks, which are then processed by a mask encoder to extract region-centric features. This allows the MLLM to focus on clinically relevant regions, using six predefined region masks. Third, we leverage segmentation results to extract patient-specific attributions, including organ size, diameter, and locations. These are converted into text prompts, enriching the MLLM's understanding of patient-specific contexts. To ensure rigorous evaluation, we conducted benchmark experiments on report generation using the RadGenome-Chest CT. MedRegion-CT achieved state-of-the-art performance, outperforming existing methods in natural language generation quality and clinical relevance while maintaining interpretability. The code for our framework is publicly available.

CA-Diff: Collaborative Anatomy Diffusion for Brain Tissue Segmentation

Qilong Xing, Zikai Song, Yuteng Ye, Yuke Chen, Youjia Zhang, Na Feng, Junqing Yu, Wei Yang

arxiv logopreprintJun 28 2025
Segmentation of brain structures from MRI is crucial for evaluating brain morphology, yet existing CNN and transformer-based methods struggle to delineate complex structures accurately. While current diffusion models have shown promise in image segmentation, they are inadequate when applied directly to brain MRI due to neglecting anatomical information. To address this, we propose Collaborative Anatomy Diffusion (CA-Diff), a framework integrating spatial anatomical features to enhance segmentation accuracy of the diffusion model. Specifically, we introduce distance field as an auxiliary anatomical condition to provide global spatial context, alongside a collaborative diffusion process to model its joint distribution with anatomical structures, enabling effective utilization of anatomical features for segmentation. Furthermore, we introduce a consistency loss to refine relationships between the distance field and anatomical structures and design a time adapted channel attention module to enhance the U-Net feature fusion procedure. Extensive experiments show that CA-Diff outperforms state-of-the-art (SOTA) methods.

Inpainting is All You Need: A Diffusion-based Augmentation Method for Semi-supervised Medical Image Segmentation

Xinrong Hu, Yiyu Shi

arxiv logopreprintJun 28 2025
Collecting pixel-level labels for medical datasets can be a laborious and expensive process, and enhancing segmentation performance with a scarcity of labeled data is a crucial challenge. This work introduces AugPaint, a data augmentation framework that utilizes inpainting to generate image-label pairs from limited labeled data. AugPaint leverages latent diffusion models, known for their ability to generate high-quality in-domain images with low overhead, and adapts the sampling process for the inpainting task without need for retraining. Specifically, given a pair of image and label mask, we crop the area labeled with the foreground and condition on it during reversed denoising process for every noise level. Masked background area would gradually be filled in, and all generated images are paired with the label mask. This approach ensures the accuracy of match between synthetic images and label masks, setting it apart from existing dataset generation methods. The generated images serve as valuable supervision for training downstream segmentation models, effectively addressing the challenge of limited annotations. We conducted extensive evaluations of our data augmentation method on four public medical image segmentation datasets, including CT, MRI, and skin imaging. Results across all datasets demonstrate that AugPaint outperforms state-of-the-art label-efficient methodologies, significantly improving segmentation performance.

High Resolution Isotropic 3D Cine imaging with Automated Segmentation using Concatenated 2D Real-time Imaging and Deep Learning

Mark Wrobel, Michele Pascale, Tina Yao, Ruaraidh Campbell, Elena Milano, Michael Quail, Jennifer Steeden, Vivek Muthurangu

arxiv logopreprintJun 27 2025
Background: Conventional cardiovascular magnetic resonance (CMR) in paediatric and congenital heart disease uses 2D, breath-hold, balanced steady state free precession (bSSFP) cine imaging for assessment of function and cardiac-gated, respiratory-navigated, static 3D bSSFP whole-heart imaging for anatomical assessment. Our aim is to concatenate a stack 2D free-breathing real-time cines and use Deep Learning (DL) to create an isotropic a fully segmented 3D cine dataset from these images. Methods: Four DL models were trained on open-source data that performed: a) Interslice contrast correction; b) Interslice respiratory motion correction; c) Super-resolution (slice direction); and d) Segmentation of right and left atria and ventricles (RA, LA, RV, and LV), thoracic aorta (Ao) and pulmonary arteries (PA). In 10 patients undergoing routine cardiovascular examination, our method was validated on prospectively acquired sagittal stacks of real-time cine images. Quantitative metrics (ventricular volumes and vessel diameters) and image quality of the 3D cines were compared to conventional breath hold cine and whole heart imaging. Results: All real-time data were successfully transformed into 3D cines with a total post-processing time of <1 min in all cases. There were no significant biases in any LV or RV metrics with reasonable limits of agreement and correlation. There is also reasonable agreement for all vessel diameters, although there was a small but significant overestimation of RPA diameter. Conclusion: We have demonstrated the potential of creating a 3D-cine data from concatenated 2D real-time cine images using a series of DL models. Our method has short acquisition and reconstruction times with fully segmented data being available within 2 minutes. The good agreement with conventional imaging suggests that our method could help to significantly speed up CMR in clinical practice.

Noise-Inspired Diffusion Model for Generalizable Low-Dose CT Reconstruction

Qi Gao, Zhihao Chen, Dong Zeng, Junping Zhang, Jianhua Ma, Hongming Shan

arxiv logopreprintJun 27 2025
The generalization of deep learning-based low-dose computed tomography (CT) reconstruction models to doses unseen in the training data is important and remains challenging. Previous efforts heavily rely on paired data to improve the generalization performance and robustness through collecting either diverse CT data for re-training or a few test data for fine-tuning. Recently, diffusion models have shown promising and generalizable performance in low-dose CT (LDCT) reconstruction, however, they may produce unrealistic structures due to the CT image noise deviating from Gaussian distribution and imprecise prior information from the guidance of noisy LDCT images. In this paper, we propose a noise-inspired diffusion model for generalizable LDCT reconstruction, termed NEED, which tailors diffusion models for noise characteristics of each domain. First, we propose a novel shifted Poisson diffusion model to denoise projection data, which aligns the diffusion process with the noise model in pre-log LDCT projections. Second, we devise a doubly guided diffusion model to refine reconstructed images, which leverages LDCT images and initial reconstructions to more accurately locate prior information and enhance reconstruction fidelity. By cascading these two diffusion models for dual-domain reconstruction, our NEED requires only normal-dose data for training and can be effectively extended to various unseen dose levels during testing via a time step matching strategy. Extensive qualitative, quantitative, and segmentation-based evaluations on two datasets demonstrate that our NEED consistently outperforms state-of-the-art methods in reconstruction and generalization performance. Source code is made available at https://github.com/qgao21/NEED.

Robust Deep Learning for Myocardial Scar Segmentation in Cardiac MRI with Noisy Labels

Aida Moafi, Danial Moafi, Evgeny M. Mirkes, Gerry P. McCann, Abbas S. Alatrany, Jayanth R. Arnold, Mostafa Mehdipour Ghazi

arxiv logopreprintJun 26 2025
The accurate segmentation of myocardial scars from cardiac MRI is essential for clinical assessment and treatment planning. In this study, we propose a robust deep-learning pipeline for fully automated myocardial scar detection and segmentation by fine-tuning state-of-the-art models. The method explicitly addresses challenges of label noise from semi-automatic annotations, data heterogeneity, and class imbalance through the use of Kullback-Leibler loss and extensive data augmentation. We evaluate the model's performance on both acute and chronic cases and demonstrate its ability to produce accurate and smooth segmentations despite noisy labels. In particular, our approach outperforms state-of-the-art models like nnU-Net and shows strong generalizability in an out-of-distribution test set, highlighting its robustness across various imaging conditions and clinical tasks. These results establish a reliable foundation for automated myocardial scar quantification and support the broader clinical adoption of deep learning in cardiac imaging.
Page 1 of 11102 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.