MDPG: Multi-domain Diffusion Prior Guidance for MRI Reconstruction

Lingtong Zhang, Mengdie Song, Xiaohan Hao, Huayu Mai, Bensheng Qiu

arXiv preprint · Jun 30, 2025
Magnetic Resonance Imaging (MRI) reconstruction is essential in medical diagnostics. Diffusion models (DMs), the latest class of generative models, have struggled to produce high-fidelity images when operating directly in the image domain due to their stochastic nature. Latent diffusion models (LDMs) yield compact yet detailed prior knowledge in the latent domain, which can effectively guide a model toward learning the original data distribution. Inspired by this, we propose Multi-domain Diffusion Prior Guidance (MDPG), provided by pre-trained LDMs, to enhance data consistency in MRI reconstruction tasks. Specifically, we first construct a Visual-Mamba-based backbone that enables efficient encoding and reconstruction of under-sampled images. Pre-trained LDMs are then integrated to provide conditional priors in both the latent and image domains. A novel Latent Guided Attention (LGA) module is proposed for efficient fusion across multi-level latent domains. Simultaneously, to exploit the prior in both the k-space and image domains, under-sampled images are fused with generated fully-sampled images by the Dual-domain Fusion Branch (DFB) for self-adaptive guidance. Lastly, to further enhance data consistency, we propose a k-space regularization strategy based on the non-auto-calibration signal (NACS) set. Extensive experiments on two public MRI datasets demonstrate the effectiveness of the proposed method. The code is available at https://github.com/Zolento/MDPG.
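
As a rough illustration of the data-consistency idea that MDPG builds on, the sketch below shows a generic hard data-consistency step in k-space: wherever k-space was actually sampled, the measured values overwrite the network's estimate. Function and variable names are hypothetical, and this is not the paper's NACS-based regularization or DFB fusion, just a minimal sketch under common MRI-reconstruction assumptions.

```python
# Illustrative sketch only: generic k-space data consistency, not MDPG's code.
import numpy as np

def kspace_data_consistency(recon_img, measured_kspace, sampling_mask):
    """Replace reconstructed k-space values with measured ones where sampled.

    recon_img:       (H, W) image estimate from the network.
    measured_kspace: (H, W) complex under-sampled k-space measurements.
    sampling_mask:   (H, W) boolean mask, True where k-space was acquired.
    """
    recon_kspace = np.fft.fft2(recon_img)                  # image -> k-space
    fused = np.where(sampling_mask, measured_kspace, recon_kspace)
    return np.fft.ifft2(fused)                             # back to image domain

# Minimal usage example with random data
img = np.random.rand(128, 128)
mask = np.random.rand(128, 128) > 0.7
kspace = np.fft.fft2(np.random.rand(128, 128)) * mask
consistent_img = kspace_data_consistency(img, kspace, mask)
```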

MedSAM-CA: A CNN-Augmented ViT with Attention-Enhanced Multi-Scale Fusion for Medical Image Segmentation

Peiting Tian, Xi Chen, Haixia Bi, Fan Li

arXiv preprint · Jun 30, 2025
Medical image segmentation plays a crucial role in clinical diagnosis and treatment planning, where accurate boundary delineation is essential for precise lesion localization, organ identification, and quantitative assessment. In recent years, deep learning-based methods have significantly advanced segmentation accuracy. However, two major challenges remain. First, the performance of these methods heavily relies on large-scale annotated datasets, which are often difficult to obtain in medical scenarios due to privacy concerns and high annotation costs. Second, clinically challenging scenarios, such as low contrast in certain imaging modalities and blurry lesion boundaries caused by malignancy, still pose obstacles to precise segmentation. To address these challenges, we propose MedSAM-CA, an architecture-level fine-tuning approach that mitigates reliance on extensive manual annotations by adapting the pretrained foundation model, Medical Segment Anything (MedSAM). MedSAM-CA introduces two key components: the Convolutional Attention-Enhanced Boundary Refinement Network (CBR-Net) and the Attention-Enhanced Feature Fusion Block (Atte-FFB). CBR-Net operates in parallel with the MedSAM encoder to recover boundary information potentially overlooked by long-range attention mechanisms, leveraging hierarchical convolutional processing. Atte-FFB, embedded in the MedSAM decoder, fuses multi-level fine-grained features from skip connections in CBR-Net with global representations upsampled within the decoder to enhance boundary delineation accuracy. Experiments on publicly available datasets covering dermoscopy, CT, and MRI imaging modalities validate the effectiveness of MedSAM-CA. On the dermoscopy dataset, MedSAM-CA achieves 94.43% Dice with only 2% of the full training data, reaching 97.25% of full-data training performance, demonstrating strong effectiveness in low-resource clinical settings.
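
A minimal sketch, assuming a generic attention-gated fusion of encoder skip features with upsampled decoder features, to make the feature-fusion idea behind Atte-FFB concrete. The module and parameter names (AttentionFusion, channels) are hypothetical and this is not the MedSAM-CA implementation.

```python
# Illustrative sketch only: attention-weighted fusion of fine-grained skip
# features with coarser decoder features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # 1x1 convs produce a per-pixel gate from the concatenated features
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, skip_feat, decoder_feat):
        # Upsample the coarse decoder features to the skip resolution
        decoder_feat = F.interpolate(decoder_feat, size=skip_feat.shape[-2:],
                                     mode="bilinear", align_corners=False)
        a = self.gate(torch.cat([skip_feat, decoder_feat], dim=1))
        # The gate decides how much boundary detail from the skip path to keep
        return a * skip_feat + (1 - a) * decoder_feat

# Usage: fuse a 64-channel skip map with a coarser 64-channel decoder map
fuse = AttentionFusion(64)
out = fuse(torch.randn(1, 64, 128, 128), torch.randn(1, 64, 64, 64))
```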

FD-DiT: Frequency Domain-Directed Diffusion Transformer for Low-Dose CT Reconstruction

Qiqing Liu, Guoquan Wei, Zekun Zhou, Yiyang Wen, Liu Shi, Qiegen Liu

arXiv preprint · Jun 30, 2025
Low-dose computed tomography (LDCT) reduces radiation exposure but suffers from image artifacts and loss of detail due to quantum and electronic noise, potentially impacting diagnostic accuracy. Transformers combined with diffusion models have been a promising approach for image generation. Nevertheless, existing methods exhibit limitations in preserving fine-grained image details. To address this issue, a frequency domain-directed diffusion transformer (FD-DiT) is proposed for LDCT reconstruction. FD-DiT centers on a diffusion strategy that progressively introduces noise until the distribution statistically aligns with that of LDCT data, followed by a denoising process. Furthermore, we employ a frequency decoupling technique to concentrate noise primarily in the high-frequency domain, thereby facilitating effective capture of essential anatomical structures and fine details. A hybrid denoising network is then used to optimize the overall reconstruction process. To enhance the capability to recognize high-frequency noise, we incorporate sliding sparse local attention to leverage the sparsity and locality of shallow-layer information, propagating it via skip connections to improve feature representation. Finally, we propose a learnable dynamic fusion strategy for optimal component integration. Experimental results demonstrate that, at identical dose levels, LDCT images reconstructed by FD-DiT exhibit superior noise and artifact suppression compared to state-of-the-art methods.
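
To make the frequency-decoupling idea concrete, here is a small sketch that splits an image into low- and high-frequency bands with an FFT mask and adds noise only to the high-frequency band. The cut-off radius, noise scale, and function name are illustrative choices, not FD-DiT's actual diffusion schedule.

```python
# Illustrative sketch only: FFT-based frequency split with noise confined to
# the high-frequency band.
import numpy as np

def frequency_decoupled_noise(image, radius=16, noise_scale=0.1, rng=None):
    rng = rng or np.random.default_rng()
    H, W = image.shape
    # Centered low-pass mask of the given radius in the Fourier plane
    yy, xx = np.mgrid[:H, :W]
    low_mask = ((yy - H / 2) ** 2 + (xx - W / 2) ** 2) <= radius ** 2

    spec = np.fft.fftshift(np.fft.fft2(image))
    low, high = spec * low_mask, spec * (~low_mask)

    # Perturb only the high-frequency band, then recombine
    noise = rng.normal(scale=noise_scale, size=image.shape)
    noisy_high = high + np.fft.fftshift(np.fft.fft2(noise)) * (~low_mask)
    return np.real(np.fft.ifft2(np.fft.ifftshift(low + noisy_high)))

noisy = frequency_decoupled_noise(np.random.rand(256, 256))
```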

Contrastive Learning with Diffusion Features for Weakly Supervised Medical Image Segmentation

Dewen Zeng, Xinrong Hu, Yu-Jen Chen, Yawen Wu, Xiaowei Xu, Yiyu Shi

arXiv preprint · Jun 30, 2025
Weakly supervised semantic segmentation (WSSS) methods using class labels often rely on class activation maps (CAMs) to localize objects. However, traditional CAM-based methods struggle with partial activations and imprecise object boundaries due to optimization discrepancies between classification and segmentation. Recently, the conditional diffusion model (CDM) has been used as an alternative for generating segmentation masks in WSSS, leveraging its strong image generation capabilities tailored to specific class distributions. By modifying or perturbing the condition during diffusion sampling, the related objects can be highlighted in the generated images. Yet, the saliency maps generated by CDMs are prone to noise from background alterations during reverse diffusion. To alleviate the problem, we introduce Contrastive Learning with Diffusion Features (CLDF), a novel method that uses contrastive learning to train a pixel decoder to map the diffusion features from a frozen CDM to a low-dimensional embedding space for segmentation. Specifically, we integrate gradient maps generated from the CDM's external classifier with CAMs to identify foreground and background pixels with fewer false positives/negatives for contrastive learning, enabling robust pixel embedding learning. Experimental results on four segmentation tasks from two public medical datasets demonstrate that our method significantly outperforms existing baselines.
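
The following is a minimal sketch of a supervised pixel-level contrastive loss of the general kind described above: embeddings of pixels sharing a foreground/background label are pulled together and the rest pushed apart. The sampling, temperature, and function name are simplifications and assumptions, not the CLDF training code.

```python
# Illustrative sketch only: supervised pixel-level contrastive (InfoNCE-style) loss.
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(embeddings, labels, temperature=0.1):
    """embeddings: (N, D) pixel embeddings from the decoder.
    labels:      (N,) 0 = background pixel, 1 = foreground pixel."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                        # (N, N) similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # positive pairs share a label
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    same = same & ~eye                                   # drop self-pairs

    # Log-softmax over all non-self pairs for each anchor
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, -1e9), dim=1, keepdim=True)
    pos_counts = same.sum(dim=1).clamp(min=1)
    loss = -(log_prob * same).sum(dim=1) / pos_counts
    return loss[same.any(dim=1)].mean()

loss = pixel_contrastive_loss(torch.randn(64, 32), torch.randint(0, 2, (64,)))
```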

Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification

Xing Shen, Justin Szeto, Mingyang Li, Hengguan Huang, Tal Arbel

arXiv preprint · Jun 29, 2025
Multimodal large language models (MLLMs) have enormous potential to perform few-shot in-context learning in the context of medical image analysis. However, safe deployment of these models into real-world clinical practice requires an in-depth analysis of the accuracy of their predictions and their associated calibration errors, particularly across different demographic subgroups. In this work, we present the first investigation into the calibration biases and demographic unfairness of MLLMs' predictions and confidence scores in few-shot in-context learning for medical image classification. We introduce CALIN, an inference-time calibration method designed to mitigate the associated biases. Specifically, CALIN estimates the amount of calibration needed, represented by calibration matrices, using a bi-level procedure: progressing from the population level to the subgroup level prior to inference. It then applies this estimation to calibrate the predicted confidence scores during inference. Experimental results on three medical imaging datasets (PAPILA for fundus image classification, HAM10000 for skin cancer classification, and MIMIC-CXR for chest X-ray classification) demonstrate CALIN's effectiveness at ensuring fair confidence calibration in its predictions, while improving overall prediction accuracy and exhibiting a minimal fairness-utility trade-off.
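
As a minimal sketch of the bi-level idea (calibrate at the population level, then at the subgroup level), the snippet below applies two calibration matrices to a predicted probability vector and renormalizes. The matrices here are identity placeholders; estimating them, which is the core of CALIN, is not reproduced.

```python
# Illustrative sketch only: applying population- and subgroup-level calibration
# matrices to one predicted probability vector.
import numpy as np

def calibrate(probs, population_matrix, subgroup_matrix):
    """probs: (C,) predicted class probabilities for one sample."""
    # Each level reweights the probability vector, then we renormalize
    p = population_matrix @ probs
    p = subgroup_matrix @ p
    return p / p.sum()

num_classes = 3
probs = np.array([0.7, 0.2, 0.1])
pop_M = np.eye(num_classes)          # placeholder population-level matrix
sub_M = np.eye(num_classes)          # placeholder subgroup-level matrix
print(calibrate(probs, pop_M, sub_M))
```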

Inpainting is All You Need: A Diffusion-based Augmentation Method for Semi-supervised Medical Image Segmentation

Xinrong Hu, Yiyu Shi

arXiv preprint · Jun 28, 2025
Collecting pixel-level labels for medical datasets can be a laborious and expensive process, and enhancing segmentation performance with a scarcity of labeled data is a crucial challenge. This work introduces AugPaint, a data augmentation framework that utilizes inpainting to generate image-label pairs from limited labeled data. AugPaint leverages latent diffusion models, known for their ability to generate high-quality in-domain images with low overhead, and adapts the sampling process for the inpainting task without the need for retraining. Specifically, given a pair of image and label mask, we crop the area labeled as foreground and condition on it during the reverse denoising process at every noise level. The masked background area is gradually filled in, and every generated image is paired with the original label mask. This approach ensures an accurate match between synthetic images and label masks, setting it apart from existing dataset generation methods. The generated images serve as valuable supervision for training downstream segmentation models, effectively addressing the challenge of limited annotations. We conducted extensive evaluations of our data augmentation method on four public medical image segmentation datasets, including CT, MRI, and skin imaging. Results across all datasets demonstrate that AugPaint outperforms state-of-the-art label-efficient methodologies, significantly improving segmentation performance.
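
A minimal sketch of the general mask-guided inpainting loop described above: at every reverse step the known foreground is reset from a re-noised copy of the original image, so only the masked background is synthesized. The toy denoiser, schedule, and function names are placeholders, not AugPaint's latent diffusion sampler.

```python
# Illustrative sketch only: mask-guided diffusion inpainting loop with a toy denoiser.
import torch

def inpaint_sample(x0, mask, denoise_step, timesteps, alphas_cumprod):
    """x0: (1, C, H, W) original image; mask: 1 = known foreground to keep."""
    x = torch.randn_like(x0)
    for t in reversed(range(timesteps)):
        a_bar = alphas_cumprod[t]
        # Known region: re-noise the original image to the current noise level
        known = torch.sqrt(a_bar) * x0 + torch.sqrt(1 - a_bar) * torch.randn_like(x0)
        x = mask * known + (1 - mask) * x       # keep foreground, synthesize background
        x = denoise_step(x, t)                  # one reverse diffusion step
    return x

# Toy usage with an identity "denoiser" just to show the call pattern
T = 10
alphas_cumprod = torch.linspace(0.99, 0.01, T)
img = torch.rand(1, 1, 64, 64)
fg_mask = torch.zeros_like(img)
fg_mask[..., 20:40, 20:40] = 1.0
result = inpaint_sample(img, fg_mask, lambda x, t: x, T, alphas_cumprod)
```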

Causality-Adjusted Data Augmentation for Domain Continual Medical Image Segmentation.

Zhu Z, Dong Q, Luo G, Wang W, Dong S, Wang K, Tian Y, Wang G, Li S

PubMed paper · Jun 27, 2025
In domain continual medical image segmentation, distillation-based methods mitigate catastrophic forgetting by continuously reviewing old knowledge. However, these approaches often exhibit biases towards both new and old knowledge simultaneously due to confounding factors, which can undermine segmentation performance. To address these biases, we propose the Causality-Adjusted Data Augmentation (CauAug) framework, introducing a novel causal intervention strategy called the Texture-Domain Adjustment Hybrid-Scheme (TDAHS) alongside two causality-targeted data augmentation approaches: the Cross Kernel Network (CKNet) and the Fourier Transformer Generator (FTGen). (1) TDAHS establishes a domain-continual causal model that accounts for two types of knowledge biases by identifying irrelevant local textures (L) and domain-specific features (D) as confounders. It introduces a hybrid causal intervention that combines traditional confounder elimination with a proposed replacement approach to better adapt to domain shifts, thereby promoting causal segmentation. (2) CKNet eliminates confounder L to reduce biases in new knowledge absorption. It decreases reliance on local textures in input images, forcing the model to focus on relevant anatomical structures and thus improving generalization. (3) FTGen causally intervenes on confounder D by selectively replacing it to alleviate biases that impact old knowledge retention. It restores domain-specific features in images, aiding in the comprehensive distillation of old knowledge. Our experiments show that CauAug significantly mitigates catastrophic forgetting and surpasses existing methods in various medical image segmentation tasks. The implementation code is publicly available at: https://github.com/PerceptionComputingLab/CauAug_DCMIS.
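
As a rough illustration of intervening on domain-specific appearance in the frequency domain, the sketch below swaps the low-frequency Fourier amplitude of one image with that of another while keeping the phase (structure). This is a generic amplitude-swap technique offered only to convey the idea; it is not the paper's FTGen.

```python
# Illustrative sketch only: Fourier amplitude swap to replace domain appearance
# while preserving structure (phase).
import numpy as np

def swap_low_freq_amplitude(content_img, style_img, beta=0.05):
    """Replace the central low-frequency amplitude of content_img with style_img's."""
    fc = np.fft.fftshift(np.fft.fft2(content_img))
    fs = np.fft.fftshift(np.fft.fft2(style_img))
    amp_c, phase_c = np.abs(fc), np.angle(fc)
    amp_s = np.abs(fs)

    H, W = content_img.shape
    h, w = int(H * beta), int(W * beta)
    cy, cx = H // 2, W // 2
    # Overwrite the low-frequency amplitude band (domain-specific appearance)
    amp_c[cy - h:cy + h, cx - w:cx + w] = amp_s[cy - h:cy + h, cx - w:cx + w]

    mixed = amp_c * np.exp(1j * phase_c)
    return np.real(np.fft.ifft2(np.fft.ifftshift(mixed)))

out = swap_low_freq_amplitude(np.random.rand(128, 128), np.random.rand(128, 128))
```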

Advanced glaucoma disease segmentation and classification with grey wolf optimized U-Net++ and capsule networks.

Govindharaj I, Deva Priya W, Soujanya KLS, Senthilkumar KP, Shantha Shalini K, Ravichandran S

PubMed paper · Jun 27, 2025
Early detection of glaucoma is vital for preserving vision, as the disease remains one of the leading causes of blindness worldwide. Current screening strategies that rely on expert interpretation involve complex, time-consuming procedures that delay both diagnosis and intervention. This research presents an automated glaucoma diagnostic system that combines an optimized segmentation model with a classification network. The segmentation stage implements an enhanced U-Net++ whose parameters are dynamically controlled by Grey Wolf Optimization (GWO) to segment the optic disc and cup regions in retinal fundus images; the GWO wolf-pack hunting strategy adjusts parameters dynamically, allowing the model to capture diverse textural patterns. A capsule network (CapsNet) is used for classification because it preserves spatial organization, enabling precise detection of glaucoma-related patterns. The developed system achieves 95.1% accuracy across the segmentation and classification tasks, exceeding conventional approaches, and improves both clinical diagnostic speed and precision, making it a promising early-stage glaucoma diagnostic tool suitable for scalable healthcare deployment. Objective: to develop an automated glaucoma diagnostic system by integrating an optimized U-Net++ segmentation model with a Capsule Network (CapsNet) classifier, enhanced through the Grey Wolf Optimization Algorithm (GWOA), for precise segmentation of the optic disc and cup regions and accurate glaucoma classification from retinal fundus images. This study proposes a two-phase computer-assisted diagnosis (CAD) framework. In the segmentation phase, an enhanced U-Net++ model optimized by GWOA delineates the optic disc and cup regions in fundus images; the optimization dynamically tunes hyperparameters based on grey wolf hunting behavior to improve segmentation precision. In the classification phase, a CapsNet architecture maintains spatial hierarchies and classifies images as glaucomatous or normal based on the segmented outputs. The proposed model was validated on the ORIGA retinal fundus image dataset and evaluated against conventional approaches. The GWOA-UNet++ and CapsNet framework achieved a segmentation and classification accuracy of 95.1%, outperforming existing benchmark models such as MTA-CS, ResFPN-Net, DAGCN, MRSNet, and AGCT. The model demonstrated robustness to image irregularities, including variations in optic disc size and fundus image quality, and showed superior performance across accuracy, sensitivity, specificity, precision, and F1-score. The developed system offers improved diagnostic accuracy, efficiency, and reliability, with significant potential for early-stage glaucoma detection and clinical decision support. Future work will involve validation on large-scale multi-ethnic datasets, integration with clinical workflows, and deployment as a mobile or cloud-based screening tool.
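
For readers unfamiliar with grey wolf optimization, here is a minimal, self-contained sketch of GWO over a continuous hyperparameter vector, with a dummy objective standing in for the validation loss of a segmentation model. All names and settings are illustrative; this is not the paper's GWOA-tuned U-Net++ pipeline.

```python
# Illustrative sketch only: minimal grey wolf optimizer for hyperparameter search.
import numpy as np

def grey_wolf_optimize(objective, dim, bounds, n_wolves=8, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    wolves = rng.uniform(lo, hi, size=(n_wolves, dim))
    for it in range(n_iters):
        fitness = np.array([objective(w) for w in wolves])
        alpha, beta, delta = wolves[np.argsort(fitness)[:3]]   # three best wolves lead
        a = 2 - 2 * it / n_iters                               # exploration decays to 0
        for i in range(n_wolves):
            new_pos = np.zeros(dim)
            for leader in (alpha, beta, delta):
                A = a * (2 * rng.random(dim) - 1)
                C = 2 * rng.random(dim)
                D = np.abs(C * leader - wolves[i])
                new_pos += (leader - A * D) / 3                # average pull toward leaders
            wolves[i] = np.clip(new_pos, lo, hi)
    best = min(wolves, key=objective)
    return best, objective(best)

# Dummy objective standing in for "validation loss given these hyperparameters"
best, score = grey_wolf_optimize(lambda w: np.sum((w - 0.3) ** 2), dim=2, bounds=(0.0, 1.0))
```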

Photon-counting micro-CT scanner for deep learning-enabled small animal perfusion imaging.

Allphin AJ, Nadkarni R, Clark DP, Badea CT

PubMed paper · Jun 27, 2025
In this work, we introduce a benchtop, turn-table photon-counting (PC) micro-CT scanner and highlight its application for dynamic small animal perfusion imaging. Approach: Built on recently published hardware, the system now features a CdTe-based photon-counting detector (PCD). We validated its static spectral PC micro-CT imaging using conventional phantoms and assessed dynamic performance with a custom flow-configurable dual-compartment perfusion phantom. The phantom was scanned under varied flow conditions during injections of a low-molecular-weight iodinated contrast agent. In vivo mouse studies with identical injection settings demonstrated potential applications. A pretrained denoising CNN processed large multi-energy, temporal datasets (20 timepoints × 4 energies × 3 spatial dimensions), reconstructed via weighted filtered back projection. A separate CNN, trained on simulated data, performed gamma variate-based 2D perfusion mapping, evaluated qualitatively in phantom and in vivo tests. Main Results: Full five-dimensional reconstructions were denoised using a CNN in ~3% of the time of iterative reconstruction, reducing noise in water at the highest energy threshold from 1206 HU to 86 HU. Decomposed iodine maps, which improved the contrast-to-noise ratio from 16.4 (in the lowest-energy CT images) to 29.4 (in the iodine maps), were used for perfusion analysis. The perfusion CNN outperformed pixelwise gamma variate fitting by ~33%, with a test set error of 0.04 vs. 0.06 in blood flow index (BFI) maps, and quantified linear BFI changes in the phantom with a coefficient of determination of 0.98. Significance: This work underscores the PC micro-CT scanner's utility for high-throughput small animal perfusion imaging, leveraging spectral PC micro-CT and iodine decomposition. It provides a versatile platform for preclinical vascular research and advanced, time-resolved studies of disease models and therapeutic interventions.
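
To make the pixelwise baseline concrete, the sketch below fits a gamma-variate bolus model to one synthetic time-attenuation curve with scipy. The parameter names, synthetic curve, and initial guess are illustrative; this shows the classical fitting approach the perfusion CNN is compared against, not the authors' code.

```python
# Illustrative sketch only: gamma-variate fit to a single time-attenuation curve.
import numpy as np
from scipy.optimize import curve_fit

def gamma_variate(t, A, t0, alpha, beta):
    """Gamma-variate bolus model; zero before arrival time t0."""
    dt = t - t0
    out = np.zeros_like(dt)
    pos = dt > 0
    out[pos] = A * dt[pos] ** alpha * np.exp(-dt[pos] / beta)
    return out

t = np.linspace(0, 20, 20)                 # 20 timepoints, as in the study design
true = gamma_variate(t, A=5.0, t0=2.0, alpha=2.0, beta=1.5)
noisy = true + np.random.normal(scale=0.2, size=t.shape)

params, _ = curve_fit(gamma_variate, t, noisy, p0=[4.0, 1.0, 2.0, 1.0])
print("fitted A, t0, alpha, beta:", params)
```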

AI Model Passport: Data and System Traceability Framework for Transparent AI in Health

Varvara Kalokyri, Nikolaos S. Tachos, Charalampos N. Kalantzopoulos, Stelios Sfakianakis, Haridimos Kondylakis, Dimitrios I. Zaridis, Sara Colantonio, Daniele Regge, Nikolaos Papanikolaou, The ProCAncer-I consortium, Konstantinos Marias, Dimitrios I. Fotiadis, Manolis Tsiknakis

arXiv preprint · Jun 27, 2025
The increasing integration of Artificial Intelligence (AI) into health and biomedical systems necessitates robust frameworks for transparency, accountability, and ethical compliance. Existing frameworks often rely on human-readable, manual documentation which limits scalability, comparability, and machine interpretability across projects and platforms. They also fail to provide a unique, verifiable identity for AI models to ensure their provenance and authenticity across systems and use cases, limiting reproducibility and stakeholder trust. This paper introduces the concept of the AI Model Passport, a structured and standardized documentation framework that acts as a digital identity and verification tool for AI models. It captures essential metadata to uniquely identify, verify, trace and monitor AI models across their lifecycle - from data acquisition and preprocessing to model design, development and deployment. In addition, an implementation of this framework is presented through AIPassport, an MLOps tool developed within the ProCAncer-I EU project for medical imaging applications. AIPassport automates metadata collection, ensures proper versioning, decouples results from source scripts, and integrates with various development environments. Its effectiveness is showcased through a lesion segmentation use case using data from the ProCAncer-I dataset, illustrating how the AI Model Passport enhances transparency, reproducibility, and regulatory readiness while reducing manual effort. This approach aims to set a new standard for fostering trust and accountability in AI-driven healthcare solutions, aspiring to serve as the basis for developing transparent and regulation-compliant AI systems across domains.
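
As a minimal, hypothetical sketch of what a machine-readable model passport record could look like (not the AIPassport schema), the snippet below captures a unique artifact fingerprint plus basic lifecycle provenance and serializes it to JSON. All field names are illustrative assumptions.

```python
# Illustrative sketch only: a hypothetical model-passport record, not the
# ProCAncer-I AIPassport schema.
import hashlib
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ModelPassport:
    model_name: str
    version: str
    training_dataset: str
    preprocessing_steps: list
    weights_sha256: str                   # verifiable identity of the model artifact
    metrics: dict = field(default_factory=dict)

    def to_json(self):
        return json.dumps(asdict(self), indent=2)

def hash_weights(path):
    """Fingerprint a serialized model file so its provenance can be verified."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

passport = ModelPassport(
    model_name="lesion-segmentation-unet",
    version="1.2.0",
    training_dataset="ProCAncer-I (subset)",
    preprocessing_steps=["resample 1mm iso", "z-score normalization"],
    weights_sha256="<computed with hash_weights('model.pt')>",
    metrics={"dice": 0.87},
)
print(passport.to_json())
```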