Disentangled generative uncertainty-aware multi-modal diffusion segmentation of medical images.
Authors
Affiliations (3)
- Khalifa University, Abu Dhabi, United Arab Emirates; Faculty of IT, Monash University, Melbourne, Australia. Electronic address: [email protected].
- Jio University, Navi Mumbai, India.
- ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland.
Abstract
The full clinical integration of deep learning models for medical image segmentation is impeded by their lack of transparency and of explicit uncertainty quantification (UQ). In high-stakes medical contexts, clinicians demand not only precise segmentations but also explicit confidence measures, particularly in multi-modal imaging scenarios, where effectively integrating diverse data streams while robustly quantifying their uncertainties remains a central challenge for trustworthy AI. This paper introduces a novel framework, Disentangled Generative Uncertainty-Aware Multi-Modal Diffusion Segmentation (D-GUMM-DS), engineered for robust multi-modal medical image segmentation. Our approach leverages Generative Artificial Intelligence (GenAI), specifically Denoising Diffusion Probabilistic Models (DDPMs), to learn the underlying probability distribution of the data. Unlike traditional methods that apply UQ post hoc to deterministic outputs, D-GUMM-DS integrates GenAI's probabilistic nature directly into a disentangled, adaptive, uncertainty-aware fusion mechanism that combines multi-modal features, dynamically adjusting their influence according to relevance and resolving ambiguities arising from data heterogeneity. By analyzing the divergence among multiple plausible segmentation samples drawn from the model's learned distribution, we derive comprehensive pixel-wise and global uncertainty estimates. We demonstrate that this principled generative paradigm yields accurate and robust segmentations while providing well-calibrated, clinically interpretable uncertainty maps, thereby fostering trust and enhancing decision support in AI-driven medical image analysis.
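The abstract's uncertainty estimates are derived from the divergence among multiple segmentation samples drawn from the learned distribution. A minimal sketch of that idea, assuming a stack of sampled foreground-probability masks and using predictive entropy as the divergence measure (the function name and the choice of entropy are illustrative, not the paper's specification):

```python
import numpy as np

def uncertainty_from_samples(masks):
    """Derive pixel-wise and global uncertainty from sampled segmentations.

    masks : array of shape (N, H, W) holding N sampled binary or
            probabilistic foreground masks for the same image.

    Pixel-wise uncertainty is the predictive (binary) entropy of the
    mean foreground probability across samples; global uncertainty is
    its average over the image.
    """
    masks = np.asarray(masks, dtype=np.float64)
    p = masks.mean(axis=0)          # per-pixel mean foreground probability
    eps = 1e-12                     # guard against log(0)
    pixelwise = -(p * np.log(p + eps) + (1.0 - p) * np.log(1.0 - p + eps))
    return pixelwise, float(pixelwise.mean())
```

Pixels where all samples agree get near-zero entropy, while pixels where the samples split evenly approach the maximum ln 2, which is what makes such maps readable as per-pixel confidence.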