Towards trustworthy AI in radiotherapy: a comprehensive review of uncertainty-aware techniques.
Authors
Affiliations (2)
- Univ. Rennes, CLCC Eugène Marquis, INSERM, UMR 1099, LTSI, F-35000 Rennes, France, Rennes, Brittany, 35042, FRANCE.
- Univ. Rennes, CLCC Eugène Marquis, INSERM - UMR 1099, LTSI, F-35000 Rennes, France, Rennes, Brittany, 35042, FRANCE.
Abstract
Uncertainty quantification (UQ) has emerged as a crucial component in deep learning-based medical image analysis, particularly in radiotherapy (RT). Addressing uncertainty is essential for improving the reliability, interpretability, and clinical applicability of AI-driven models in key radiotherapy tasks, including segmentation, image registration, synthetic image generation, dose prediction and dose accumulation. Despite significant advancements, challenges remain in integrating UQ techniques into RT clinical workflows.
Purpose: This review synthesizes recent developments in UQ methods applied to RT. It introduces a structured classification of UQ techniques, evaluates their impact on clinical workflows, and highlights emerging trends from studies published from 2020 to 2025. 
Methods: A systematic search was conducted on PubMed and Google Scholar for articles published from January 2020 to June 2025. Keywords included "uncertainty", "radiotherapy", and task-specific terms such as "segmentation", "registration", "synthetic image generation", "image-to-image translation", "dose prediction", or "dose accumulation". Studies were classified based on the type of uncertainty estimation technique, imaging modality, and associated RT task. 
Results: Segmentation emerged as the most common RT task addressed by UQ methods, followed by image registration, synthetic image generation and dose prediction. Probabilistic techniques such as Bayesian neural networks, Monte Carlo dropout, and ensemble learning dominate the field, particularly for modeling epistemic uncertainty. Studies demonstrated that uncertainty maps enhance model interpretability, guide clinical review of auto-segmentations, and support quality assurance processes.
Conclusion: UQ has the potential to enhance the robustness of AI-driven RT workflows. While substantial progress has been made, further efforts are needed to standardize evaluation protocols, improve computational efficiency, and develop user-friendly interfaces for clinical integration. Future research should aim to close the gap between technical advances and their clinical deployment to ensure uncertainty-aware models contribute effectively to personalized RT.
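Illustrative note: the uncertainty maps mentioned in the Results are commonly derived from several stochastic predictions of the same image, for example repeated Monte Carlo dropout passes or the members of an ensemble. The following is a minimal sketch of one common way to do this for a binary segmentation, not the method of any particular study in the review. The function names `binary_entropy` and `uncertainty_maps`, the array layout `(T, H, W)`, and the toy probabilities are assumptions made for illustration only. The sketch computes the mean prediction, the predictive entropy (total uncertainty), and the mutual information between prediction and model sample, which is often used as a proxy for epistemic uncertainty.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli variable with foreground probability p."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def uncertainty_maps(prob_samples):
    """Per-pixel uncertainty from T stochastic predictions.

    prob_samples: array of shape (T, H, W) holding foreground probabilities,
    one map per Monte Carlo dropout pass or ensemble member (assumed layout).
    Returns (mean_prob, total_entropy, mutual_information), each of shape (H, W).
    """
    prob_samples = np.asarray(prob_samples, dtype=float)
    mean_prob = prob_samples.mean(axis=0)

    # Total uncertainty: entropy of the averaged prediction.
    total_entropy = binary_entropy(mean_prob)

    # Average entropy of the individual predictions (aleatoric proxy).
    expected_entropy = binary_entropy(prob_samples).mean(axis=0)

    # Disagreement between samples (epistemic proxy); clip tiny negatives
    # that can arise from floating-point rounding.
    mutual_information = np.clip(total_entropy - expected_entropy, 0.0, None)
    return mean_prob, total_entropy, mutual_information


# Toy 2x2 example with two hypothetical samples (not real model output).
samples = np.array([
    [[0.9, 0.1], [0.5, 0.9]],
    [[0.9, 0.1], [0.5, 0.1]],
])
mean_prob, total_entropy, mutual_information = uncertainty_maps(samples)
print(np.round(mutual_information, 3))
```

In this toy input, the pixel where both samples agree on 0.5 has high total entropy but zero mutual information, whereas the pixel where the samples disagree (0.9 versus 0.1) has high mutual information. A map of this second quantity is the kind of output that could be overlaid on an auto-segmentation to direct clinical review toward regions where the model is unsure.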