Uncertainty quantification of U-Net based segmentation tool using conformal prediction.
Authors
Affiliations (2)
- The University of British Columbia - Okanagan Campus, Kelowna, British Columbia, Canada.
- BC Cancer - Kelowna, Kelowna, British Columbia, Canada.
Abstract
The radiation therapy treatment process is labour intensive, and artificial intelligence (AI) based auto contouring tools are increasingly being adopted to improve efficiency. However, current acceptance testing of AI auto contouring algorithms relies primarily on area- and distance-based metrics, with limited assessment of model uncertainty. This study demonstrates conformal prediction as a complementary error analysis technique for AI auto contouring algorithms, one that provides spatially localized uncertainty information that traditional metrics do not capture. A U-Net architecture with a ResNet-34 encoder was trained on BC Cancer breast data to segment the left lung, right lung, and heart. Initial testing was performed on a subset of 376 computed tomography (CT) scans using both an area-based metric (intersection over union, IoU) and a distance-based metric (95th percentile Hausdorff distance, HD95). Conformal prediction using adaptive prediction sets was then performed on 138 CT scans. The derivative of the IoU between the original predictions and the conformal predictions was examined with respect to the selected confidence level. The U-Net achieved a mean IoU of 0.924 and a mean HD95 of 11.35. When conformal prediction was applied at a 90% confidence level, the percent differences between the conformal prediction IoUs and the U-Net prediction IoUs were 1.01%, 0.89%, and 1.46% for the left lung, right lung, and heart, respectively. The IoU derivatives differed significantly between true positive and false positive structure predictions (P < 0.001), indicating that conformal prediction can distinguish reliable predictions from unreliable ones. Conformal prediction therefore provides an additional tool for acceptance testing of AI auto contouring algorithms. Beyond traditional area- and distance-based metrics, it spatially localizes uncertain predictions and offers a mechanism for identifying false positives.
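For readers unfamiliar with adaptive prediction sets, the following is a minimal NumPy sketch of the general technique applied to per-pixel class probabilities. It is not the authors' implementation. The function names (`aps_scores`, `calibrate`, `prediction_sets`, `iou`), the non-randomized variant of the score, and the rule that the top-ranked class is always kept are assumptions made for illustration. The abstract does not specify how pixels were pooled for calibration or how a conformal mask was derived from the sets.

```python
import numpy as np


def aps_scores(probs, labels):
    """Adaptive prediction set score for each sample (e.g. each pixel).

    probs:  (n, K) array of softmax probabilities.
    labels: (n,) array of true class indices.
    The score is the cumulative probability of all classes ranked at or
    above the true class (non-randomized variant, an assumption here).
    """
    order = np.argsort(-probs, axis=1)                    # classes by descending probability
    sorted_probs = np.take_along_axis(probs, order, axis=1)
    cumsum = np.cumsum(sorted_probs, axis=1)
    rank_of_true = np.argmax(order == labels[:, None], axis=1)
    return cumsum[np.arange(len(labels)), rank_of_true]


def calibrate(probs, labels, alpha=0.1):
    """Return the score threshold for a target coverage of 1 - alpha."""
    scores = np.sort(aps_scores(probs, labels))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))               # finite-sample corrected rank
    if k > n:
        return np.inf                                     # too few calibration samples: keep every class
    return scores[k - 1]


def prediction_sets(probs, qhat):
    """Boolean (n, K) array; True where a class is in the prediction set."""
    order = np.argsort(-probs, axis=1)
    sorted_probs = np.take_along_axis(probs, order, axis=1)
    cumsum = np.cumsum(sorted_probs, axis=1)
    keep_sorted = cumsum <= qhat                          # classes whose score is within the threshold
    keep_sorted[:, 0] = True                              # assumption: never return an empty set
    sets = np.zeros(probs.shape, dtype=bool)
    np.put_along_axis(sets, order, keep_sorted, axis=1)   # map back to original class order
    return sets


def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 1.0                                        # both masks empty
    return np.logical_and(mask_a, mask_b).sum() / union


if __name__ == "__main__":
    # Synthetic stand-in for flattened per-pixel softmax outputs (4 classes).
    rng = np.random.default_rng(0)
    probs = rng.dirichlet(np.ones(4), size=2000)
    labels = np.array([rng.choice(4, p=p) for p in probs])
    qhat = calibrate(probs[:1000], labels[:1000], alpha=0.1)
    sets = prediction_sets(probs[1000:], qhat)
    coverage = sets[np.arange(1000), labels[1000:]].mean()
    print(f"qhat={qhat:.3f}, empirical coverage={coverage:.3f}")
```

Under these assumptions, one possible comparison is to treat a structure's conformal mask as the pixels whose prediction set contains that structure's class, then call `iou` on that mask and on the argmax mask against the reference contour. Pixels whose sets contain more than one class mark spatially localized uncertainty.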