
Interpreting convolutional neural network explainability for head-and-neck cancer radiotherapy organ-at-risk segmentation.

Authors

Strijbis VIJ, Gurney-Champion OJ, Grama DI, Slotman BJ, Verbakel WFAR

Affiliations (5)

  • Amsterdam UMC Location Vrije Universiteit Amsterdam, Department of Radiation Oncology, De Boelelaan 1117, 1081 HV, Amsterdam, the Netherlands; Cancer Center Amsterdam, Cancer Treatment and Quality of Life, De Boelelaan 1118, 1081 HV, Amsterdam, the Netherlands. Electronic address: [email protected].
  • Amsterdam UMC Location University of Amsterdam, Department of Radiology and Nuclear Medicine, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands; Cancer Center Amsterdam, Imaging and Biomarkers, De Boelelaan 1118, 1081 HV, Amsterdam, the Netherlands. Electronic address: [email protected].
  • Amsterdam UMC Location Vrije Universiteit Amsterdam, Department of Radiation Oncology, De Boelelaan 1117, 1081 HV, Amsterdam, the Netherlands; Cancer Center Amsterdam, Cancer Treatment and Quality of Life, De Boelelaan 1118, 1081 HV, Amsterdam, the Netherlands. Electronic address: [email protected].
  • Amsterdam UMC Location Vrije Universiteit Amsterdam, Department of Radiation Oncology, De Boelelaan 1117, 1081 HV, Amsterdam, the Netherlands; Cancer Center Amsterdam, Cancer Treatment and Quality of Life, De Boelelaan 1118, 1081 HV, Amsterdam, the Netherlands. Electronic address: [email protected].
  • Amsterdam UMC Location Vrije Universiteit Amsterdam, Department of Radiation Oncology, De Boelelaan 1117, 1081 HV, Amsterdam, the Netherlands; Cancer Center Amsterdam, Cancer Treatment and Quality of Life, De Boelelaan 1118, 1081 HV, Amsterdam, the Netherlands; Varian Medical Systems, a Siemens Healthineers Company, 3100 Hansen Way, Palo Alto, CA 94304, United States of America. Electronic address: [email protected].

Abstract

Convolutional neural networks (CNNs) have emerged as a way to reduce clinical resource use and standardize the auto-contouring of organs-at-risk (OARs). Although CNNs perform adequately for most patients, understanding when a CNN might fail is critical for effective and safe clinical deployment. However, the limitations of CNNs are poorly understood because of their black-box nature. Explainable artificial intelligence (XAI) can expose CNNs' inner mechanisms for classification. Here, we investigate the inner mechanisms of CNNs for segmentation and explore a novel computational approach to flag potentially insufficient parotid gland (PG) contours a priori. First, 3D UNets were trained in three PG segmentation situations using (1) synthetic cases; (2) 1925 clinical computed tomography (CT) scans with typical contours; and (3) more consistent contours curated through a previously validated auto-curation step. Then, we generated attribution maps for seven XAI methods and qualitatively assessed them for congruency between simulated and clinical contours, and for how well they agreed with expert reasoning. To objectify observations, we explored persistent homology intensity filtrations to capture essential topological characteristics of XAI attributions. Principal component (PC) eigenvalues of Euler characteristic profiles were correlated with spatial agreement (Dice-Sørensen similarity coefficient; DSC). Evaluation was done using sensitivity, specificity and the area under the receiver operating characteristic curve (AUROC) on an external AAPM dataset, where, as a proof of principle, we regarded the lowest 15% DSC as insufficient. PatternNet attributions (PNet-A) focused on soft-tissue structures, whereas guided backpropagation (GBP) highlighted both soft-tissue and high-density structures (e.g. mandible bone), which was congruent with synthetic situations. Both methods typically had higher/denser activations in the better auto-contoured medial and anterior lobes.
Curated models produced "cleaner" gradient class-activation mapping (GCAM) attributions. Quantitative analysis showed that PCλ₁ of the Euler characteristic (EC) profile of guided GCAM (GGCAM) had good predictive value for DSC (sensitivity > 0.85, specificity > 0.90) for AAPM cases, with AUROC = 0.66, 0.74, 0.94 and 0.83 for GBP, GCAM, GGCAM and PNet-A, respectively. For λ₁ < -1.8e3 of GGCAM's EC profile, 87% of cases were insufficient. GBP and PNet-A qualitatively agreed most with expert reasoning on directly (structure borders) and indirectly (proxies used for identifying structure borders) important features for PG segmentation. Additionally, this work investigated, as a proof of principle, how topological data analysis could be used for quantitative XAI signal analysis to mark potentially inadequate CNN segmentations a priori, using only features from inside the predicted PG. This work used the PG as a well-understood segmentation paradigm and may extend to target volumes and other organs-at-risk.
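To make the idea of an Euler characteristic profile concrete, the following is a minimal sketch and not the authors' implementation. It assumes a superlevel intensity filtration (voxels with attribution at or above a threshold), treats voxels as closed cubes, restricts the analysis to the predicted PG mask, and projects the profiles onto their first principal component. The function names, the filtration direction and the choice of thresholds are all hypothetical; the abstract does not specify them.

```python
from itertools import combinations

import numpy as np


def _or_with_next(a, axis):
    """OR each element with its successor along `axis` (output is one shorter)."""
    lo = [slice(None)] * a.ndim
    hi = [slice(None)] * a.ndim
    lo[axis] = slice(None, -1)
    hi[axis] = slice(1, None)
    return a[tuple(lo)] | a[tuple(hi)]


def euler_characteristic(mask):
    """Euler characteristic of a binary array, voxels taken as closed cubes.

    Counts the cells of the cubical complex (vertices, edges, faces, cubes
    in 3D) and returns their alternating sum. A cell shared by several
    voxels is present if at least one of those voxels is set.
    """
    p = np.pad(np.asarray(mask, dtype=bool), 1)
    nd = p.ndim
    chi = 0
    for k in range(nd + 1):
        for axes in combinations(range(nd), k):
            present = p
            for ax in axes:
                present = _or_with_next(present, ax)
            # Cells shared along k axes have dimension nd - k.
            chi += (-1) ** (nd - k) * int(present.sum())
    return chi


def ec_profile(attribution, region, thresholds):
    """EC of {attribution >= t} inside `region`, for each threshold t.

    Assumption: a superlevel filtration; the source only says
    "intensity filtrations".
    """
    values = np.where(region, attribution, -np.inf)
    return np.array([euler_characteristic(values >= t) for t in thresholds])


def first_pc_scores(profiles):
    """Project each EC profile (one per row) onto the first principal component."""
    centered = profiles - profiles.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]
```

In use, one would stack the EC profiles of many cases into a matrix with one row per case, compute `first_pc_scores`, and compare the scores against each case's DSC. The sign of a principal component is arbitrary, so any cut-off on the score only makes sense relative to a fixed fitted component.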

Topics

Journal Article
