Performance of a screening-trained DL model for pulmonary nodule malignancy estimation of incidental clinical nodules.

Authors

Dinnessen R, Peeters D, Antonissen N, Mohamed Hoesein FAA, Gietema HA, Scholten ET, Schaefer-Prokop C, Jacobs C

Affiliations (6)

  • Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Department of Radiology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
  • Department of Radiology and Nuclear Medicine, Maastricht University Medical Center, Maastricht University, Maastricht, The Netherlands.
  • Maastricht University, GROW, School of Oncology and Reproduction, Maastricht, The Netherlands.
  • Department of Radiology, Meander Medical Center, Amersfoort, The Netherlands.

Abstract

To test the performance of a deep learning (DL) model, developed and validated for screen-detected pulmonary nodules, on incidental nodules detected in a clinical setting.

A retrospective dataset of incidental pulmonary nodules measuring 5-15 mm was collected, and a subset of size-matched solid nodules was selected. The performance of the DL model was compared to that of the Brock model. AUCs with 95% CIs were compared using the DeLong method. Sensitivity and specificity were determined at various thresholds, using a 10% threshold for the Brock model as reference. The model's calibration was assessed visually.

The dataset included 49 malignant and 359 benign solid or part-solid nodules, and the size-matched dataset included 47 malignant and 47 benign solid nodules. In the complete dataset, AUCs [95% CI] were 0.89 [0.85, 0.93] for the DL model and 0.86 [0.81, 0.92] for the Brock model (p = 0.27). In the size-matched subset, AUCs of the DL and Brock models were 0.78 [0.69, 0.88] and 0.58 [0.46, 0.69], respectively (p < 0.01). At a 10% threshold, the Brock model had a sensitivity of 0.49 [0.35, 0.63] and a specificity of 0.92 [0.89, 0.94]. At a threshold of 17%, the DL model matched the specificity of the Brock model at the 10% threshold, but had a higher sensitivity (0.57 [0.43, 0.71]). Calibration analysis revealed that the DL model overestimated the malignancy probability.

The DL model demonstrated good discriminatory performance in a dataset of incidental nodules and outperformed the Brock model, but may need recalibration for clinical practice.

Question: What is the performance of a DL model for pulmonary nodule malignancy risk estimation, developed on screening data, in a dataset of incidentally detected nodules?

Findings: The DL model performed well on a dataset of nodules from routine clinical care and outperformed the Brock model in a size-matched subset.

Clinical relevance: This study provides further evidence of the potential of DL models for risk stratification of incidental nodules, which may improve nodule management in routine clinical practice.
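
The threshold comparison described in the abstract (fix the Brock model at a 10% cut-off, then find the DL threshold with matching specificity and compare sensitivities) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the convention that a risk at or above the threshold counts as a positive call is an assumption, and the six-nodule dataset is synthetic toy data, not study data.

```python
from typing import Optional, Sequence, Tuple


def sens_spec_at_threshold(
    risks: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity).

    Assumption: a nodule is called positive when risk >= threshold.
    labels: 1 = malignant, 0 = benign.
    """
    pairs = list(zip(risks, labels))
    tp = sum(1 for r, y in pairs if y == 1 and r >= threshold)
    fn = sum(1 for r, y in pairs if y == 1 and r < threshold)
    tn = sum(1 for r, y in pairs if y == 0 and r < threshold)
    fp = sum(1 for r, y in pairs if y == 0 and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def lowest_threshold_matching_specificity(
    risks: Sequence[float], labels: Sequence[int], target_specificity: float
) -> Optional[float]:
    """Smallest observed risk value whose specificity reaches the target."""
    for candidate in sorted(set(risks)):
        _, spec = sens_spec_at_threshold(risks, labels, candidate)
        if spec >= target_specificity:
            return candidate
    return None


# Synthetic toy data (hypothetical, NOT from the study).
labels = [0, 0, 0, 1, 1, 0]
brock_risks = [0.02, 0.05, 0.12, 0.08, 0.30, 0.04]
dl_risks = [0.05, 0.10, 0.15, 0.25, 0.60, 0.20]

# Reference operating point: Brock model at a 10% threshold.
brock_sens, brock_spec = sens_spec_at_threshold(brock_risks, labels, 0.10)

# DL threshold that reaches the same specificity, and its sensitivity.
dl_threshold = lowest_threshold_matching_specificity(dl_risks, labels, brock_spec)
dl_sens, dl_spec = sens_spec_at_threshold(dl_risks, labels, dl_threshold)

print(f"Brock @ 0.10: sensitivity={brock_sens:.2f}, specificity={brock_spec:.2f}")
print(f"DL @ {dl_threshold:.2f}: sensitivity={dl_sens:.2f}, specificity={dl_spec:.2f}")
```

Matching specificity first and then comparing sensitivity puts both models at the same false-positive rate, so any difference in sensitivity reflects discrimination rather than the choice of cut-off. Confidence intervals such as those reported in the abstract would need an additional step (for example bootstrapping), which this sketch omits.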

Topics

Journal Article
