Explainable artificial intelligence for pneumonia classification: Clinical insights into deformable prototypical part network in pediatric chest X-ray images.
Affiliations (4)
- Department of Medical Physics, School of Medicine, Iran University of Medical Sciences, Tehran, Iran. Electronic address: [email protected].
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran. Electronic address: [email protected].
- Research Center for Nuclear Medicine, Tehran University of Medical Sciences, Tehran, Iran. Electronic address: [email protected].
- Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran. Electronic address: [email protected].
Abstract
Pneumonia detection in chest X-rays (CXR) increasingly relies on AI-driven diagnostic systems. However, the "black-box" nature of these systems limits transparency, underscoring the need for interpretability to improve patient outcomes. This study presents the first application of the Deformable Prototypical Part Network (D-ProtoPNet), an ante-hoc interpretable deep learning model, to pneumonia classification in pediatric CXR images. Clinical insight was integrated through expert radiologist evaluation of the model's learned prototypes and activated image patches, ensuring that explanations aligned with medically meaningful features. The model was developed and tested on a retrospective dataset of 5,856 CXR images of pediatric patients aged 1-5 years. The images were originally acquired at a tertiary academic medical center as part of routine clinical care and are publicly hosted on Kaggle. The dataset comprised anterior-posterior images labeled as normal, viral pneumonia, or bacterial pneumonia. It was divided into 80% training and 20% validation splits and used in supervised five-fold cross-validation. Performance was compared with that of the original ProtoPNet, with ResNet50 as the base model for both. An experienced radiologist assessed the clinical relevance of the learned prototypes, patch activations, and model explanations. D-ProtoPNet achieved an accuracy of 86%, precision of 86%, recall of 85%, and AUC of 93%, a 3% improvement over the original ProtoPNet. While further optimisation is required before clinical use, the radiologist praised D-ProtoPNet's intuitive explanations, highlighting its interpretability and potential to aid clinical decision-making. Prototypical part learning offers a balance between classification performance and explanation quality, but further gains are needed to match the accuracy of black-box models. This study underscores the importance of integrating domain expertise during model evaluation so that the interpretability of explainable AI (XAI) models is grounded in clinically valid insights.
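For readers who want a concrete picture of the architecture the abstract describes, below is a minimal PyTorch sketch of a ProtoPNet-style prototypical-part layer on a ResNet50 backbone. It is not the authors' implementation: all class, parameter, and variable names are hypothetical, and the learned spatial offsets that make D-ProtoPNet's prototypes deformable are omitted for brevity.

```python
# Hedged sketch of a ProtoPNet-style layer on ResNet50 (not the authors' code).
# Each learned prototype vector is compared with every spatial patch of the
# backbone feature map; the closest-patch similarity per prototype feeds a
# linear classifier. D-ProtoPNet additionally learns spatial offsets so that
# prototype parts can sample deformed patch locations (omitted here).
import torch
import torch.nn as nn
from torchvision.models import resnet50

class ProtoPartNet(nn.Module):
    def __init__(self, num_classes=3, num_prototypes=30, proto_dim=128):
        super().__init__()
        backbone = resnet50(weights=None)  # pretrained weights would be used in practice
        self.features = nn.Sequential(*list(backbone.children())[:-2])      # (B, 2048, H, W)
        self.add_on = nn.Conv2d(2048, proto_dim, kernel_size=1)             # project to prototype space
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, proto_dim))
        self.classifier = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, x):
        z = torch.sigmoid(self.add_on(self.features(x)))       # (B, D, H, W)
        B, D, H, W = z.shape
        patches = z.flatten(2).transpose(1, 2)                 # (B, H*W, D)
        # L2 distance from every patch to every prototype: (B, H*W, P)
        dists = torch.cdist(patches, self.prototypes.unsqueeze(0))
        d2 = dists.min(dim=1).values ** 2                      # closest patch per prototype
        sims = torch.log((d2 + 1) / (d2 + 1e-4))               # ProtoPNet similarity score
        # logits, plus per-patch distance maps for prototype activation overlays
        return self.classifier(sims), dists.reshape(B, H, W, -1)
```

For a 224×224 CXR the backbone feature map is 7×7, so each prototype's distance map can be upsampled to image resolution and overlaid on the radiograph; this is the kind of patch-level evidence a radiologist can review against medically meaningful features.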
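The evaluation protocol (five-fold cross-validation reporting accuracy, precision, recall, and AUC over the three classes) can likewise be sketched with scikit-learn. In the sketch below, `train_model` and `predict_proba` are hypothetical placeholders for the D-ProtoPNet training and inference routines, which the abstract does not specify.

```python
# Hedged sketch of the evaluation protocol: stratified five-fold cross-validation
# with accuracy, macro precision/recall, and one-vs-rest AUC for three classes
# (normal, viral pneumonia, bacterial pneumonia). Placeholders, not the authors' code.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score

def cross_validate(images, labels, train_model, predict_proba, n_splits=5, seed=0):
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_scores = []
    for train_idx, val_idx in skf.split(images, labels):
        model = train_model(images[train_idx], labels[train_idx])   # hypothetical trainer
        probs = predict_proba(model, images[val_idx])               # (N_val, 3) class probabilities
        preds = probs.argmax(axis=1)
        fold_scores.append({
            "accuracy":  accuracy_score(labels[val_idx], preds),
            "precision": precision_score(labels[val_idx], preds, average="macro"),
            "recall":    recall_score(labels[val_idx], preds, average="macro"),
            "auc":       roc_auc_score(labels[val_idx], probs, multi_class="ovr"),
        })
    # average each metric across the five folds
    return {k: float(np.mean([s[k] for s in fold_scores])) for k in fold_scores[0]}
```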