
Assessment of a Grad-CAM interpretable deep learning model for HAPE diagnosis: performance and pitfalls in severity stratification from chest radiographs.

October 30, 2025

Authors

Yang Y, Yu H, Xiang Q, Wu J, Li J, Du F, Yang Y, Wang P

Affiliations (4)

  • Department of Radiology, Chinese People's Liberation Army The General Hospital of Western Theater Command, No. 270, Tianhui Road, Rongdu Avenue, Jinniu District, Chengdu, Sichuan Province, 610083, China.
  • Department of Radiology, The 950th Army Hospital of the Chinese People's Liberation Army, Yecheng County, Kashgar City, Xinjiang Province, 844900, China.
  • Department of High-Altitude Medicine, The 950th Army Hospital of the Chinese People's Liberation Army, Yecheng County, Kashgar City, Xinjiang Province, 844900, China.
  • Department of Radiology, Chinese People's Liberation Army The General Hospital of Western Theater Command, No. 270, Tianhui Road, Rongdu Avenue, Jinniu District, Chengdu, Sichuan Province, 610083, China. [email protected].

Abstract

To investigate the feasibility of a deep learning model, developed with a transfer learning approach, for recognizing high-altitude pulmonary edema (HAPE) on chest X-ray images and to explore its capability for assessing severity.

This retrospective study utilized a multi-source dataset. The pretraining set was derived from the ARXIV_V5_CHESTXRAY database (3,923 images, including 2,303 with edema labels). The primary HAPE-specific training set comprised radiographs from the 950th Hospital of the Chinese People's Liberation Army (1,003 HAPE cases and 702 normal controls; 2007-2023). An external validation set was constructed from recent cases (January-December 2023) from two hospitals (679 HAPE cases and 436 normal controls), with strict patient separation. We implemented a multi-stage pipeline: (1) a DeepLabV3_ResNet-50 model was trained for lung segmentation on a subset of the pretraining set; (2) MobileNet_V2 and VGG19 architectures were pretrained for general pulmonary edema severity grading on the ARXIV_V5_CHESTXRAY dataset; (3) these models were then fine-tuned on the HAPE-specific training set.

The segmentation model achieved a Dice coefficient of 99.03%. The binary classification model (VGG19) for edema detection achieved a validation AUC of 0.950. The multi-class models (MobileNet_V2) achieved macro-average AUCs of 0.92 (3-class) and 0.89 (4-class). The model demonstrated high performance in distinguishing normal (class 0) and severe edema (class 3) (sensitivities: 0.91 and 0.88, respectively). However, performance was critically low for the intermediate grades (classes 1 and 2; sensitivities: 0.16 and 0.37).

Transfer learning from general to HAPE-specific edema data produced a model that accurately segments the lungs and differentiates severe HAPE from normal cases with high performance. However, its failure to reliably identify intermediate grades underscores the challenges of domain shift and fine-grained radiographic assessment. This work highlights both the promise and the pitfalls of using heterogeneous datasets for rare disease diagnosis.
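The abstract describes a two-stage transfer-learning pipeline: a DeepLabV3_ResNet-50 model for lung-field segmentation, followed by MobileNet_V2/VGG19 classifiers pretrained on general edema severity labels and fine-tuned on HAPE radiographs. The sketch below, which is not the authors' code, illustrates how such a setup is commonly assembled in PyTorch/torchvision; the 4-grade severity head, the choice of frozen layers, and all hyperparameters are illustrative assumptions rather than details reported in the paper.

```python
# Minimal sketch of a segmentation + fine-tuning pipeline like the one described
# in the abstract. Architecture names match torchvision models; everything else
# (head sizes, frozen blocks, learning rate) is an assumption for illustration.
import torch
import torch.nn as nn
from torchvision import models

# Stage 1: lung segmentation with DeepLabV3 on a ResNet-50 backbone,
# reconfigured for two classes (background vs. lung field).
seg_model = models.segmentation.deeplabv3_resnet50(weights="DEFAULT")
seg_model.classifier[4] = nn.Conv2d(256, 2, kernel_size=1)

# Stage 2: MobileNet_V2 classifier, first pretrained on general pulmonary
# edema severity grades, then fine-tuned on HAPE-specific radiographs.
clf = models.mobilenet_v2(weights="IMAGENET1K_V1")
clf.classifier[1] = nn.Linear(clf.last_channel, 4)  # 4 severity grades (assumed)

# Fine-tuning heuristic (assumption): freeze early feature blocks so only the
# later blocks and the classification head adapt to the HAPE domain.
for param in clf.features[:14].parameters():
    param.requires_grad = False

optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, clf.parameters()), lr=1e-4
)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One gradient step on a batch of lung-cropped radiographs."""
    clf.train()
    optimizer.zero_grad()
    loss = criterion(clf(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this kind of setup, the segmentation output is typically used to crop or mask the radiograph to the lung fields before it is passed to the severity classifier, which is consistent with the abstract's ordering of the pipeline stages.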

Topics

Deep Learning; Radiography, Thoracic; Radiographic Image Interpretation, Computer-Assisted; Pulmonary Edema; Journal Article
