An uncertainty-aware vision transformer-BiLSTM Bayesian framework for reliable clinical decision support using chest X-rays.

June 16, 2026 (PubMed)

Authors

Al-Duais FS, Almenwer S, Alanazi A, Elebeed RAE

Affiliations (4)

  • Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam Bin Abdulaziz University, Al-Kharj, 11942, Saudi Arabia.
  • Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka, Saudi Arabia.
  • Department of Information System, College of Computer and Information Sciences, Jouf University, Sakaka, Saudi Arabia.
  • Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj, 11942, Saudi Arabia.

Abstract

Despite recent progress in deep learning for medical image analysis, issues of reliability, interpretability, and uncertainty estimation must still be addressed before deep learning can be safely adopted in clinical practice. Conventional CNN-based models achieve excellent accuracy but are typically overconfident in their predictions, which limits trust in high-stakes diagnostic settings. To target these deficiencies, we propose a competence-based, uncertainty-aware framework that combines a Vision Transformer (ViT), BiLSTM modeling, and a Bayesian fusion layer for reliable disease detection from chest X-rays. The proposed Vision Transformer-BiLSTM Bayesian Fusion (ViT-BiLSTM-BF) architecture is trained and evaluated on two large-scale datasets, MIMIC-CXR-JPG and PadChest-GR, with an in-depth study of Pneumonia, Pulmonary Fibrosis, and Pleural Effusion. Preprocessing comprised normalization, U-Net (convolutional neural network)-based lung segmentation, and extensive data augmentation. The ViT encoder extracts global spatial features, while the BiLSTM maintains contextual (temporal) information across the patch embeddings. A Bayesian fusion module integrates predictive distributions to quantify a priori and a posteriori aleatoric and epistemic uncertainty. Performance was assessed using accuracy, AUC-ROC, F1 score, Brier score, expected calibration error (ECE), and statistical analysis. The proposed model outperformed the other models, achieving an accuracy of 95.5% and an AUC-ROC of 0.986 with a very small calibration error (ECE = 0.028). The Bayesian fusion layer produced well-calibrated confidence measures, which are needed to distinguish reliably between high-confidence and high-risk predictions. In all comparative analyses, the framework improved on the CNN, DenseNet, Bayesian CNN, and standalone ViT baselines. These results demonstrate that combining transformer-based spatial modelling, temporal feature understanding, and probabilistic fusion yields a highly reliable and clinically meaningful diagnostic system. The framework offers a practical answer to the critical issues of uncertainty estimation and model reliability, and may be feasible to deploy in the development of clinical decision support systems.
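The abstract reports calibration through the Brier score and ECE but does not say how either was computed. The sketch below shows one common way to compute both from predicted class probabilities, and is not taken from the paper. The function names, the 10 equal-width confidence bins, the multiclass (sum over classes) form of the Brier score, and the toy inputs are all assumptions made for illustration.

```python
import numpy as np


def brier_score(probs, labels):
    """Multiclass Brier score: mean squared distance between the predicted
    probability vector and the one-hot true label (assumed convention)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    one_hot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - one_hot) ** 2, axis=1)))


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width confidence bins (bin count is an assumption).

    For each bin, take |accuracy - mean confidence| and weight it by the
    fraction of samples whose top-class confidence falls in that bin.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    confidence = probs.max(axis=1)        # top-class probability
    predicted = probs.argmax(axis=1)      # predicted class index
    correct = (predicted == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)


if __name__ == "__main__":
    # Made-up two-class probabilities, for illustration only.
    toy_probs = [[0.95, 0.05], [0.15, 0.85], [0.65, 0.35], [0.25, 0.75]]
    toy_labels = [0, 1, 1, 1]
    print("Brier:", brier_score(toy_probs, toy_labels))
    print("ECE:  ", expected_calibration_error(toy_probs, toy_labels))
```

Under this definition an ECE of 0.028 would mean that, averaged over bins, stated confidence and observed accuracy differ by about 2.8 percentage points. The value depends on the binning scheme, so it is only comparable across models evaluated with the same bins.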

Topics

Journal Article
