A hybrid swin transformer-BiLSTM framework and ensemble learning for multimodal brain stroke detection and risk prediction.

February 3, 2026 (PubMed)

Authors

Ahmed MM, Hossain MM, Rakib MRH, Hashan R, Nirob MTH, Islam MK

Affiliations (6)

  • Department of Biomedical Engineering, Islamic University, Kushtia, 7003, Bangladesh; Bio-Imaging Research Laboratory, Islamic University, Kushtia, 7003, Bangladesh. Electronic address: [email protected].
  • Department of Biomedical Engineering, Islamic University, Kushtia, 7003, Bangladesh; Bio-Imaging Research Laboratory, Islamic University, Kushtia, 7003, Bangladesh. Electronic address: [email protected].
  • Department of Biomedical Engineering, Islamic University, Kushtia, 7003, Bangladesh; Bio-Imaging Research Laboratory, Islamic University, Kushtia, 7003, Bangladesh. Electronic address: [email protected].
  • Department of Electrical and Electronic Engineering, Islamic University, Kushtia, 7003, Bangladesh. Electronic address: [email protected].
  • Department of Electrical and Electronic Engineering, Islamic University, Kushtia, 7003, Bangladesh. Electronic address: [email protected].
  • Department of Biomedical Engineering, Islamic University, Kushtia, 7003, Bangladesh; Bio-Imaging Research Laboratory, Islamic University, Kushtia, 7003, Bangladesh. Electronic address: [email protected].

Abstract

Stroke is one of the leading causes of mortality and long-term disability worldwide, primarily resulting from the sudden disruption of cerebral blood flow. Early and accurate diagnosis plays a crucial role in minimizing neurological damage and improving recovery outcomes. This study proposes a multimodal framework that integrates a hybrid Swin Transformer-Bidirectional Long Short-Term Memory (SwinT-BiLSTM) model with an ensemble learning-based classifier for automated stroke detection and risk prediction from medical images and tabular clinical data. The study uses two brain stroke Computed Tomography (CT) datasets: a primary dataset named BrSCTHD-2025, collected from hospitals in Dhaka and Faridpur, Bangladesh, and a secondary Kaggle CT dataset. In addition, a primary clinical tabular dataset was collected from Kushtia Medical College Hospital for multimodal analysis. The proposed SwinT-BiLSTM model extracts global spatial and sequential dependencies from CT images, while the ensemble classifier predicts stroke risk from clinical and lifestyle parameters. Experimental results show that the model achieves 98% accuracy with an AUC of 1.00 on the BrSCTHD-2025 dataset and 97% accuracy with an AUC of 0.99 on the secondary Kaggle dataset, outperforming standalone SwinT by 2.5% and Convolutional Neural Network (CNN) architectures such as VGG16 and ResNet50 by 3%-4%. The ensemble classifier trained on tabular data achieved 80.36% accuracy and identified critical stroke risk factors such as heart disease, prolonged sitting duration, and cholesterol level. Furthermore, Explainable Artificial Intelligence (XAI) techniques such as LIME, SHAP, enhanced Grad-CAM, and attention maps improve interpretability by identifying the most influential visual and clinical features.
Overall, the proposed SwinT-BiLSTM-Ensemble framework establishes a robust foundation for accurate, interpretable, and clinically reliable stroke diagnosis and personalized risk assessment in real-world healthcare environments.
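
The abstract reports accuracy and AUC side by side (for example, 98% accuracy with an AUC of 1.00). The two metrics measure different things: accuracy depends on a decision threshold, while AUC measures only how well the scores rank stroke cases above non-stroke cases. The following is a minimal sketch of both metrics in plain Python. The labels, scores, the 0.5 threshold, and the function names are hypothetical illustrations and are not taken from the study.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the true label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative
    case, counting ties as half (the pairwise definition of ROC AUC)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores (1 = stroke, 0 = no stroke); not data from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.45, 0.4, 0.2, 0.1]

print(accuracy(labels, scores))  # one positive falls below the 0.5 threshold
print(roc_auc(labels, scores))   # every positive still outranks every negative
```

In this toy example every positive case is ranked above every negative case, so the AUC is perfect even though one case is misclassified at the chosen threshold. This is one way an AUC that rounds to 1.00 can coexist with accuracy below 100%.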

Topics

Journal Article
