
Gastroenterological disease detection using transformer-based medical imaging for sustainable healthcare.

March 30, 2026

Authors

Kehkashan T, Abdelhaq M, Al-Shamayleh AS, Abdullah M, Riaz RA, Sakinah Syed Ahmad S, Ibrahim Abdalla Ahmed A, Akhunzada A

Affiliations (8)

  • Faculty of Computing, Universiti Teknologi Malaysia, 81310, Johor Bahru, Malaysia.
  • Faculty of Information Technology, University of Lahore, Sargodha, 40100, Pakistan.
  • Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, 11671, Riyadh, Saudi Arabia.
  • Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Al-Ahliyya Amman University, Amman, 19328, Jordan.
  • Faculty of Information Technology, University of Lahore, Sargodha, 40100, Pakistan. [email protected].
  • Faculty of Artificial Intelligence and Cyber Security, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Malaysia.
  • Computer Science Department, Faculty of Computer Science and Information Technology, Omdurman Islamic University, Omdurman, Sudan. [email protected].
  • Department of Data and Cybersecurity, University of Doha for Science and Technology, Doha, 24449, Qatar.

Abstract

Early detection of gastroenterological diseases significantly improves patient outcomes and reduces the burden of late-stage diagnosis. Traditional CNN models, however, show limitations in capturing complex patterns within medical imaging datasets, which has prompted investigation into transformer architectures such as the Vision Transformer (ViT). Despite its promising capabilities, the application of ViT to detecting gastroenterological diseases from medical imaging has not been fully explored. This paper evaluates the effectiveness of the ViT-B16 architecture for identifying gastrointestinal abnormalities using a combined dataset drawn from the Curated Colon Dataset and the HyperKvasir Dataset (10,000 images across four classes), and compares it with established methodologies. In our experiments, ViT-B16 outperformed the alternative approaches: it achieved 99.5% classification accuracy, compared with 99.1% for EfficientNetB5 and 97.1% for EfficientNetB2, with supporting metrics of precision (99.4%), recall (99.4%), and F1-score (99.4%). AUC values ranged from 0.99 to 1.00 across all classes, reflecting very strong discriminatory power on the disease classification task. These results suggest that ViT-B16 has great potential for medical diagnosis applications, especially classification tasks in healthcare, where evidence-based decision-making and model interpretability are key considerations. The model also supports sustainable healthcare through computational efficiency and reduced diagnostic burden. However, several challenges remain unaddressed, including ethical concerns about diagnostics, diagnostic accuracy for underrepresented disease classes, and validation of the model across diverse clinical settings. These are essential directions for future research on gastroenterological disease detection techniques.
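
For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be derived from a four-class confusion matrix. It is not the authors' evaluation code. The function name `macro_metrics`, the macro-averaging choice, and the toy matrix are all assumptions made for illustration; the abstract does not say how the per-class scores were averaged, and the numbers below are not the paper's results.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of images whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical confusion matrix for four classes (illustrative values only,
# NOT taken from the paper).
toy = [
    [48, 2, 0, 0],
    [1, 49, 0, 0],
    [0, 0, 50, 0],
    [0, 1, 0, 49],
]
metrics = macro_metrics(toy)
print(metrics)
```

Macro averaging weights every class equally, which is one reasonable choice when some disease classes are underrepresented; a weighted or micro average would give different figures on imbalanced data.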

Topics

Journal Article
