LightGastroFormer: a lightweight multi-resolution transformer for gastrointestinal disease classification.

June 9, 2026 (PubMed)

Authors

Singh P, Singh S, Shukla MK

Affiliations (3)

  • iHub-Data, International Institute of Information Technology Hyderabad, Professor C. R. Rao Road, Gachibowli, Hyderabad, Telangana, 500032, India.
  • Department of Interdisciplinary Courses in Engineering (DICE), Chitkara University Institute of Engineering and Technology, Chandigarh-Patiala National Highway, Punjab, 140401, India.
  • Symbiosis Institute of Technology, Pune Campus, Symbiosis International (Deemed University), Pune, 412115, India.

Abstract

Automated analysis of gastrointestinal (GI) images is essential for promoting early diagnosis and reducing clinicians' workload during conventional and capsule endoscopy examinations. However, current deep learning techniques frequently fail to capture long-range contextual information and fine-grained local patterns simultaneously, especially on large-scale, highly imbalanced datasets. To address these problems, we propose LightGastroFormer, a lightweight transformer-based architecture designed for reliable and efficient GI disease classification. The proposed model combines a multi-resolution patchwise tokenizer, efficient self-attention, and a gated feed-forward network to capture subtle lesion characteristics while preserving computational efficiency. LightGastroFormer is evaluated on three public benchmarks: Kvasir v1, Kvasir v2, and the large-scale Kvasir-Capsule dataset. Across all evaluated datasets, the proposed approach performs consistently well, matching or surpassing existing state-of-the-art CNN- and transformer-based methods. In particular, without explicit data balancing strategies, LightGastroFormer achieves an accuracy of 0.94 on Kvasir v1, 0.95 on Kvasir v2, and 0.97 (with an F1-score of 0.97) on the highly imbalanced Kvasir-Capsule dataset. Ablation studies confirm the effectiveness of each architectural component and offer insight into model behavior under class imbalance. With only 6.42 million trainable parameters, LightGastroFormer offers a favorable balance between accuracy, robustness, and efficiency, making it well suited for real-world clinical deployment in gastrointestinal disease screening and diagnosis. The code is available at https://github.com/Prateeksingh-moa/LightGastroformer.
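The abstract reports both accuracy and F1-score on the imbalanced Kvasir-Capsule dataset. The sketch below is not taken from the authors' repository. It is a minimal, standard-library illustration of why the two metrics are worth reporting together under class imbalance. The abstract does not state which F1 averaging was used, so both macro and support-weighted variants are shown. The labels `normal` and `polyp`, the 95/5 split, and all function names are hypothetical choices made for this toy example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """One-vs-rest F1 for every label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's number of true samples."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy imbalanced data (hypothetical): a degenerate classifier that always
# predicts the majority class.
y_true = ["normal"] * 95 + ["polyp"] * 5
y_pred = ["normal"] * 100

print(f"accuracy    = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1    = {macro_f1(y_true, y_pred):.3f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
```

In this toy case the majority-only classifier still scores 0.95 accuracy, while its macro F1 falls below 0.5 because the minority class is never detected. A high F1 reported alongside high accuracy therefore gives more evidence of minority-class performance than accuracy alone, although how much depends on the averaging scheme.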

Topics

Journal Article
