Revolutionizing medical imaging: A cutting-edge AI framework with vision transformers and perceiver IO for multi-disease diagnosis.

Authors

Khaliq A,Ahmad F,Rehman HU,Alanazi SA,Haleem H,Junaid K,Andrikopoulou E

Affiliations (5)

  • Center of Data Science, Government College University Faisalabad, Kotwali Road, Faisalabad, Punjab 37300, Pakistan.
  • School of Computing, Faculty of Technology, University of Portsmouth, Winston Churchill Ave, Southsea, Portsmouth PO1 3HE, United Kingdom. Electronic address: [email protected].
  • Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakaka, Aljouf 72341, Saudi Arabia.
  • School of Biological and Behavioural Sciences, Queen Mary University of London, London E1 4NS, United Kingdom.
  • School of Computing, Faculty of Technology, University of Portsmouth, Winston Churchill Ave, Southsea, Portsmouth PO1 3HE, United Kingdom; Portsmouth Artificial Intelligence and Data Science Centre (PAIDS), University of Portsmouth, Portsmouth PO1 3HE, United Kingdom.

Abstract

The integration of artificial intelligence in medical image classification has significantly advanced disease detection. However, traditional deep learning models face persistent challenges, including poor generalizability, high false-positive rates, and difficulties in distinguishing overlapping anatomical features, limiting their clinical utility. To address these limitations, this study proposes a hybrid framework combining Vision Transformers (ViT) and Perceiver IO, designed to enhance multi-disease classification accuracy. Vision Transformers leverage self-attention mechanisms to capture global dependencies in medical images, while Perceiver IO optimizes feature extraction for computational efficiency and precision. The framework is evaluated across three critical clinical domains: neurological disorders, including Stroke (tested on the Brain Stroke Prediction CT Scan Image Dataset) and Alzheimer's (analyzed via the Best Alzheimer MRI Dataset); skin diseases, covering Tinea (trained on the Skin Diseases Dataset) and Melanoma (augmented with dermoscopic images from the HAM10000/HAM10k dataset); and lung diseases, focusing on Lung Cancer (using the Lung Cancer Image Dataset) and Pneumonia (evaluated with the Pneumonia Dataset containing bacterial, viral, and normal X-ray cases). For neurological disorders, the model achieved 0.99 accuracy, 0.99 precision, 1.00 recall, 0.99 F1-score, demonstrating robust detection of structural brain abnormalities. In skin disease classification, it attained 0.95 accuracy, 0.93 precision, 0.97 recall, 0.95 F1-score, highlighting its ability to differentiate fine-grained textural patterns in lesions. For lung diseases, the framework achieved 0.98 accuracy, 0.97 precision, 1.00 recall, 0.98 F1-score, confirming its efficacy in identifying respiratory conditions. To bridge research and clinical practice, an AI-powered chatbot was developed for real-time analysis, enabling users to upload MRI, X-ray, or skin images for automated diagnosis with confidence scores and interpretable insights. This work represents the first application of ViT and Perceiver IO for these disease categories, outperforming conventional architectures in accuracy, computational efficiency, and clinical interpretability. The framework holds significant potential for early disease detection in healthcare settings, reducing diagnostic errors, and improving treatment outcomes for clinicians, radiologists, and patients. By addressing critical limitations of traditional models, such as overlapping feature confusion and false positives, this research advances the deployment of reliable AI tools in neurology, dermatology, and pulmonology.

Topics

Journal Article

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.