Revolutionizing medical imaging: A cutting-edge AI framework with vision transformers and perceiver IO for multi-disease diagnosis.
Khaliq A, Ahmad F, Rehman HU, Alanazi SA, Haleem H, Junaid K, Andrikopoulou E
Jul 4 2025

The integration of artificial intelligence in medical image classification has significantly advanced disease detection. However, traditional deep learning models face persistent challenges, including poor generalizability, high false-positive rates, and difficulties in distinguishing overlapping anatomical features, limiting their clinical utility. To address these limitations, this study proposes a hybrid framework combining Vision Transformers (ViT) and Perceiver IO, designed to enhance multi-disease classification accuracy. Vision Transformers leverage self-attention mechanisms to capture global dependencies in medical images, while Perceiver IO optimizes feature extraction for computational efficiency and precision. The framework is evaluated across three critical clinical domains: neurological disorders, including Stroke (tested on the Brain Stroke Prediction CT Scan Image Dataset) and Alzheimer's (analyzed via the Best Alzheimer MRI Dataset); skin diseases, covering Tinea (trained on the Skin Diseases Dataset) and Melanoma (augmented with dermoscopic images from the HAM10000/HAM10k dataset); and lung diseases, focusing on Lung Cancer (using the Lung Cancer Image Dataset) and Pneumonia (evaluated with the Pneumonia Dataset containing bacterial, viral, and normal X-ray cases). For neurological disorders, the model achieved 0.99 accuracy, 0.99 precision, 1.00 recall, 0.99 F1-score, demonstrating robust detection of structural brain abnormalities. In skin disease classification, it attained 0.95 accuracy, 0.93 precision, 0.97 recall, 0.95 F1-score, highlighting its ability to differentiate fine-grained textural patterns in lesions. For lung diseases, the framework achieved 0.98 accuracy, 0.97 precision, 1.00 recall, 0.98 F1-score, confirming its efficacy in identifying respiratory conditions.
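Because F1 is the harmonic mean of precision and recall, the reported F1-scores can be cross-checked against the reported precision and recall values. The sketch below is not taken from the paper: the function name `f1_from` and the dictionary layout are illustrative assumptions, and only the numeric values come from the abstract.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the abstract.
reported = {
    "neurological": (0.99, 1.00, 0.99),
    "skin": (0.93, 0.97, 0.95),
    "lung": (0.97, 1.00, 0.98),
}

# Recompute F1 to two decimals, matching the precision of the reported figures.
recomputed = {name: round(f1_from(p, r), 2) for name, (p, r, _) in reported.items()}

# True where the recomputed value agrees with the reported one.
consistent = {name: recomputed[name] == reported[name][2] for name in reported}

for name in reported:
    print(name, recomputed[name], consistent[name])
```

For all three domains the recomputed value matches the reported F1-score at two decimal places, so the stated metrics are internally consistent.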
To bridge research and clinical practice, an AI-powered chatbot was developed for real-time analysis, enabling users to upload MRI, X-ray, or skin images for automated diagnosis with confidence scores and interpretable insights. This work represents the first application of ViT and Perceiver IO for these disease categories, outperforming conventional architectures in accuracy, computational efficiency, and clinical interpretability. The framework holds significant potential for early disease detection in healthcare settings, reducing diagnostic errors, and improving treatment outcomes for clinicians, radiologists, and patients. By addressing critical limitations of traditional models, such as overlapping feature confusion and false positives, this research advances the deployment of reliable AI tools in neurology, dermatology, and pulmonology.
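The abstract does not say how the chatbot's confidence scores are computed. One common approach, assumed here purely for illustration, is to apply a softmax to the classifier's output logits and report the top probability. The function names, label set, and logit values below are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def diagnose(logits, labels):
    """Return the top label and its softmax probability as a confidence score."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical label set, loosely modeled on the pneumonia classes named above.
labels = ["normal", "bacterial pneumonia", "viral pneumonia"]
# Hypothetical logits standing in for a model's output on one X-ray.
logits = [0.2, 2.9, 1.1]

label, confidence = diagnose(logits, labels)
print(f"{label}: {confidence:.2f}")
```

A softmax probability is not necessarily a calibrated estimate of correctness, so a score produced this way would need separate validation before being shown to clinicians.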