Respiratory Disease Detection: A Systematic Review of AI-Based Approaches, from Audio and Visual Unimodal Methods to Multimodal Integration.
Authors
Affiliations (2)
- Informatics and Computer Systems Department, King Khalid University, Abha 61421, Saudi Arabia.
- Department of Chemical Engineering, King Khalid University, Abha 61421, Saudi Arabia.
Abstract
<b>Background:</b> Respiratory diseases (RDs), including asthma, COVID-19, chronic obstructive pulmonary disease (COPD), and pneumonia, remain a major global health challenge and contribute substantially to morbidity and mortality worldwide. Conventional diagnosis relies heavily on clinicians' expertise in interpreting respiratory sounds and radiographic images, a process that can be subjective, time-consuming, and prone to inter-observer variability. Recent advances in artificial intelligence (AI) and machine learning (ML) have enabled automated diagnostic approaches that can improve the efficiency, consistency, and scalability of respiratory disease detection. However, existing research remains fragmented across data modalities.
<b>Methods:</b> This review systematically analyzes recent studies on AI-based respiratory disease detection using both visual modalities (e.g., chest X-rays, computed tomography (CT) scans, and ultrasound) and audio modalities (e.g., cough and breath sounds). To provide a comprehensive perspective, the reviewed literature is organized under a unified taxonomy that groups existing approaches into three categories: audio-based, visual-based, and audio-visual-based methods. In addition, two conceptual frameworks are proposed to illustrate representative pipelines for audio-based and visual-based respiratory disease classification.
<b>Results:</b> The analysis reveals that most existing studies focus on single-modality approaches, while multimodal integration remains relatively underexplored. Only a limited number of studies combine audio and visual data within unified frameworks, primarily because synchronized multimodal datasets collected from the same patients are scarce. The proposed taxonomy and conceptual frameworks provide a structured basis for comparing existing methods, identifying methodological trends, and highlighting key research gaps in multimodal respiratory disease detection.
<b>Conclusions:</b> Future research should prioritize the development of multimodal datasets, robust evaluation protocols, and interpretable and lightweight AI models suitable for real-world clinical deployment. Advancing multimodal integration has the potential to significantly enhance the accuracy, reliability, and clinical applicability of AI-driven respiratory disease diagnosis systems.
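As an illustration of the three-way taxonomy described in the Methods, the following minimal Python sketch assigns a study to a category based on the data modalities it uses. The modality labels are taken from the examples named in the abstract; the names `Category`, `AUDIO_MODALITIES`, `VISUAL_MODALITIES`, and `categorize` are hypothetical and are not part of the review itself.

```python
from enum import Enum


class Category(Enum):
    """The three groups of the review's taxonomy."""
    AUDIO = "audio-based"
    VISUAL = "visual-based"
    AUDIO_VISUAL = "audio-visual-based"


# Assumption: illustrative modality labels based on the abstract's examples.
AUDIO_MODALITIES = {"cough", "breath"}
VISUAL_MODALITIES = {"chest_xray", "ct", "ultrasound"}


def categorize(modalities):
    """Map the set of modalities used by a study to a taxonomy category."""
    mods = set(modalities)
    unknown = mods - AUDIO_MODALITIES - VISUAL_MODALITIES
    if not mods or unknown:
        raise ValueError(f"empty or unrecognized modalities: {sorted(unknown)}")

    has_audio = bool(mods & AUDIO_MODALITIES)
    has_visual = bool(mods & VISUAL_MODALITIES)
    if has_audio and has_visual:
        return Category.AUDIO_VISUAL
    return Category.AUDIO if has_audio else Category.VISUAL
```

Under these assumptions, a study using only cough recordings falls in the audio-based group, while one pairing cough recordings with chest X-rays from the same patients falls in the audio-visual-based group that the review finds underexplored.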