Airway segmentation on CT - A systematic review of machine learning tools.

May 30, 2026

Authors

Lsloum N, Maiter A, Alnasser T, Albylwi A, Alghamdi K, Sharkey M, Hokmabadi A, Salehi M, Dwivedi K, Johns C, Swift AJ, Alabed S

Affiliations (6)

  • School of Medicine and Population Health, The University of Sheffield, Sheffield, United Kingdom.
  • Department of Diagnostic Radiology, College of Applied Medical Sciences, Najran University, Najran, Saudi Arabia.
  • Department of Clinical Radiology, Sheffield Teaching Hospitals, Sheffield, United Kingdom.
  • National Institute for Health and Care Research, Sheffield Biomedical Research Centre, Sheffield, United Kingdom.
  • Radiological Sciences Department, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia.
  • Insigneo Institute, Faculty of Engineering, The University of Sheffield, Sheffield, United Kingdom.

Abstract

Airway assessment on computed tomography (CT) can yield clinically useful information for diagnosis, treatment planning and monitoring in respiratory diseases. Manual airway segmentation is time-consuming, prone to error and poorly reproducible. This systematic review aimed to appraise machine learning (ML) methods for fully automated airway segmentation in chest CT imaging. EMBASE, MEDLINE and CENTRAL were searched on October 28, 2025, for studies that used fully automated ML methods for airway segmentation on CT and reported quantitative performance metrics. The quality of included studies was assessed with the Must AI Criteria-10 (MAIC-10) checklist. The review was registered on PROSPERO (CRD42025635504). Thirty-two studies published between 2010 and 2025 were included, 28 of which used deep learning (DL). Airway segmentation was performed on non-contrast CT scans in most studies. Voxel-wise accuracy metrics were generally high, with Dice similarity coefficient (DSC) values ranging from 83% to 96%. The airway-specific topological metrics, branch detection rate (BD) and tree length detection rate (TD), showed broader variability (60-95% and 54-95%, respectively), with DL methods consistently outperforming classical ML approaches. Fifteen studies conducted external validation, 9 of them on the EXACT'09 test set. MAIC-10 scores were moderate, ranging from 6 to 8 out of 10, with the lowest reporting rates for safety/privacy (31%), explainability (31%) and transparency (53%). ML models achieved strong airway segmentation accuracy but showed considerable variation in topological completeness. Standardised evaluation frameworks and the adoption of more diverse datasets are needed to strengthen model generalisability and support translation into clinical practice.
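
To make the distinction between voxel-wise and topological metrics concrete, the following is a minimal sketch of how DSC, BD and TD might be computed from binary masks. It is not taken from any reviewed study. The function names, the `min_voxels` threshold and the use of centerline voxel counts as a stand-in for physical branch length are all assumptions made for illustration; published benchmarks define detection and length in their own ways.

```python
import numpy as np


def dice(pred, ref):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / denom)


def branch_and_length_rates(pred, ref_centerline_labels, min_voxels=1):
    """Return (BD, TD) for a predicted mask against a labelled reference centerline.

    ref_centerline_labels is a hypothetical input: an integer array where 0 is
    background and k > 0 marks the centerline voxels of reference branch k.
    Assumption: a branch counts as detected if at least `min_voxels` of its
    centerline voxels fall inside the prediction, and length is approximated
    by the number of centerline voxels.
    """
    pred = np.asarray(pred, dtype=bool)
    labels = np.asarray(ref_centerline_labels)
    branch_ids = np.unique(labels[labels > 0])
    if branch_ids.size == 0:
        raise ValueError("reference centerline contains no labelled branches")

    detected = 0   # number of reference branches found
    covered = 0    # centerline voxels inside the prediction
    total = 0      # all reference centerline voxels
    for b in branch_ids:
        branch = labels == b
        hit = int(np.logical_and(branch, pred).sum())
        total += int(branch.sum())
        covered += hit
        if hit >= min_voxels:
            detected += 1

    return detected / branch_ids.size, covered / total
```

In a toy case where a prediction covers one of two reference branches completely and misses the other, both BD and TD are 0.5 under these assumptions. A segmentation can therefore score well on DSC, which is dominated by the large proximal airways, while still missing many small distal branches, and this is consistent with the wider BD and TD ranges reported above.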

Topics

Journal Article
