Assessing the Performance and Reliability of Deep Learning Auto-Segmentation in Videofluoroscopic Swallowing Studies: A Systematic Review and Meta-Analysis.

March 27, 2026


DOI: 10.1016/j.apmr.2026.03.016 PMID: 41905601

Authors

Chuang WK, Lin BF, Lee YH, Su PH, Kao YS, Lu CF

Affiliations (6)

Department of Radiation Oncology, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235, Taiwan; Department of Biomedical Imaging and Radiological Sciences, National Yang Ming Chiao Tung University, Taipei 112, Taiwan; Department of Radiation Oncology, Saint Paul's Hospital, Taoyuan 330, Taiwan.
Department of Biomedical Imaging and Radiological Science, China Medical University, Taichung 404, Taiwan.
Department of Physical Medicine and Rehabilitation, Taipei Medical University-Hsin Kuo Min Hospital, Taoyuan City 320, Taiwan; Department of Physical Medicine and Rehabilitation, School of Medicine, College of Medicine, Taipei Medical University 110, Taipei City, Taiwan; Graduate Institute of Sports Science, College of Exercise and Health Sciences, National Taiwan Sport University, Taoyuan City, Taiwan.
Department of Biomedical Imaging and Radiological Sciences, National Yang Ming Chiao Tung University, Taipei 112, Taiwan.
Department of Radiation Oncology, Taoyuan General Hospital, Ministry of Health and Welfare, Taoyuan 330, Taiwan. Electronic address: [email protected].
Department of Biomedical Imaging and Radiological Sciences, National Yang Ming Chiao Tung University, Taipei 112, Taiwan. Electronic address: [email protected].

Abstract

This review aimed to systematically evaluate the accuracy and reliability of deep learning-based auto-segmentation methods in videofluoroscopic swallowing studies (VFSS) through meta-analysis. A comprehensive literature search was conducted across the PubMed, IEEE Xplore, Embase, Web of Science, and Cochrane Library databases for studies published in English between 2013 and 2024. Studies were included if they applied deep learning techniques to the auto-segmentation of anatomical structures in VFSS, specifically the bolus, cervical spine, hyoid bone, or thyroid cartilage-vocal fold complex (TVC), and reported quantitative performance metrics such as the Dice similarity coefficient. Two independent reviewers extracted data on study characteristics, segmentation targets, deep learning model types, and performance metrics. Methodological quality was assessed using the CLAIM and QUADAS-2 tools. Ten studies met the inclusion criteria. A random-effects meta-analysis yielded an overall pooled Dice score of 0.83 (95% CI: 0.76-0.88, I² = 77%). Subgroup analyses showed similar performance for bolus segmentation (pooled Dice score = 0.84; 95% CI: 0.70-0.92, I² = 74%) and cervical spine segmentation (pooled Dice score = 0.83; 95% CI: 0.69-0.91, I² = 87%). Although pooled accuracy was high, heterogeneity between studies was substantial. Deep learning-based auto-segmentation in VFSS demonstrates promising accuracy across different anatomical targets. However, methodological variability among studies underscores the need for standardized protocols, multi-center datasets, and comparative evaluations of model architectures to enhance generalizability and clinical utility. PROSPERO registration: CRD42024578117.
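As an illustration of the quantities reported above, the following minimal Python sketch computes a Dice similarity coefficient for two binary masks and pools study-level Dice scores with a random-effects model, reporting I². It rests on assumptions the abstract does not confirm: the DerSimonian-Laird estimator and a logit transformation of the Dice scores are common choices, not necessarily the authors' method. The function names, the example Dice scores, and the variances are hypothetical and are not data from the included studies.

```python
import math

def dice(pred, truth):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 1.0 if total == 0 else 2.0 * intersection / total

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled_effect, standard_error, i2_percent).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, i2

# Hypothetical study-level Dice scores and logit-scale variances (illustrative only).
dice_scores = [0.70, 0.80, 0.85, 0.90]
logit_variances = [0.02, 0.03, 0.02, 0.04]

pooled, se, i2 = pool_random_effects([logit(d) for d in dice_scores], logit_variances)
ci_low = inv_logit(pooled - 1.96 * se)
ci_high = inv_logit(pooled + 1.96 * se)
print(f"pooled Dice = {inv_logit(pooled):.2f} "
      f"(95% CI: {ci_low:.2f}-{ci_high:.2f}), I2 = {i2:.0f}%")
```

Pooling on the logit scale and back-transforming keeps the confidence interval inside (0, 1) and makes it asymmetric around the point estimate, which matches the shape of the intervals reported in the abstract, although that alone does not establish which transformation the authors used.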


Topics

Journal Article, Review
