Automated measurement of hallux valgus angles using a deep learning model: validation and comparison with surgeons of varying expertise.
Authors
Affiliations (5)
- Foot and Ankle Surgery Department, Honghui Hospital, Xi'an Jiaotong University, No. 76 Nanguo Road, Xi'an, 710054, People's Republic of China.
- Diabetic Foot MDT Center, Honghui Hospital, Xi'an Jiaotong University, Xi'an, 710054, People's Republic of China.
- Xi'an Medical University, No. 1, Xinwang Road, Weiyang District, Xi'an, 710021, People's Republic of China.
- Foot and Ankle Surgery Department, Honghui Hospital, Xi'an Jiaotong University, No. 76 Nanguo Road, Xi'an, 710054, People's Republic of China.
- Diabetic Foot MDT Center, Honghui Hospital, Xi'an Jiaotong University, Xi'an, 710054, People's Republic of China.
Abstract
Hallux valgus (HV) assessment commonly relies on measuring the hallux valgus angle (HVA) and intermetatarsal angle (IMA) on weight-bearing radiographs. Manual measurement is time-consuming and subject to intra- and inter-observer variability. Deep learning (DL) methods may automate angle measurement with high accuracy, but prior studies often lacked control groups and comparisons across surgeon experience levels. We developed a DL model based on UNet++ with a MobileNetV2 backbone to segment bone axes, followed by a linear regression module to predict HVA and IMA. The dataset comprised 2,468 weight-bearing radiographs (1,920 HV; 548 control) from 2,124 participants. Performance metrics included mean absolute error (MAE), root mean squared error (RMSE), intraclass correlation coefficient (ICC), and outlier rates; model results were compared with measurements from two foot and ankle surgeons of differing experience. In the HV group, the model achieved ICCs of 0.902 for HVA and 0.887 for IMA; in the control group, ICCs were 0.821 (HVA) and 0.898 (IMA). For IMA, the model demonstrated lower MAE and fewer outliers than the surgeons (HV outliers 4.6% vs. 6.6-7.9%; control 4.8% vs. 6.0-9.5%). Average measurement time was 0.13 s per radiograph for the model versus 1.8-2.6 min manually. The proposed DL model automates HVA and IMA measurement with accuracy and reliability comparable to experienced surgeons while markedly improving efficiency. The model also showed consistent performance across HV and control radiographs in our cohort, supporting potential clinical application pending external validation.
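As an illustration of the quantities reported in the abstract, the following Python sketch shows one way to derive an angle from two bone axes and to summarize agreement with reference measurements using MAE, RMSE, and an outlier rate. It is not the authors' implementation: the function names, the two-point representation of each axis, and the 5-degree outlier threshold are all assumptions made for the example, since the abstract does not specify them.

```python
import math


def axis_angle_deg(axis_a, axis_b):
    """Angle in degrees (0 to 90) between two undirected axes.

    Each axis is given as two (x, y) points, e.g. the endpoints of a
    line fitted to a segmented bone axis (an assumed representation).
    """
    (ax1, ay1), (ax2, ay2) = axis_a
    (bx1, by1), (bx2, by2) = axis_b
    ux, uy = ax2 - ax1, ay2 - ay1
    vx, vy = bx2 - bx1, by2 - by1
    if (ux == 0 and uy == 0) or (vx == 0 and vy == 0):
        raise ValueError("each axis needs two distinct points")
    cross = ux * vy - uy * vx
    dot = ux * vx + uy * vy
    # Absolute values make the result independent of axis direction.
    return math.degrees(math.atan2(abs(cross), abs(dot)))


def error_summary(predicted, reference, outlier_threshold_deg=5.0):
    """MAE, RMSE, and outlier rate of predicted vs. reference angles.

    outlier_threshold_deg is a hypothetical cutoff chosen for this
    sketch; the abstract does not state how outliers were defined.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    n = len(errors)
    return {
        "mae": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "outlier_rate": sum(e > outlier_threshold_deg for e in errors) / n,
    }


if __name__ == "__main__":
    # Hypothetical axes: first metatarsal and proximal phalanx.
    metatarsal = ((0.0, 0.0), (0.0, 10.0))
    phalanx = ((0.0, 10.0), (5.0, 18.0))
    print(f"HVA (illustrative): {axis_angle_deg(metatarsal, phalanx):.1f} deg")
    print(error_summary([10.0, 20.0, 30.0], [12.0, 20.0, 24.0]))
```

In a comparison like the one described, `error_summary` would be applied separately to the model's and each surgeon's measurements against the same reference values, for HVA and IMA in each group.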