A machine learning-based predictive model for mandibular third molar extraction difficulty: incorporating multimodal features and SHAP analysis.
Authors
Affiliations (3)
- Shanghai Engineering Research Center of Tooth Restoration and Regeneration & Tongji Research Institute of Stomatology & Department of Oral and Maxillofacial Surgery, Shanghai Tongji Stomatological Hospital and Dental School, Tongji University, 399 Yanchang Middle Road, Jing'an District, Shanghai, Asia, 200092, China.
- School of Aerospace Engineering and Applied Mechanics, Tongji University, Shanghai, Asia, 200092, China.
- Shanghai Engineering Research Center of Tooth Restoration and Regeneration & Tongji Research Institute of Stomatology & Department of Oral and Maxillofacial Surgery, Shanghai Tongji Stomatological Hospital and Dental School, Tongji University, 399 Yanchang Middle Road, Jing'an District, Shanghai, Asia, 200092, China.
Abstract
This study aimed to establish a rapid and accurate predictive model for mandibular third molar (MM3) extraction difficulty based on machine learning and multimodal parameters. A dataset was constructed by integrating clinical characteristics with morphological features automatically extracted from cone-beam computed tomography (CBCT) images. Extraction difficulty was determined by three experienced experts using a ten-factor scoring system and clinical judgment. Six machine learning (ML) models were developed: support vector machine (SVM), artificial neural network (ANN), extreme gradient boosting (XGBoost), random forest (RF), k-nearest neighbors (KNN), and logistic regression. Model performance was optimized using grid search and five-fold cross-validation. SHapley Additive exPlanations (SHAP) were used to interpret feature importance, and recursive feature elimination (RFE) was employed for validation. The ML models predicted extraction difficulty efficiently, with XGBoost achieving the highest accuracy (88.24%), outperforming junior clinicians (83.53%). SHAP and RFE analyses highlighted the dominant role of morphological features, especially the angulation between adjacent teeth, contact area, and volume of the MM3. Clinical features such as fibrinogen and prothrombin time also contributed to prediction. The ML models demonstrated high accuracy and efficiency. Integrating morphological and clinical features significantly improves prediction performance. Adjacent tooth resistance was the most influential factor, followed by bone resistance and mandibular canal-related features.
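The tuning step named in the abstract (grid search combined with five-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the data are synthetic two-feature points, the classifier is a bare-bones k-nearest neighbors (chosen only because it is the simplest of the six model families listed), and every function name and the grid of `k` values are hypothetical.

```python
import random
from collections import Counter


def make_data(n=100, seed=0):
    """Synthetic two-class data with two features (hypothetical, NOT study data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        y.append(label)
    return X, y


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(xt, x)), yt)
        for xt, yt in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def make_folds(n, n_folds=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def cv_accuracy(X, y, k, folds):
    """Mean held-out accuracy over the folds for one candidate value of k."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        X_tr = [X[i] for i in range(len(X)) if i not in held]
        y_tr = [y[i] for i in range(len(y)) if i not in held]
        correct = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def grid_search(X, y, k_grid, folds):
    """Score every candidate k with cross-validation and return the best one."""
    results = {k: cv_accuracy(X, y, k, folds) for k in k_grid}
    best_k = max(results, key=results.get)
    return best_k, results


X, y = make_data()
folds = make_folds(len(X))
best_k, results = grid_search(X, y, [1, 3, 5, 7], folds)  # odd k avoids tied votes
print("best k:", best_k)
print("mean CV accuracy per k:", results)
```

Each candidate setting is scored only on cases held out from fitting, so the selected value reflects out-of-sample accuracy rather than fit to the training cases. In practice the same pattern is usually delegated to a library routine and applied to each model family in turn.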