A comparative analysis of YOLOv8 and nnU-Net v2 based pipelines for sex and age estimation from maxillary sinus morphometry on panoramic radiographs.
Authors
Affiliations (5)
- Department of Basic Medicine Science - Anatomy, Faculty of Dentistry, Ankara University, Ankara, Türkiye.
- Department of Forensic Anthropology, Graduate School of Health Sciences, Ankara University, Ankara, Türkiye.
- Department of Forensic Anthropology, Institute of Forensic Sciences, Ankara University, Ankara, Türkiye.
- Department of Oral and Maxillofacial Radiology, Faculty of Dentistry, Near East University, Mersin, Türkiye.
- Near East University, Faculty of Dentistry, Ankara University, Ankara, Türkiye.
Abstract
This study aimed to develop and compare two deep learning-based segmentation-radiomics pipelines, YOLOv8-Hybrid and nnU-Net v2, for automated sex classification and age estimation from maxillary sinus morphometry on panoramic radiographs. A balanced dataset of 1,024 panoramic radiographs (512 males, 512 females; aged 18-81 years) was collected at Near East University, North Cyprus. Ground-truth sinus annotations were generated by an expert oral radiologist and validated through a dual-annotator inter-observer reliability assessment (ICC(2,1) = 0.94-0.97). The YOLOv8-Hybrid pipeline combined YOLOv8n-seg coarse segmentation, U-Net boundary refinement, extraction of more than 120 morphometric and radiomic features, and CatBoost/XGBoost classifiers. The nnU-Net v2 pipeline used auto-configured 2D U-Net segmentation with identical feature extraction and XGBoost prediction. Both pipelines underwent 5-fold cross-validation with patient-level splitting, transfer learning, Bayesian hyperparameter optimization, and SHAP interpretability analysis. For sex classification, nnU-Net v2 (AUC = 0.927 [95% CI: 0.881-0.964]) significantly outperformed YOLOv8-CatBoost (AUC = 0.893 [95% CI: 0.841-0.938]; DeLong p = 0.024, Cohen's d = 0.48). The two pipelines achieved comparable age estimation performance (MAE ≈ 7.2 years). YOLOv8 segmentation was highly consistent (mAP@50 = 98.19%, CV = 0.77%). SHAP analysis identified bilateral area difference as the most influential feature (sex: 0.42, age: 0.51). External validation on 50 independent images supported model generalizability. This study provides the first systematic comparison of YOLOv8 and nnU-Net v2 for forensic maxillary sinus analysis. nnU-Net v2 is recommended for precision-critical forensic reporting, while YOLOv8-Hybrid is suited to high-throughput screening. The set of more than 120 radiomic and morphometric features establishes a comprehensive framework for automated biological profiling.
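The inter-observer reliability statistic reported above, ICC(2,1), is the two-way random-effects, absolute-agreement, single-rater intraclass correlation in the Shrout and Fleiss notation. The abstract does not say how the authors computed it, so the following is only a minimal sketch of the standard formula in plain Python. The function name `icc_2_1` and the sample measurements are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table of n subjects (rows) by k raters (columns).

    Standard Shrout-Fleiss formula from two-way ANOVA mean squares:
    (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    """
    n = len(ratings)        # number of subjects (e.g. radiographs)
    k = len(ratings[0])     # number of raters (e.g. annotators)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subject mean square
    msc = ss_cols / (k - 1)                # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical sinus-area measurements from two annotators
# (illustrative values only, not data from the study).
perfect = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
offset = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]

icc_perfect = icc_2_1(perfect)
icc_offset = icc_2_1(offset)
```

Because ICC(2,1) measures absolute agreement, the second table scores below 1 even though the two annotators rank the subjects identically: a constant offset between raters counts as disagreement.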