Automated periapical lesion segmentation and area-based PAI indexing: a comparative deep learning study on periapical radiographs.

May 18, 2026

Authors

Erkal D, Çakmak YE, Er K, Kuştarcı A

Affiliations (3)

  • Department of Endodontics, Faculty of Dentistry, Burdur Mehmet Akif Ersoy University, Burdur, Turkey.
  • Department of Endodontics, Faculty of Dentistry, Akdeniz University, Antalya, Turkey.
  • Department of Endodontics, Faculty of Dentistry, Akdeniz University, Antalya, Turkey.

Abstract

Artificial intelligence can standardize periapical assessment, but few studies map pixel-level segmentation to clinically interpretable Periapical Index (PAI) categories under a unified protocol. We retrospectively assembled 900 anonymized periapical radiographs with expert masks and patient-level splits (train/val/test: 594/145/161). Four architectures (U-Net, ResUNet34, DeepLabV3, HRNet) were trained with identical preprocessing, augmentation, and binary cross-entropy (BCE) + Dice loss. On the independent test set, segmentation and image-level detection (sensitivity, specificity, precision, F1, AUC) were computed at a prespecified operating point. An area-based PAI score (aPAI) was derived from lesion-to-image area ratios using prespecified thresholds, providing a quantitative, size-based proxy for conventional PAI categories that does not incorporate qualitative radiographic features such as border definition or trabecular changes. DeepLabV3 achieved the most balanced detection (accuracy 90.1%, sensitivity 92.8%, F1 91.8%), while HRNet yielded the highest specificity (87.5%) and precision (91.4%). Friedman/Wilcoxon analyses showed significant overall between-model differences. After Bonferroni correction (adjusted α = 0.0083), only the DeepLabV3 versus U-Net comparison remained statistically significant for both binary lesion detection and aPAI classification (both p = 0.0012); no other pairwise differences reached the corrected threshold. aPAI classification accuracy ranged from 72.7% (U-Net) to 84.5% (DeepLabV3). Segmentation-based, area-derived PAI scoring is feasible and consistent across architectures. DeepLabV3 is preferable for screening workflows requiring high sensitivity, whereas HRNet minimizes false positives for confirmatory use. The unified pipeline provides an interpretable bridge from pixel probabilities to standard PAI categories.
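
The area-based scoring and the corrected significance threshold described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states that thresholds on the lesion-to-image area ratio were prespecified but does not report their values, so the cut-offs in `ASSUMED_THRESHOLDS` are hypothetical placeholders, as are the function names. The Bonferroni helper assumes a family-wise α of 0.05 divided across the six pairwise comparisons among four models, which is consistent with the reported adjusted α of 0.0083 but is not stated explicitly in the abstract.

```python
from itertools import combinations

# Hypothetical cut-offs on the lesion-to-image area ratio. The abstract says
# thresholds were prespecified but does not give their values.
ASSUMED_THRESHOLDS = (0.0, 0.01, 0.03, 0.08)


def lesion_area_ratio(mask):
    """Return the fraction of pixels labelled as lesion in a binary mask.

    `mask` is a sequence of rows, each a sequence of 0/1 values.
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    lesion = sum(sum(1 for px in row if px) for row in mask)
    return lesion / total


def area_based_pai(ratio, thresholds=ASSUMED_THRESHOLDS):
    """Map an area ratio to an ordinal score from 1 to len(thresholds) + 1.

    A ratio at or below the first threshold scores 1; every further
    threshold that the ratio exceeds adds one to the score.
    """
    return 1 + sum(ratio > t for t in thresholds)


def bonferroni_alpha(n_models, alpha=0.05):
    """Divide the family-wise alpha by the number of pairwise comparisons."""
    n_pairs = len(list(combinations(range(n_models), 2)))
    return alpha / n_pairs


if __name__ == "__main__":
    # Toy 4x4 mask with two lesion pixels, so the ratio is 2/16.
    toy_mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    ratio = lesion_area_ratio(toy_mask)
    print(ratio, area_based_pai(ratio))
    print(round(bonferroni_alpha(4), 4))
```

Because the score depends only on lesion size, a mapping like this reflects the limitation noted in the abstract: border definition and trabecular changes do not enter the calculation.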

Topics

Journal Article
