SpineScan: a deep learning model for lumbar spine MRI annotation and Pfirrmann grading assessment.

November 3, 2025

Authors

Minin A, Leonova O, Krutko A, Elgaeva E, Antonets D, Shtokalo D, Tsepilov Y

Affiliations (6)

  • Institute for Artificial Intelligence, Lomonosov Moscow State University, Moscow, Russian Federation.
  • Central Scientific Research Institute of Traumatology and Orthopedics, Moscow, Russian Federation.
  • Central Scientific Research Institute of Traumatology and Orthopedics, Moscow, Russian Federation.
  • Institute of Cytology and Genetics, Novosibirsk, Russian Federation.
  • Novosibirsk State University, Novosibirsk, Russian Federation.
  • Wellcome Sanger Institute, Cambridge, UK.

Abstract

While recent advances in deep learning have enabled automated Pfirrmann grading systems for intervertebral disc degeneration (IDD), many models remain inaccessible due to proprietary restrictions. This study aimed to develop and validate a convolutional neural network (CNN) for automated Pfirrmann grading using a diverse clinical dataset, to compare the model's performance with previously published results, and to create an open-source web application with a graphical user interface capable of grading both DICOM studies and individual MRI slices provided as image files.

We trained a CNN-based model using the YOLOv8x architecture on two datasets: a well-curated Russian disc degeneration study (RuDDS) cohort and an open-access dataset, totaling 484 lumbar MRI scans. Ground truth grading was provided by expert radiologists. The model was designed to simultaneously detect intervertebral discs and classify degeneration grades from single MRI slices. Performance was evaluated using standard metrics, including precision, recall, and mean average precision (mAP) across Pfirrmann grades I to V.

The model achieved predictive accuracy between 0.78 and 0.82 depending on lumbar level. The highest performance was observed for Grade IV discs (mAP50 = 0.872), while performance for Grade V was lower (mAP50-95 = 0.525), likely due to poor contrast and indistinct boundaries in highly degenerated discs. Overall, the model demonstrated a precision of 0.75 and a recall of 0.808. Comparison with previous studies indicated that these results are consistent with expert-level performance. The model forms the basis of SpineScan, a specialized web application implemented with the Streamlit framework.

The model shows strong potential for automated grading of lumbar disc degeneration and performs comparably to expert radiologists in most cases. These findings support the potential applicability of SpineScan for AI-assisted Pfirrmann grading.
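The detection metrics named in the abstract can be illustrated with a small sketch. It assumes the common object-detection convention in which a predicted box counts as a hit when its intersection over union (IoU) with a ground-truth box meets a threshold: mAP50 uses a single threshold of 0.50, while mAP50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The function names and example boxes below are hypothetical and are not taken from the SpineScan code; the F1 value is simple arithmetic on the two aggregate numbers reported in the abstract, not a metric the authors report.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# IoU thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95
MAP50_95_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Hypothetical disc boxes (pixel coordinates), chosen so that IoU = 0.625
ground_truth = (0.0, 0.0, 10.0, 10.0)
prediction = (0.0, 0.0, 10.0, 6.25)

overlap = iou(ground_truth, prediction)
passed = [t for t in MAP50_95_THRESHOLDS if overlap >= t]
print(f"IoU = {overlap:.3f}; counts as a hit at thresholds {passed}")

# Harmonic mean of the reported overall precision (0.75) and recall (0.808)
print(f"F1 from reported values: {f1_score(0.75, 0.808):.3f}")
```

A box with IoU 0.625 is a hit for mAP50 but fails the seven stricter thresholds, which is one reason mAP50-95 is typically lower than mAP50 for the same model, particularly when object boundaries are hard to localize.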

Topics

Journal Article
