Automated assessment of Knosp grade from pituitary adenoma MRI: Experimental comparison of a rule-based and a machine learning-based approach.
Authors
Affiliations (6)
- Department of Neurosurgery and Neurooncology, Military University Hospital, Prague, Czech Republic.
- 1st Faculty of Medicine, Charles University, Prague, Czech Republic.
- Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University, Prague, Czech Republic.
- Department of Radiodiagnostics, Military University Hospital, Prague, Czech Republic.
- Department of Internal Medicine, Military University Hospital, Prague, Czech Republic.
- Third Department of Internal Medicine, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic.
Abstract
The Knosp grading system is widely used to characterize parasellar extension of pituitary adenomas and to stratify the risk of cavernous sinus (CS) invasion and the likelihood of gross total resection (GTR) and endocrinological remission (ER). However, its assessment relies on expert interpretation of MRI and shows limited inter-rater reliability. The aim of this study was to develop and compare two automated approaches for Knosp grade assessment from preoperative MRI (a rule-based method emulating the original geometric algorithm and a statistical deep learning-based method) and to evaluate their accuracy and their ability to stratify CS invasion, GTR, and ER. A geometry-based algorithm was implemented using tumor and internal carotid artery segmentations, generated either manually or automatically. In parallel, a deep learning classifier was trained on 394 annotated MRI scans. Both methods were evaluated on an independent validation cohort of 99 scans. Two additional expert raters independently assigned Knosp grades to assess human inter-rater reliability. The human raters achieved accuracies of 64.65% (κ = 0.538) and 60.10% (κ = 0.463). The geometry-based method reached 44.95% accuracy (κ = 0.270) with manual segmentations and 35.35% (κ = 0.164) with automatic segmentations, while the deep learning estimator achieved 41.92% (κ = 0.234). Higher Knosp grades assigned by the automated methods were significantly associated with increased CS invasion risk and reduced likelihood of GTR (p < 0.05). Automated approaches can support Knosp grade assessment, but their current accuracy is insufficient for standalone clinical use.
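The accuracy and κ values reported above can be illustrated with a minimal sketch of how such agreement figures are typically computed. It assumes unweighted Cohen's kappa (the abstract does not state which variant was used) and the six-level Knosp scale (0, 1, 2, 3A, 3B, 4). The function names and the eight-scan toy lists are hypothetical and are not data from the study.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Fraction of scans where the predicted grade equals the reference grade."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement; p_e is the agreement expected by chance
    from each rater's marginal grade frequencies.
    """
    n = len(reference)
    p_o = accuracy(reference, predicted)
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_e = sum(ref_counts[g] * pred_counts[g] for g in ref_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy data (NOT from the study): reference grades vs. one rater.
reference_grades = ["0", "1", "2", "3A", "3B", "4", "1", "2"]
rater_grades = ["0", "1", "1", "3A", "4", "4", "2", "3A"]

print(f"accuracy = {accuracy(reference_grades, rater_grades):.2%}")
print(f"kappa    = {cohen_kappa(reference_grades, rater_grades):.3f}")
```

Because κ discounts agreement expected by chance, it is lower than raw accuracy whenever the grade distribution makes chance matches likely, which is why the abstract reports both figures for each rater and method.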