Quantifying central canal stenosis prediction uncertainty in SpineNet with conformal prediction.
Authors
Affiliations (6)
- Department of Health Sciences and Technology (D-HEST), ETH Zurich, Universitätstrasse 2, Zürich, 8092, Switzerland. [email protected].
- Schulthess Clinic, Department of Teaching, Research and Development, Zürich, Switzerland. [email protected].
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland. [email protected].
- Department of Health Sciences and Technology (D-HEST), ETH Zurich, Universitätstrasse 2, Zürich, 8092, Switzerland.
- Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
- Schulthess Clinic, Department of Teaching, Research and Development, Zürich, Switzerland.
Abstract
This study applies conformal prediction (CP) to SpineNet to quantify prediction uncertainty in the classification of central canal stenosis (CCS) into four grades: normal, mild, moderate, and severe. Rather than singleton predictions, CP provides prediction sets with a guaranteed probability of containing the true class, offering a transparent way to assess model reliability. This makes CP particularly useful in the medical field, where reliable predictions are critical for decision-making. We analysed 1689 vertebral levels (L1/L2 to L5/S1) from 340 patients who underwent T2-weighted MRI examinations and evaluated four CP methods across multiple significance levels (α). Bootstrap resampling was used to assess robustness across calibration/test splits. Among the evaluated CP methods, the class-conditional CP method consistently achieved the desired coverage while producing the smallest prediction set sizes. Top-k produced larger, uninformative prediction sets (~ 4), and Least Ambiguous Set-Valued Classifiers (LAC) and Adaptive Prediction Sets (APS) showed reduced performance, particularly in moderate and severe cases. A class- and level-specific analysis with class-conditional CP (α = 0.15) revealed smaller prediction sets for normal and mild grades (~ 1.5) and larger sets (~ 2.7-3) for the less frequent moderate and severe grades, reflecting higher uncertainty in these categories. Overall, class-conditional CP emerged as the most reliable and clinically informative approach for estimating uncertainty in CCS grading.
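To illustrate how class-conditional CP turns per-grade probabilities into prediction sets, the following is a minimal sketch of split conformal prediction with one threshold per class. It is not the authors' implementation: the nonconformity score (1 minus the probability of the class), the function names `class_conditional_thresholds` and `prediction_sets`, and the assumption that the classifier outputs one probability per CCS grade are all illustrative choices not stated in the abstract.

```python
import numpy as np


def class_conditional_thresholds(cal_probs, cal_labels, alpha, n_classes):
    """Compute one conformal threshold per class from calibration data.

    cal_probs:  (n, n_classes) array of predicted class probabilities
                (assumed model output, e.g. one column per CCS grade).
    cal_labels: (n,) array of true class indices.
    alpha:      significance level; target per-class coverage is 1 - alpha.
    """
    # Default of +inf means "always include this class" (most conservative).
    thresholds = np.full(n_classes, np.inf)
    for k in range(n_classes):
        # Assumed nonconformity score: 1 - probability of the true class,
        # computed only on calibration samples whose true class is k.
        scores = np.sort(1.0 - cal_probs[cal_labels == k, k])
        n = scores.size
        # Finite-sample corrected rank of the (1 - alpha) quantile.
        rank = int(np.ceil((n + 1) * (1.0 - alpha)))
        if 1 <= rank <= n:
            thresholds[k] = scores[rank - 1]
        # Otherwise too few calibration samples: keep the +inf threshold.
    return thresholds


def prediction_sets(test_probs, thresholds):
    """Return a boolean (n_test, n_classes) matrix; True marks set members."""
    # Class k is included when its score does not exceed its own threshold.
    return (1.0 - test_probs) <= thresholds
```

With α = 0.15, each row of the returned matrix marks the grades retained for one vertebral level, and the row sum is the prediction set size. Because each class has its own threshold, rare grades with few calibration samples tend to receive looser thresholds and therefore larger sets, which is in line with the larger sets reported above for moderate and severe grades.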