Submanifold sparse convolutional networks for automated 3D segmentation of kidneys and kidney tumours in computed tomography.

June 1, 2026

DOI: 10.1038/s41598-026-51801-7 PMID: 42225734

Authors

Alonso-Monsalve S, Whitehead LH, Aurisano A, Escudero Sanchez L

Affiliations (4)

Institute for Particle Physics and Astrophysics, ETH Zürich, Zürich, 8093, Switzerland.
Department of Physics, University of Cambridge, Cambridge, CB3 0US, UK.
Department of Physics, University of Cincinnati, Cincinnati, 45221-0011, OH, USA.
Department of Radiology, University of Cambridge, Cambridge, CB2 0QQ, UK.

Abstract

Accurate delineation of kidney tumours in computed tomography (CT) is essential for downstream quantitative analysis and for precision oncology that could enable personalised treatments, but manual segmentation is a specialised, time-consuming task that is difficult to scale in routine practice. Automated 3D segmentation remains challenging in medical imaging because images are large, dense volumes, which makes high-resolution processing with conventional dense convolutional neural networks computationally expensive and often reliant on downsampling or patch-based inference. To overcome this problem, we propose a two-stage 3D segmentation methodology based on voxel sparsification and submanifold sparse convolutional networks (SSCNs). In Stage 1, a low-resolution sparse network identifies a region of interest (ROI); in Stage 2, a high-resolution sparse network performs refined segmentation within the cropped ROI. This design enables native 3D processing at high resolution while reducing CPU/GPU memory usage and inference time. We evaluate the method on the KiTS23 dataset of renal cancer CT scans using 5-fold cross-validation. Our method achieves Dice similarity coefficients of 95.8% for kidneys + masses, 85.7% for tumours + cysts, and 80.3% for tumours alone, competitive with top KiTS23 approaches. In direct comparisons on the same cross-validation folds, the proposed sparse method achieves tumour + cyst and tumour-only Dice scores comparable to, and slightly higher than, those of a patch-based nnU-Net baseline, while consistently requiring less VRAM and shorter inference time across the tested hardware. Across the tested GPUs, our sparse model is markedly faster than both nnU-Net and the zero-shot zoom-out/zoom-in foundation model SegVol, which localises kidneys well but underperforms on small heterogeneous lesions.
Compared to an equivalent dense implementation of the same architecture, the proposed sparse approach achieves up to a 60% reduction in inference time and up to a 75% reduction in VRAM usage across the CPU and GPU configurations tested.
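
The data handling around the two stages can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the intensity window used for sparsification, the ROI margin, and the function names `sparsify`, `roi_bounds` and `dice` are assumptions made for illustration, and the sketch ignores the change of resolution between Stage 1 and Stage 2. It shows how a dense volume might be reduced to a list of active voxels, how a coarse foreground mask might be turned into a crop box, and how the Dice similarity coefficient is computed for binary masks.

```python
import numpy as np


def sparsify(volume, lo=-200.0, hi=300.0):
    """Keep only voxels with intensity in [lo, hi].

    The window is a hypothetical choice, not a value from the paper.
    Returns integer coordinates (N, 3) and intensities (N, 1) in matching order.
    """
    mask = (volume >= lo) & (volume <= hi)
    coords = np.argwhere(mask)      # row-major order
    feats = volume[mask][:, None]   # same row-major order as coords
    return coords, feats


def roi_bounds(coarse_mask, margin=2):
    """Axis-aligned bounding box around a foreground mask, padded by `margin`.

    Returns a tuple of slices usable for cropping, or None if the mask is empty.
    """
    idx = np.argwhere(coarse_mask)
    if idx.size == 0:
        return None
    lo = np.maximum(idx.min(axis=0) - margin, 0)
    hi = np.minimum(idx.max(axis=0) + margin + 1, coarse_mask.shape)
    return tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))


def dice(pred, target):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, target).sum() / denom)
```

In a pipeline of this shape, the coordinate and feature arrays would be passed to a sparse convolution library, and the slices returned by `roi_bounds` would crop the full-resolution volume before the second stage.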

Topics

Journal Article
