Applications of artificial intelligence algorithms in ultrasound-based kidney stone detection, classification, prediction, and management: a systematic review.
Authors
Affiliations (4)
- Isfahan University of Medical Sciences, Isfahan, Iran, Islamic Republic of.
- Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran, Islamic Republic of.
- College of Applied Medical Sciences, KSU, Riyadh, Saudi Arabia.
- Jundishapur University of Medical Sciences, Ahvāz, Iran, Islamic Republic of.
Abstract
Kidney stones are a prevalent urological condition with a significant global burden. Ultrasound (US) is often the first-line diagnostic modality despite its limited sensitivity and operator dependency. Artificial intelligence (AI) and deep learning (DL) algorithms have shown promise in enhancing US-based kidney stone applications, including detection, classification, complication prediction, and procedural guidance, but the evidence remains heterogeneous.
This review aimed to systematically synthesize the applications of AI and DL algorithms in US-based kidney stone detection, classification, prediction of complications and outcomes, and procedural guidance.
The review followed PRISMA guidelines (PROSPERO: CRD420251247650). Databases including PubMed, Embase, Scopus, and others were searched from inception without language restrictions. Eligible studies were original peer-reviewed articles evaluating AI/DL in US for kidney stone diagnostics against reference standards such as CT or surgical findings. Two reviewers independently screened records, extracted data, and assessed quality using QUADAS-2 with AI extensions.
Of 1,285 records, 9 studies were included after exclusions. These covered DL for image detection/segmentation (n = 3), predictive modeling of complications/outcomes (n = 4), and procedural guidance (n = 2). Methodologies included CNN variants and ML ensembles. Reported performance was high, with accuracies up to 96.54%, AUCs > 0.90 for predictions, and improved procedural outcomes. Risk of bias was low in most studies (5/9), with some concerns in the others. Heterogeneity in datasets and validation limited meta-analysis.
AI and DL algorithms demonstrate high diagnostic accuracy and clinical utility in enhancing US for kidney stone management; stratification by application type showed high performance across tasks, addressing traditional limitations of US. However, methodological variability and low to very low certainty of evidence (per GRADE) necessitate standardized external validation and multimodal integration before broader adoption.