Back to all papers

Bosniak classification of renal cysts using large language models: a comparative study.

August 24, 2025pubmed logopapers

Authors

Hacibey I,Kaba E

Affiliations (2)

  • Department of Urology, Basaksehir Çam and Sakura City Hospital, Istanbul, Turkey.
  • Department of Radiology, Recep Tayyip Erdogan University, Rize, Turkey.

Abstract

The Bosniak classification system is widely used to assess malignancy risk in renal cystic lesions, yet inter-observer variability poses significant challenges. Large language models (LLMs) may offer a standardized approach to classification when provided with textual descriptions, such as those found in radiology reports. This study evaluated the performance of five LLMs-GPT‑4 (ChatGPT), Gemini, Copilot, Perplexity, and NotebookLM-in classifying renal cysts based on synthetic textual descriptions mimicking CT report content. A synthetic dataset of 100 diagnostic scenarios (20 cases per Bosniak category) was constructed using established radiological criteria. Each LLM was evaluated using zero-shot and few-shot prompting strategies, while NotebookLM employed retrieval-augmented generation (RAG). Performance metrics included accuracy, sensitivity, and specificity. Statistical significance was assessed using McNemar's and chi-squared tests. GPT‑4 achieved the highest accuracy (87% zero-shot, 99% few-shot), followed by Copilot (81-86%), Gemini (55-69%), and Perplexity (43-69%). NotebookLM, tested only under RAG conditions, reached 87% accuracy. Few-shot learning significantly improved performance (p < 0.05). Classification of Bosniak IIF lesions remained challenging across models. When provided with well-structured textual descriptions, LLMs can accurately classify renal cysts. Few-shot prompting significantly enhances performance. However, persistent difficulties in classifying borderline lesions such as Bosniak IIF highlight the need for further refinement and real-world validation.

Topics

Journal Article

Ready to Sharpen Your Edge?

Subscribe to join 7,100+ peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.