
Open-Source Offline-Deployable Retrieval-Augmented Large Language Model for Assisting Pancreatic Cancer Staging

January 1, 2026 · medRxiv preprint

Authors

Johno, H., Amakawa, A., Komaba, A., Tozuka, R., Johno, Y., Sato, J., Yoshimura, K., Nakamoto, K., Ichikawa, S.

Affiliations (1)

  • University of Yamanashi

Abstract

Purpose: Large language models (LLMs) are increasingly applied in radiology, but key challenges remain, including data leakage from cloud-based systems, false outputs, and limited reasoning transparency. This study aimed to develop an open-source, offline-deployable retrieval-augmented LLM (RA-LLM) system in which local execution prevents data leakage and retrieval-augmented generation (RAG) improves output accuracy and transparency using reliable external knowledge (REK), demonstrated in pancreatic cancer staging.

Materials and Methods: Llama-3.2 11B and Gemma-3 27B were used as local LLMs, and GPT-4o mini served as a cloud-based comparator. The Japanese pancreatic cancer guideline served as REK. Relevant REK excerpts were retrieved to generate retrieval-augmented responses. System performance, including classification accuracy, retrieval metrics, and execution time, was evaluated on 100 simulated pancreatic cancer CT cases, with non-RAG LLMs as baselines. McNemar tests were applied to TNM staging and resectability classification.

Results: RAG improved TNM staging accuracy for all LLMs (GPT-4o mini 61% → 90%, p<0.001; Llama-3.2 11B 53% → 72%, p<0.001; Gemma-3 27B 59% → 87%, p<0.001) and mildly improved resectability classification (72% → 84%, p=0.012; 58% → 73%, p=0.006; 77% → 86%, p=0.093), with Gemma-3 27B showing performance comparable to GPT-4o mini. Retrieval performance was high (context recall = 1; context precision = 0.5-1), and local models ran at speeds comparable to the cloud-based GPT-4o mini.

Conclusion: We developed an offline-deployable RA-LLM system for pancreatic cancer staging and publicly released its full source code. RA-LLMs outperformed baseline LLMs, and the offline-capable Gemma-3 27B performed comparably to the widely used cloud-based GPT-4o mini.
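The core idea is that guideline excerpts (REK) are retrieved for each CT case and supplied to a locally hosted model. The sketch below is a minimal illustration of that retrieval-augmented flow under stated assumptions, not the authors' released code: TF-IDF retrieval over a few stand-in guideline snippets and a placeholder call to a local LLM. The snippet texts, the prompt wording, and the ask_local_llm() helper are all assumptions for illustration.

```python
# Minimal sketch of an offline retrieval-augmented staging pipeline
# (illustrative only, not the published implementation).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in "reliable external knowledge" (REK) excerpts; the real system
# uses the Japanese pancreatic cancer guideline.
guideline_chunks = [
    "T1: tumor limited to the pancreas, 2 cm or less in greatest dimension.",
    "T4: tumor involves the celiac axis or the superior mesenteric artery.",
    "Borderline resectable: tumor contact with the SMV/PV exceeding 180 degrees.",
]

vectorizer = TfidfVectorizer()
chunk_matrix = vectorizer.fit_transform(guideline_chunks)

def retrieve(report: str, k: int = 2) -> list[str]:
    """Return the k guideline excerpts most similar to the CT report."""
    sims = cosine_similarity(vectorizer.transform([report]), chunk_matrix)[0]
    top = sims.argsort()[::-1][:k]
    return [guideline_chunks[i] for i in top]

def ask_local_llm(prompt: str) -> str:
    """Placeholder for a call to an offline model such as Gemma-3 27B;
    swap in whatever local inference runtime is actually deployed."""
    raise NotImplementedError

report = "2.5 cm pancreatic head mass encasing the superior mesenteric artery."
context = "\n".join(retrieve(report))
prompt = (
    "Using only the guideline excerpts below, assign the TNM stage and "
    f"resectability.\n\nGuideline:\n{context}\n\nCT findings:\n{report}"
)
# answer = ask_local_llm(prompt)
```

The evaluation compares paired per-case correctness with and without RAG using McNemar tests. A minimal example with statsmodels follows; the 2x2 discordant-pair counts are hypothetical, chosen only to be consistent with the reported 61% → 90% TNM accuracies for GPT-4o mini, and do not come from the study data.

```python
# Illustrative McNemar test on paired classification outcomes (hypothetical counts).
from statsmodels.stats.contingency_tables import mcnemar

# Rows: baseline correct / incorrect; columns: RAG correct / incorrect.
table = [[58, 3],   # correct with both, correct only without RAG
         [32, 7]]   # correct only with RAG, incorrect with both
result = mcnemar(table, exact=True)  # exact binomial test on discordant pairs
print(f"statistic={result.statistic}, p-value={result.pvalue:.4f}")
```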
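Keeping both retrieval and generation on local hardware is what removes the data-leakage concern; the cloud comparator in the study only exchanges the generation step for an API call, so the retrieval logic above would be shared by both settings.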

Topics

radiology and imaging
