Personalized CT Protocol Recommendation via Large Language Model: Enabling Fully Automated CT Scanning Workflows.
Authors
Affiliations (9)
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Healthcare Advanced Algorithm Department of HSW BU, Shanghai United Imaging Healthcare Co., Ltd, Shanghai, 201800, People's Republic of China.
- Healthcare Advanced Algorithm Department of HSW BU, Shanghai United Imaging Healthcare Co., Ltd, Shanghai, 201800, People's Republic of China.
- Department of Radiology, The Affiliated Nanjing Drum Tower Hospital of Nanjing University Medical School, Nanjing, 210008, People's Republic of China.
- Department of Radiology, Jiading District Central Hospital, Shanghai, 201800, People's Republic of China.
- Imaging Center, The Third People's Hospital of Hefei, Third Clinical College of Anhui Medical University, Hefei, 230022, China.
- Department of Radiology, Zhujiang Hospital of Southern Medical University, Guangzhou, 510280, China.
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China.
- MoE Key Laboratory of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, 200240, China.
Abstract
Manual CT protocol selection remains a time-consuming and error-prone bottleneck in radiology workflows and impedes fully automated scanning pipelines. To address this limitation, we developed a Large Language Model Retrieval-Augmented Generation (LLM-RAG) framework for personalized CT protocol recommendation. The system constructs a protocol knowledge base from historical examination records to deliver institution-specific recommendations aligned with local clinical preferences. The system performed well (minimum reported values: 88.60% precision, 89.34% recall, 88.08% F1, 96.09% accuracy). Key findings were: (1) task-specific parity between Qwen and DeepSeek models at equivalent scales (max Δ = 1.41% at 32B); (2) a positive scaling trend, with larger models improving accuracy (e.g., DeepSeek 7B → 32B: +1.55%); and (3) linear scaling of GPU memory cost with model size (7B: 25 GB → 32B: 95 GB), which defines clinical deployment constraints. Error analysis of 225 discordant cases identified three primary patterns: over-recommendation (52.44%), unsuitable recommendation (27.56%), and clinically equivalent choices (20.00%). Critically, the framework achieves clinically viable accuracy without model retraining, a pivotal advantage for streamlining scanning operations and accelerating imaging workflow automation.
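The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and not part of the authors' method: it converts the three error shares back into case counts out of 225, and fits a straight line through the two GPU memory figures given. The function names are hypothetical, and a two-point fit is an assumption made for illustration, since the abstract reports only the 7B and 32B endpoints.

```python
# Figures quoted in the abstract.
DISCORDANT_CASES = 225
ERROR_SHARES = {  # percentage of discordant cases per error pattern
    "over-recommendation": 52.44,
    "unsuitable recommendation": 27.56,
    "clinically equivalent choice": 20.00,
}
GPU_MEMORY_GB = {7: 25.0, 32: 95.0}  # model size (billions of parameters) -> GB


def error_counts(total, shares):
    """Convert percentage shares back to whole-case counts (hypothetical helper)."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


def linear_memory_fit(points):
    """Slope (GB per billion parameters) and intercept (GB) of the line
    through exactly two (size, memory) points. Assumes linearity holds."""
    (x0, y0), (x1, y1) = sorted(points.items())
    slope = (y1 - y0) / (x1 - x0)
    return slope, y0 - slope * x0


counts = error_counts(DISCORDANT_CASES, ERROR_SHARES)
slope, intercept = linear_memory_fit(GPU_MEMORY_GB)
print(counts)
print(f"{slope:.2f} GB per billion parameters, {intercept:.2f} GB intercept")
```

The recovered counts are 118, 62, and 45 cases, which sum to 225, so the three percentages are consistent with the stated total. The fitted slope and intercept describe only the two reported endpoints; memory use at intermediate model sizes is not given in the abstract and should not be inferred from this line.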