A fine-tuned large language model chatbot for multi-scenario radiology cancer care: randomized controlled trial on interaction optimization, emotional support, and provider burnout reduction.
Authors
Affiliations (7)
- Department of Radiology, Chongqing Hospital of Traditional Chinese Medicine, Chongqing, 400021, China.
- Faculty of Psychology, Southwest University, Chongqing, China.
- Department of Radiology, Chengdu BOE Hospital, Chengdu, 610200, China.
- Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, China.
- Department of Radiology, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, 610042, China.
- College of Computer & Information Science, Southwest University, Chongqing, 400715, China.
- Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, China.
Abstract
Cancer patients are more prone to depression and anxiety symptoms than patients with chronic diseases. Amid surging clinical demand and constrained medical resources, traditional radiology workflows, plagued by inefficient communication, exacerbate both patients' psychological distress and healthcare providers' burnout. This study aimed to develop and validate a fine-tuned DeepSeek R1-based Radiology Examination Chatbot (REC) that optimizes clinical interaction between cancer patients and radiology healthcare providers (RHPs), thereby providing emotional support for cancer patients and reducing burnout among RHPs. Audio recordings from multiple scenarios (appointment triage (AT), pre-examination preparation (PP), and radiology clinic services (RCS)) were collected from the radiology departments of three tertiary hospitals (n = 36,511 min). The study comprised two independent randomized controlled sub-trials for distinct patient groups: Sub-trial 1 evaluated AT/PP participants (1,424 patients, randomized 1:1 to RHP + REC or RHP), while Sub-trial 2 assessed RCS participants (638 patients, randomized 1:1 to the same groups). Because the patient populations differed, the sub-trials were designed and implemented separately. The REC was fine-tuned using domain-specific dialogue data (80% for training) and scenario-specific prompts, with GPT-o1 as a comparative benchmark. The primary outcome was dialogue quality (empathy, frustration, emotional regulation, factuality, integrity, and satisfaction), while the secondary outcomes were burnout (exhaustion, depersonalization, and personal achievement) and image quality (CT/MRI), all assessed via Likert scales and statistical tests. The RHP + REC group demonstrated superior dialogue quality in AT (factuality: 4.12 ± 0.86 vs. 3.39 ± 1.21, P < 0.001) and PP (satisfaction: 3.73 ± 0.11 vs. 3.19 ± 0.18, P < 0.001), with reduced burnout (exhaustion: 1.85 ± 0.91 vs. 2.40 ± 1.22, P < 0.01). CT image quality improved significantly (4.35 ± 0.51 vs. 4.00 ± 0.52, P < 0.01), and similar results were achieved in MRI examinations (4.12 ± 0.51 vs. 3.79 ± 0.58, P = 0.02). However, the REC underperformed in empathy and emotional regulation during emotionally complex RCS (3.88 ± 0.67 vs. 4.42 ± 0.53, P = 0.002; 3.87 ± 0.19 vs. 4.12 ± 0.27, P = 0.004). Ablation studies confirmed that both fine-tuning and scenario-specific prompts were necessary for performance. Overall, the DeepSeek R1-based REC synergistically enhanced multi-scenario clinical interaction, provided emotional support for cancer patients, and reduced RHPs' burnout, offering a scalable solution to optimize radiology workflows. Chinese Clinical Trial Registry registration number: ChiCTR2500102740 (http://www.chictr.org.cn/); registration date: 2025-05-26.