DeepSeek-V3 and ChatGPT-4o excelled in accurately answering patient questions about interventional radiology procedures, suggesting LLMs' growing role in clinical communication.
Key Details
- Study evaluated four LLMs (ChatGPT-4o, DeepSeek-V3, OpenBioLLM-8b, BioMistral-7b) on 107 real-world patient questions covering TAPE, CT-guided HDR brachytherapy, and BEST.
- Questions and their answers were independently scored for accuracy by two board-certified radiologists using a Likert scale.
- DeepSeek-V3 achieved the highest mean scores for BEST (4.49) and CT-HDR (4.24), while matching ChatGPT-4o on TAPE (4.20 vs 4.17).
- OpenBioLLM-8b and BioMistral-7b scored significantly lower and produced potentially hazardous responses.
- LLMs' responses show promise for supporting, but not replacing, comprehensive medical consultations.
- Future studies should include patient feedback and focus on alignment with clinical guidelines.
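The mean scores quoted above come from two raters scoring each answer on a Likert scale. The summary does not say how the two ratings were combined, so the following is only a minimal sketch of one plausible aggregation: average the two raters per question, then average across questions. The model names, procedure names, 1-to-5 scale, and scores are hypothetical placeholders, not data from the study.

```python
from statistics import mean

# Hypothetical ratings, not study data:
# (model, procedure) -> list of (rater_a, rater_b) scores, assumed 1-5 Likert.
ratings = {
    ("model_a", "procedure_x"): [(5, 4), (4, 4), (3, 5)],
    ("model_b", "procedure_x"): [(2, 3), (3, 3), (1, 2)],
}


def mean_likert(pairs):
    """Average each question's two rater scores, then average across questions."""
    return mean(mean(pair) for pair in pairs)


# Mean score per (model, procedure), rounded to two decimals as in the summary.
summary = {key: round(mean_likert(pairs), 2) for key, pairs in ratings.items()}

for (model, procedure), score in summary.items():
    print(f"{model} on {procedure}: {score}")
```

Other aggregation choices (for example, reporting each rater separately or using medians) would give different numbers, so this should be read as an illustration of the arithmetic only.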
Source
AuntMinnie