DeepSeek-V3 and ChatGPT-4o excelled in accurately answering patient questions about interventional radiology procedures, suggesting LLMs' growing role in clinical communication.
Key Details
- Study evaluated four LLMs (ChatGPT-4o, DeepSeek-V3, OpenBioLLM-8b, BioMistral-7b) on 107 real-world patient questions covering TAPE, CT-guided HDR brachytherapy, and BEST.
- Questions and their answers were independently scored for accuracy by two board-certified radiologists using a Likert scale.
- DeepSeek-V3 achieved the highest mean scores for BEST (4.49) and CT-HDR (4.24), while matching ChatGPT-4o on TAPE (4.20 vs 4.17).
- OpenBioLLM-8b and BioMistral-7b scored significantly lower and produced potentially hazardous responses.
- LLMs' responses show promise for supporting, but not replacing, comprehensive medical consultations.
- Future studies should include patient feedback and focus on alignment with clinical guidelines.
Source
AuntMinnie