DeepSeek-V3 and ChatGPT-4o gave the most accurate answers to patient questions about interventional radiology procedures in a comparison of four LLMs, suggesting a growing role for LLMs in clinical communication.
Key Details
- Study evaluated four LLMs (ChatGPT-4o, DeepSeek-V3, OpenBioLLM-8b, BioMistral-7b) on 107 real-world patient questions covering TAPE, CT-guided HDR brachytherapy, and BEST.
- Questions and their answers were independently scored for accuracy by two board-certified radiologists using a Likert scale.
- DeepSeek-V3 achieved the highest mean scores for BEST (4.49) and CT-HDR (4.24), while matching ChatGPT-4o on TAPE (4.20 vs 4.17).
- OpenBioLLM-8b and BioMistral-7b scored significantly lower and produced potentially hazardous responses.
- LLMs' responses show promise for supporting, but not replacing, comprehensive medical consultations.
- Future studies should include patient feedback and focus on alignment with clinical guidelines.
Source
AuntMinnie