DeepSeek-V3 and ChatGPT-4o excelled in accurately answering patient questions about interventional radiology procedures, suggesting LLMs' growing role in clinical communication.
Key Details
- Study evaluated four LLMs (ChatGPT-4o, DeepSeek-V3, OpenBioLLM-8b, BioMistral-7b) on 107 real-world patient questions covering TAPE, CT-guided HDR brachytherapy, and BEST.
- Questions and their answers were independently scored for accuracy by two board-certified radiologists using a Likert scale.
- DeepSeek-V3 achieved the highest mean scores for BEST (4.49) and CT-HDR (4.24), while matching ChatGPT-4o on TAPE (4.20 vs 4.17).
- OpenBioLLM-8b and BioMistral-7b scored significantly lower and produced potentially hazardous responses.
- LLMs' responses show promise for supporting, but not replacing, comprehensive medical consultations.
- Future studies should include patient feedback and focus on alignment with clinical guidelines.
Source
AuntMinnie