GPT-4o, with prompt engineering, selected optimal abdominal/pelvic CT protocols more accurately than radiologists without increasing inappropriate selections.
Key Details
- The study evaluated 1,448 abdominal and pelvic CT exams from January to June 2024.
- GPT-4o with detailed prompting selected the optimal protocol 96.2% of the time, compared with 88.3% for radiologists (p<0.001).
- Rates of inappropriate protocols were similarly low: 1.3% for GPT-4o vs. 2.4% for radiologists, a difference that was not statistically significant (p=0.21).
- Fine-tuning GPT-4o offered no performance gain over meticulous prompting (both 96.2%).
- Performance in protocol matching was consistent across training levels (radiologist, fellow, resident).
Why It Matters

Source
AuntMinnie