GPT-4o matched a board-certified radiologist and outperformed a radiology resident in recommending follow-up imaging from routine radiology reports.
Key Details
- Study involved 100 CT/MRI oncologic cases across head and neck, liver, lung, and pancreas from two academic medical centers.
- GPT-4o, a radiology resident, and a board-certified radiologist each generated follow-up recommendations from report texts.
- Blinded senior radiologists rated recommendations for completeness, modality, timing, and overall quality on a five-point scale.
- GPT-4o achieved a median global quality score of 4 (same as board-certified, higher than resident), with 96% timing correctness and 92% completeness.
- GPT-4o showed the strongest performance in lung imaging (100% timing correctness).
- No significant differences found among readers for appropriateness of imaging modality.
Source
AuntMinnie