Multimodal LLMs Achieve High Accuracy Detecting Scoliosis on X-rays

Multimodal LLMs achieved up to 94% accuracy for scoliosis detection on spine x-rays, but struggled with lumbar stenosis on MRI.

Key Details

1. Five multimodal LLMs tested: Grok 2, 3, 4, ChatGPT 4o, Gemini 1.5 Flash.
2. 171 spine x-rays (100 scoliosis, 71 normal) and 200 lumbar spine MRIs (100 severe stenosis, 100 normal) used in the study.
3. Best x-ray result: Grok 4 with 94.2% accuracy for scoliosis detection; best MRI result: Gemini at 60% for stenosis.
4. ChatGPT 4o showed better confidence calibration when incorrect, considered a 'superior metacognitive capability.'
5. Authors emphasize LLMs not ready for clinical diagnosis; highlight potential for patient education in obvious cases.
6. Study published in World Neurosurgery on May 2, 2024.

Why It Matters

As patients increasingly use commercial LLMs for medical advice, understanding their capabilities and risks in radiology is crucial. These results highlight both the promise and current limitations of generalist AI in medical image interpretation, especially for more subtle pathologies.

Read the full article on AuntMinnie
