LLMs such as ChatGPT-4o and AmbossGPT classified bone fractures described in CT radiology reports with roughly 74% overall accuracy, suggesting they could assist radiologists.
Key Details
- The study assessed four LLMs (ChatGPT-4o, AmbossGPT, Claude 3.5 Sonnet, Gemini 2.0 Flash) on 292 artificial CT reports representing 310 fractures.
- ChatGPT-4o and AmbossGPT showed the highest overall classification accuracy (74.6% and 74.3%, respectively).
- Bone recognition rates were high for all models (90%-99%), but fracture subtype classification accuracy was lower (71%-77%).
- Statistically significant accuracy differences between LLMs were noted by fracture type and anatomical location.
- Validation on real-world reports (145 fractures) using LLaMA 3.3-70B yielded results similar to those on the artificial reports (~70% performance).
- The authors note the need for further validation on large, multi-center real-world datasets.
Why It Matters

Source
AuntMinnie