New multimodal large language models (LLMs) such as OpenAI o3 and Gemini 2.5 Pro have demonstrated significant advances in answering Japanese radiology board exam questions, particularly when given image input.
Key Details
- Eight LLMs were tested on the Japan Diagnostic Radiology Board Examination (JDRBE).
- OpenAI o3 achieved 67% accuracy with text-only input and 72% with image input.
- Gemini 2.5 Pro also showed notable accuracy improvements with image data.
- Both OpenAI o3 and Gemini 2.5 Pro received higher legitimacy scores from radiologist raters than some competitors.
- The test set included 233 questions and 477 images (184 CT, 159 MRI, 15 x-ray, 90 nuclear medicine).
- Image input produced a statistically significant improvement in diagnostic accuracy for several models.
Why It Matters

Source
AuntMinnie
Related News

RadNet Study: AI Boosts Breast Cancer Detection in Largest-Ever Real-World Analysis
A massive real-world study by RadNet shows AI-assisted mammography increased breast cancer detection by 21.6%.

Multimodal MRI Radiomics Model Predicts Long-Term Survival in Breast Cancer
A multimodal MRI radiomics and deep learning model outperformed traditional models in predicting 5- and 7-year survival for breast cancer patients receiving neoadjuvant chemotherapy.

AI Predicts 10-Year Mortality and Hip Fracture Risk from DEXA Scans
A self-supervised AI model predicts 10-year mortality and hip fractures using only DEXA scans.