Back to all papers

Adapting and Evaluating Multimodal Large Language Models for Adolescent Idiopathic Scoliosis Self-Management: A Divide and Conquer Framework

September 15, 2025arxiv logopreprintarXiv: 2509.11645v1

Authors

Zhaolong Wu,Pu Luo,Jason Pui Yin Cheung,Teng Zhang

Abstract

This study presents the first comprehensive evaluation of Multimodal Large Language Models (MLLMs) for Adolescent Idiopathic Scoliosis (AIS) self-management. We constructed a database of approximately 3,000 anteroposterior X-rays with diagnostic texts and evaluated five MLLMs through a `Divide and Conquer' framework consisting of a visual question-answering task, a domain knowledge assessment task, and a patient education counseling assessment task. Our investigation revealed limitations of MLLMs' ability in interpreting complex spinal radiographs and comprehending AIS care knowledge. To address these, we pioneered enhancing MLLMs with spinal keypoint prompting and compiled an AIS knowledge base for retrieval augmented generation (RAG), respectively. Results showed varying effectiveness of visual prompting across different architectures, while RAG substantially improved models' performances on the knowledge assessment task. Our findings indicate current MLLMs are far from capable in realizing personalized assistant in AIS care. The greatest challenge lies in their abilities to obtain accurate detections of spinal deformity locations (best accuracy: 0.55) and directions (best accuracy: 0.13).

Topics

cs.AI

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.