BoneCoT: multicentre validation of a whole-body skeleton foundation model for bone metastases guided by clinician-derived chain of thought.
Authors
Affiliations (6)
- Metastatic Bone Tumor Clinical Center, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Institute of Diagnostic and Interventional Radiology, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Metastatic Bone Tumor Clinical Center, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Mailman School of Public Health, Columbia University, New York, NY, USA.
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA.
- Institute of Diagnostic and Interventional Radiology, Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China.
Abstract
As the incidence of bone metastases rises, computed tomography is widely used as the initial imaging modality for their detection. Accurate diagnosis of bone metastases demands comprehensive evaluation, yet divergent interpretations among specialists can result in diagnostic discrepancies. In clinical practice, precision diagnosis of bone metastases requires multidisciplinary collaboration among radiologists, pathologists and oncologists. Here, to meet the need for an automated tool that can deliver expert-level insights and predictions by jointly considering multidisciplinary information, we propose BoneCoT, a whole-body skeleton foundation model enhanced through chain-of-thought (CoT) fine-tuning. We pretrained the model on 29.3 million computed tomography images from 30,267 patients across 12 skeletal sites and refined it over a graph of 26 clinically relevant tasks spanning diagnosis, complications, tumour type and biomarkers. Evaluated across 26 tasks and multicentre cohorts from 10 hospitals, BoneCoT outperformed state-of-the-art methods by 20% in area under the receiver operating characteristic curve (AUROC). Critically, BoneCoT achieved a 40% improvement in AUROC in distinguishing primary from metastatic lesions, significantly surpassing experienced radiologists. These findings show how clinician-derived reasoning can move artificial intelligence towards more integrated diagnostic assessment in complex disease.