Development and retrospective validation of SCOUT: scalable clinical oversight of large language models via uncertainty triangulation
Authors
Affiliations (1)
- Department of Cardiology, State Key Laboratory of Cardiovascular Disease, National Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences
Abstract
Large language models (LLMs) are increasingly used in clinical workflows, yet requiring clinician review of every AI output negates the efficiency gains that motivate their adoption. We present SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable LLM predictions to clinicians by triangulating three orthogonal signals: model heterogeneity, stochastic inconsistency, and reasoning critique. In this retrospective development and validation study, we derived the framework on a discovery cohort (n = 405) and validated it across three clinically distinct tasks using four independent retrospective cohorts: coronary heart disease subtyping (n = 2,271), liver cancer screening from radiology reports (n = 3,373), and diseased coronary vessel counting (n = 286). SCOUT reduced the volume of cases requiring human review by 45% to 83%, with projected final accuracy of 99.1% to 100.0% assuming expert correction of all flagged cases. SCOUT provides a scalable, retrospectively validated approach for deploying generative AI in clinical medicine without compromising patient safety. Prospective randomized validation is underway to confirm real-world clinical utility.
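The triangulation idea described in the abstract can be sketched as a simple deferral rule: a case is routed to a clinician if any one of the three orthogonal signals indicates unreliability. The sketch below is purely illustrative; all names, data structures, and the zero-disagreement threshold are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of SCOUT-style uncertainty triangulation.
# All identifiers and thresholds here are illustrative assumptions,
# not the authors' actual implementation.
from dataclasses import dataclass


@dataclass
class Case:
    model_votes: list[str]      # predictions from heterogeneous LLMs (signal 1)
    resample_votes: list[str]   # repeated stochastic samples of one LLM (signal 2)
    critique_flags_issue: bool  # did a reasoning-critique pass object? (signal 3)


def disagreement(votes: list[str]) -> float:
    """Fraction of votes that deviate from the majority label."""
    majority = max(set(votes), key=votes.count)
    return 1 - votes.count(majority) / len(votes)


def defer_to_clinician(case: Case, threshold: float = 0.0) -> bool:
    """Defer when ANY of the three orthogonal signals indicates unreliability."""
    return (
        disagreement(case.model_votes) > threshold        # model heterogeneity
        or disagreement(case.resample_votes) > threshold  # stochastic inconsistency
        or case.critique_flags_issue                      # reasoning critique
    )
```

Under this kind of rule, only the (presumably small) set of flagged cases is reviewed by clinicians, which is how a 45% to 83% reduction in review volume could coexist with near-perfect projected final accuracy.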