Automated real-time assessment of intracranial hemorrhage detection AI using an ensembled monitoring model (EMM).
Authors
Affiliations (6)
- Department of Radiology, School of Medicine, Stanford University, Stanford, CA, 94304, USA.
- AI Development and Evaluation Laboratory (AIDE), School of Medicine, Stanford University, Stanford, CA, 94304, USA.
- Department of Radiology, School of Medicine, Stanford University, Stanford, CA, 94304, USA.
- AI Development and Evaluation Laboratory (AIDE), School of Medicine, Stanford University, Stanford, CA, 94304, USA.
- 3D and Quantitative Imaging Laboratory (3DQ), School of Medicine, Stanford University, Stanford, CA, 94304, USA.
- Department of Biomedical Data Science, School of Medicine, Stanford University, Stanford, CA, 94304, USA.
Abstract
Artificial intelligence (AI) tools for radiology are commonly left unmonitored once deployed. Without real-time, case-by-case assessments of AI prediction confidence, users must distinguish trustworthy from unreliable AI predictions on their own, which increases cognitive burden, reduces productivity, and can lead to misdiagnoses. To address these challenges, we introduce the Ensembled Monitoring Model (EMM), a framework inspired by clinical consensus practices that rely on multiple expert reviews. Designed specifically for black-box commercial AI products, EMM operates independently, without requiring access to internal AI components or intermediate outputs, while still providing robust confidence measurements. Using intracranial hemorrhage detection as our test case on a large, diverse dataset of 2919 studies, we demonstrate that EMM can successfully categorize confidence in the AI-generated prediction, suggest appropriate actions, and help physicians recognize low-confidence scenarios, ultimately reducing cognitive burden. Importantly, we provide key technical considerations and best practices for successfully translating EMM into clinical settings.
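The abstract does not specify how EMM turns ensemble output into a confidence category, so the following is only a minimal sketch of the general consensus idea, not the authors' implementation. It assumes each monitoring model emits a binary hemorrhage/no-hemorrhage call, that agreement is measured as the fraction of monitors matching the black-box AI's call, and that two cutoffs (`high_cutoff`, `low_cutoff`) split agreement into three levels. The function name, the cutoff values, and the wording of the suggested actions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfidenceReport:
    agreement: float  # fraction of monitors that match the primary AI's call
    level: str        # "high", "medium", or "low"
    action: str       # illustrative suggested action, not from the paper


def assess_confidence(primary_positive: bool,
                      monitor_positive: Sequence[bool],
                      high_cutoff: float = 0.8,   # hypothetical threshold
                      low_cutoff: float = 0.5     # hypothetical threshold
                      ) -> ConfidenceReport:
    """Categorize confidence in a black-box prediction from monitor agreement.

    Only the primary AI's final output is used, mirroring the black-box
    setting described in the abstract; no internal scores are required.
    """
    if not monitor_positive:
        raise ValueError("at least one monitor prediction is required")
    if not 0.0 <= low_cutoff <= high_cutoff <= 1.0:
        raise ValueError("cutoffs must satisfy 0 <= low_cutoff <= high_cutoff <= 1")

    # Count the monitors whose binary call equals the primary AI's call.
    matches = sum(1 for p in monitor_positive if p == primary_positive)
    agreement = matches / len(monitor_positive)

    if agreement >= high_cutoff:
        return ConfidenceReport(agreement, "high", "proceed with AI result")
    if agreement >= low_cutoff:
        return ConfidenceReport(agreement, "medium", "review AI result with care")
    return ConfidenceReport(agreement, "low", "rely on independent read")


if __name__ == "__main__":
    # Primary AI flags a hemorrhage; three of five monitors agree.
    report = assess_confidence(True, [True, True, True, False, False])
    print(report.level, report.action)
```

In practice the cutoffs would need to be calibrated on local data, and monitors could emit probabilities rather than binary calls; both choices are outside what the abstract states.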