DISCERN: A Clinical Impact-aware Framework for Radiology Report Comparison

May 28, 2026 | medRxiv preprint

Authors

Sharma, R., Beeche, C., Dong, J., Zhuang, R., Qu, H., Zhang, R., Gangaram, V., Goswami, P., Xin, J., Ballard, J., Duda, J., Kahn, C. E., Goldberg, A., Sagreiya, H., Long, Q., Chen, T., Witschey, W. R.

Affiliations (1)

  • University of Pennsylvania

Abstract

The surge in medical imaging has spurred the development of vision-language models (VLMs) to alleviate radiologist workloads. However, clinical deployment is hindered by the lack of meaningful evaluation frameworks. Current metrics, ranging from semantic similarity to large language model (LLM)-based judges, often fail to distinguish clinically trivial discrepancies from critical ones and so poorly reflect real-world clinical judgment. To address this, we introduce DISCERN (Discordance and Significance-aware Entity-level Radiology Report Comparison), a significance-aware framework that weighs report errors based on their potential impact on patient care. Our results demonstrate that DISCERN, when powered by closed-source LLMs, aligns more closely with expert radiologist assessments than traditional metrics or current LLM evaluators, providing a more interpretable and clinically relevant benchmark. By modeling radiologist prioritization and providing entity-level feedback, DISCERN facilitates targeted model refinement and supports the safer integration of generative AI into clinical workflows.

Topics

radiology and imaging
