From text to tables: Zero-shot extraction of structured clinical data from free-text CT scan reports using foundational large language models

October 7, 2025

preprint

DOI: 10.1101/2025.10.05.25337366

Authors

Hongslo, A., Gupta, A., Nguyen, Q., Caldwell, J., Choi, B., Harvey, C. J., Thompson, J., Mazzotti, D., Yao, Z., Noheria, A.

Affiliations (1)

Department of Cardiovascular Medicine, University of Kansas Medical Center, Kansas City, KS

Abstract

Background: Large language models (LLMs) are being explored for multiple applications in medical research, including medical text classification. We evaluate the performance of 5 off-the-shelf LLMs for classifying free-text CT angiography reports for pulmonary embolism (PE)-related diagnostic labels.

Methods: We assessed 1,025 manually labeled CT reports using 5 LLMs (ChatGPT-4o, Llama 3.3 70b, Llama 3.1 8b, Llama 3.2 3b, Mixtral 8x7b) with zero-shot prefix prompts. Labels included acute PE, bilateral PE, and large PE. Voting ensemble models combining multiple LLM outputs were also tested.

Results: Llama 3.3 70b and ChatGPT-4o outperformed smaller models for all classification tasks. Highest accuracies were 96.6% (acute PE), 92.7% (bilateral PE), and 82.6% (large PE). Voting ensemble models offered no or minimal improvement in classification performance.

Conclusions: Off-the-shelf LLMs, particularly larger ones, can classify free-text reports with high accuracy using simple prompts. Further work is needed to optimize prompting strategies and evaluate hybrid approaches.
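The abstract does not say how the voting ensembles combined model outputs. As a minimal sketch, assuming simple majority voting over per-report labels, the idea might look like the Python below. The model names, label strings, tie-breaking rule, and all data are hypothetical placeholders for illustration, not values from the study.

```python
from collections import Counter


def majority_vote(labels, tie_break="negative"):
    """Return the most frequent label; on a tie, fall back to tie_break.

    The tie-breaking rule is an assumption made for this sketch.
    """
    if not labels:
        raise ValueError("need at least one label to vote on")
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break
    return counts[0][0]


def accuracy(predicted, reference):
    """Fraction of predictions that match the manually assigned labels."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical outputs for four reports from three placeholder models.
model_outputs = {
    "model_a": ["positive", "negative", "positive", "negative"],
    "model_b": ["positive", "negative", "negative", "negative"],
    "model_c": ["negative", "negative", "positive", "positive"],
}
# Hypothetical manual labels for the same four reports.
reference = ["positive", "negative", "positive", "negative"]

# Vote report by report: zip(*...) groups the models' labels per report.
ensemble = [majority_vote(list(votes)) for votes in zip(*model_outputs.values())]

for name, preds in model_outputs.items():
    print(name, accuracy(preds, reference))
print("ensemble", accuracy(ensemble, reference))
```

In this toy data the ensemble can only match its best member, not beat it, which is in line with the abstract's finding that voting gave no or minimal improvement. A voting ensemble helps only when the models' errors fall on different reports.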


Topics

health informatics
