Natural Language Processing of Large Numbers of Radiology Reports in a Public Health System to Extract Structured Data, With a Test Case of CT KUB.

April 15, 2026


DOI: 10.1111/1754-9485.70103 PMID: 41983585

Authors

Birry M, Roach C, Hunter-Philpott DA, Doyle AJ

Affiliations (3)

Department of Anatomy and Medical Imaging, University of Auckland, Auckland, New Zealand.
Amazon Web Services New Zealand, Auckland, New Zealand.
Waitemata District Health Board, Auckland, New Zealand.

Abstract

Natural language processing (NLP) was used to extract structured information from large numbers of radiology reports, with the aim of showing the feasibility of this approach for system monitoring, clinical research and practice improvement. In total, 220,000 consecutive radiology reports were processed using an NLP pipeline (Radiology Text Analysis, or RATA). The indications, modality, technique, anatomy and findings were mapped to SNOMED CT codes. A subset of 941 reports identified as CT-KUB was analysed to examine the pipeline's performance in detecting renal tract stones (RTS), compared with a manual reference standard. The Fisher exact test and the Cohen kappa statistic were applied. Compared with the reference standard, RATA had an accuracy of 95%, sensitivity of 94%, specificity of 97%, positive predictive value of 98% and negative predictive value of 91%, with a kappa statistic of 0.9. Subanalysis showed that, of 366 females, 50% had negative RTS diagnoses, while only 32% of 566 males had negative RTS diagnoses (two-tailed p < 0.00001). The RATA pipeline has acceptable performance in extracting structured data from large numbers of radiology reports. Clinically relevant information, such as variations in use, can be uncovered.
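For readers less familiar with the performance measures quoted above, the following is a minimal sketch of how accuracy, sensitivity, specificity, predictive values and Cohen's kappa are conventionally derived from a 2x2 comparison of a classifier against a reference standard. It is not the authors' code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts in the usage example are illustrative only, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 counts of pipeline output versus a manual reference standard."""
    tp: int  # pipeline positive, reference positive
    fp: int  # pipeline positive, reference negative
    fn: int  # pipeline negative, reference positive
    tn: int  # pipeline negative, reference negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return standard diagnostic accuracy measures and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / n
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance given each rater's marginal frequencies.
    p_observed = accuracy
    p_both_positive = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_both_negative = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_expected = p_both_positive + p_both_negative
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only (assumption); not taken from the study.
    example = ConfusionCounts(tp=40, fp=10, fn=5, tn=45)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Because kappa discounts chance agreement, it can sit noticeably below raw accuracy when one class dominates, which is one reason to report both.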


Topics

Journal Article
