Task-Specific NLP Outperforms General LLMs for Lung Nodule Detection in Chest CT Reports
A radiology-trained NLP model significantly outperformed several general-purpose large language models in extracting incidental lung nodule data from chest CT reports.
Key Details
- Task-specific NLP model ('FiNd') was compared with seven general-purpose LLMs (Gemma, Haiku, Sonnet 2, GPT-4o, DeepSeek, Phi-4, MedGemma).
- FiNd was developed using 21,542 radiology reports and tested on 1,016 chest CT reports.
- Performance was assessed for identifying incidental lung nodules (ILNs) and categorizing them by size (<6 mm, 6–7.9 mm, ≥8 mm).
- FiNd achieved 96.8% accuracy for nodules ≥6 mm and 97.4% for nodules ≥8 mm, outperforming all general LLMs.
- General-purpose LLM accuracy ranged from 77.7% to 88.6% for detection tasks; specificity varied widely among models.
- Researchers urge adaptation of LLMs using radiology-specific training for better clinical integration.
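
For readers unfamiliar with the metrics above, here is a minimal Python sketch of how the three size bins and the accuracy and specificity figures could be computed. It is not the study's code. The function names and the example labels are hypothetical, and the sketch assumes the bins are contiguous, so any diameter from 6 mm up to but not including 8 mm falls in the middle bin.

```python
def size_category(diameter_mm: float) -> str:
    """Map a nodule diameter in mm to one of the three size bins.

    Assumption: bins are contiguous, so 7.95 mm counts as "6-7.9 mm".
    """
    if diameter_mm < 6:
        return "<6 mm"
    if diameter_mm < 8:
        return "6-7.9 mm"
    return ">=8 mm"


def accuracy_and_specificity(y_true, y_pred):
    """Return (accuracy, specificity) for paired boolean labels.

    y_true: reference labels (True = report mentions a qualifying nodule).
    y_pred: model outputs for the same reports.
    Specificity is None when there are no negative reference labels.
    """
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one labelled report is required")
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    accuracy = (tp + tn) / len(pairs)
    specificity = tn / (tn + fp) if (tn + fp) else None
    return accuracy, specificity


# Hypothetical toy labels for four reports, not data from the study.
reference = [True, True, False, False]
predicted = [True, False, False, True]
print(size_category(7.2))
print(accuracy_and_specificity(reference, predicted))
```

Accuracy counts every correct call, while specificity looks only at reports without a qualifying nodule. That is why models with similar accuracy can still differ widely in specificity.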
Why It Matters

Source
AuntMinnie