Task-Specific NLP Outperforms General LLMs for Lung Nodule Detection in Chest CT Reports
A radiology-trained NLP model significantly outperformed seven general-purpose large language models in extracting incidental lung nodule data from chest CT reports.
Key Details
- A task-specific NLP model ('FiNd') was compared with seven general-purpose LLMs (Gemma, Haiku, Sonnet 2, GPT-4o, DeepSeek, Phi-4, MedGemma).
- FiNd was developed using 21,542 radiology reports and tested on 1,016 chest CT reports.
- Performance was assessed for identifying incidental lung nodules (ILNs) and categorizing them by size (<6 mm, 6–7.9 mm, ≥8 mm).
- FiNd achieved 96.8% accuracy for nodules ≥6 mm and 97.4% for nodules ≥8 mm, outperforming all general LLMs.
- General-purpose LLM accuracy ranged from 77.7% to 88.6% for detection tasks; specificity varied widely among models.
- Researchers urge adaptation of LLMs using radiology-specific training for better clinical integration.
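To make the size categories and the reported metrics concrete, here is a minimal Python sketch. It is not the FiNd implementation: the function names are hypothetical, the 6–7.9 mm bin is assumed to cover every size from 6 mm up to (but not including) 8 mm, and accuracy and specificity use their standard confusion-matrix definitions.

```python
def categorize_nodule(size_mm: float) -> str:
    """Assign a nodule to one of the three size bins named in the summary.

    Assumption: the middle bin covers 6 mm <= size < 8 mm.
    """
    if size_mm < 0:
        raise ValueError("nodule size must be non-negative")
    if size_mm < 6:
        return "<6 mm"
    if size_mm < 8:
        return "6-7.9 mm"
    return ">=8 mm"


def accuracy_and_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Standard definitions computed from confusion-matrix counts.

    accuracy    = (TP + TN) / all reports
    specificity = TN / (TN + FP)
    """
    total = tp + fp + tn + fn
    if total == 0 or (tn + fp) == 0:
        raise ValueError("counts must include at least one negative case")
    return (tp + tn) / total, tn / (tn + fp)
```

The counts passed to `accuracy_and_specificity` would come from comparing a model's extracted label against a reference label for each report; no such counts are given in this summary, so none are shown here.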
Source
AuntMinnie