Large Language Models Rival Physicians in Complex Lung Cancer Decisions

A real-world study reveals that large language models (LLMs) can match or exceed human physicians' performance in challenging lung cancer case decision-making, especially for rare cases.
Key Details
- 150 challenging lung cancer cases (complex, rare, refractory) were evaluated using blinded, multidimensional scoring by experts.
- LLMs reviewed: DeepSeek R1, Claude 3.5, Gemini 1.5, and GPT-4o; physician decisions stratified by experience; some juniors received AI assistance.
- DeepSeek R1 performed between intermediate and senior physicians overall; LLMs outperformed intermediates in rare cases but lagged in refractory (longitudinal) cases.
- AI-augmented junior physicians saw 80-90% boosts in comprehensiveness and specificity for rare cases, but specificity slightly dropped for refractory cases.
- Error profiling showed LLMs are strong in knowledge breadth/updates, while physicians excel in longitudinal reasoning and stability.
Why It Matters

Source
EurekAlert