Large Language Models Rival Physicians in Complex Lung Cancer Decisions

A real-world study finds that large language models (LLMs) can match or exceed physicians' performance in decision-making for challenging lung cancer cases, particularly rare ones.
Key Details
- 150 challenging lung cancer cases (complex, rare, refractory) were evaluated using blinded, multidimensional scoring by experts.
- LLMs reviewed: DeepSeek R1, Claude 3.5, Gemini 1.5, and GPT-4o; physician decisions stratified by experience; some juniors received AI assistance.
- DeepSeek R1 performed between intermediate and senior physicians overall; LLMs outperformed intermediates in rare cases but lagged in refractory (longitudinal) cases.
- AI-augmented junior physicians saw 80-90% boosts in comprehensiveness and specificity for rare cases, but specificity slightly dropped for refractory cases.
- Error profiling showed LLMs are strong in knowledge breadth/updates, while physicians excel in longitudinal reasoning and stability.
Source
EurekAlert