Reinforcement learning improves LLM accuracy and reasoning in disease classification from radiology reports.

April 30, 2026

DOI: 10.1038/s41746-026-02685-4 PMID: 42062541

Authors

Wei Y, Lin Y, Flanders A, Shih G, Peng Y

Affiliations (5)

Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.
Department of Radiology, Weill Cornell Medicine, New York, NY, USA.
Department of Radiology, Thomas Jefferson University, Philadelphia, PA, USA.
Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.
Department of Radiology, Weill Cornell Medicine, New York, NY, USA.

Abstract

Accurate disease classification from radiology reports is essential for many applications. While supervised fine-tuning (SFT) of lightweight LLMs improves accuracy, it can degrade reasoning. We propose a two-stage approach: SFT on disease labels followed by Group Relative Policy Optimization (GRPO) to refine predictions by optimizing accuracy and format without reasoning supervision. Across three radiologist-annotated datasets, SFT outperformed baselines and GRPO further improved classification and enhanced reasoning recall and comprehensiveness.
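
The GRPO stage described above scores each sampled completion on accuracy and format, then compares completions within a group sampled for the same report. The following is a minimal sketch of that idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the `<think>`/`<answer>` output template, the Jaccard-overlap accuracy score, the equal weighting of the two rewards, and the use of the population standard deviation. Only the group-relative normalization (reward minus group mean, divided by group standard deviation) follows the general GRPO formulation.

```python
import re
from statistics import mean, pstdev
from typing import List, Optional, Set

# Hypothetical output template: free-text reasoning in <think>,
# comma-separated disease labels in <answer>. Not taken from the paper.
TEMPLATE = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)

def parse_labels(completion: str) -> Optional[Set[str]]:
    """Return the predicted label set, or None if the template is violated."""
    m = TEMPLATE.match(completion.strip())
    if m is None:
        return None
    return {x.strip().lower() for x in m.group(2).split(",") if x.strip()}

def format_reward(completion: str) -> float:
    """1.0 if the completion follows the assumed template, else 0.0."""
    return 1.0 if parse_labels(completion) is not None else 0.0

def accuracy_reward(completion: str, gold: Set[str]) -> float:
    """Jaccard overlap between predicted and gold labels (assumed metric).

    The reasoning text itself is never scored, mirroring the abstract's
    'without reasoning supervision'.
    """
    pred = parse_labels(completion)
    if pred is None:
        return 0.0
    gold = {g.lower() for g in gold}
    if not pred and not gold:
        return 1.0  # correctly predicted "no disease"
    return len(pred & gold) / len(pred | gold)

def total_reward(completion: str, gold: Set[str]) -> float:
    """Sum of accuracy and format rewards (equal weights are an assumption)."""
    return accuracy_reward(completion, gold) + format_reward(completion)

def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Usage: three hypothetical completions sampled for one report.
gold_labels = {"pneumonia"}
group = [
    "<think>Right lower lobe opacity.</think><answer>pneumonia</answer>",
    "pneumonia",
    "<think>Opacity and effusion.</think><answer>pneumonia, edema</answer>",
]
rewards = [total_reward(c, gold_labels) for c in group]
advantages = group_relative_advantages(rewards)
```

In this sketch, the well-formatted correct completion receives the largest advantage and the unformatted one the smallest, so a policy update would shift probability toward outputs that are both correctly labeled and correctly structured.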

Topics

Journal Article