ModernBERT is more efficient than conventional BERT for chest CT findings classification in Japanese radiology reports.

April 3, 2026

Authors

Yamagishi Y, Kikuchi T, Hanaoka S, Yoshikawa T, Abe O

Affiliations (5)

  • Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-Ku, Tokyo, 113-8655, Japan.
  • Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, Tokyo, Japan.
  • Department of Radiology, School of Medicine, Jichi Medical University, Shimotsuke, Tochigi, Japan.
  • Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-Ku, Tokyo, 113-8655, Japan.
  • Department of Radiology, The University of Tokyo Hospital, Tokyo, Japan.

Abstract

Japanese language models for medical text classification face challenges with complex vocabulary and linguistic structures in radiology reports. This study compared three Japanese models (BERT Base, JMedRoBERTa, and ModernBERT) for multi-label classification of 18 chest CT findings. Using the CT-RATE-JPN dataset, all models were fine-tuned under identical conditions. ModernBERT showed clear efficiency advantages, producing substantially fewer tokens and achieving faster training and inference than the other models while maintaining comparable performance on the internal test dataset (exact match accuracy: 74.7% vs. 72.7% for BERT Base). To assess generalizability, we additionally constructed RR-Findings, an external dataset of 243 naturally written Japanese radiology reports annotated using the same schema. Under this domain-shifted setting, performance differences became pronounced: BERT Base outperformed both JMedRoBERTa and ModernBERT, whereas ModernBERT showed the largest decline in exact match accuracy. Average precision differences were smaller, indicating that ModernBERT retained reasonable ranking ability despite reduced calibration. Overall, ModernBERT offers substantial computational efficiency and strong in-domain performance but remains sensitive to real-world linguistic variability. These results highlight the need for more diverse natural-language training data and domain-specific calibration strategies to improve robustness when deploying modern transformer models in heterogeneous clinical environments.
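The contrast the abstract draws between exact match accuracy and average precision can be illustrated with a small sketch. This is not the authors' evaluation code: the toy labels and scores, the three-finding label set, the 0.5 decision threshold, and all function names are assumptions chosen for illustration. It shows how uniformly deflated scores (a calibration problem) can collapse exact match accuracy while leaving the per-finding ranking, and therefore average precision, unchanged.

```python
from typing import List, Sequence


def exact_match_accuracy(y_true: Sequence[Sequence[int]],
                         y_prob: Sequence[Sequence[float]],
                         threshold: float = 0.5) -> float:
    """Fraction of reports whose entire label vector is predicted correctly.

    The 0.5 threshold is an assumption for this sketch.
    """
    hits = 0
    for truth, probs in zip(y_true, y_prob):
        pred = [1 if p >= threshold else 0 for p in probs]
        if pred == list(truth):
            hits += 1
    return hits / len(y_true)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one finding; depends only on score ranking."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    positives = sum(labels)
    if positives == 0:
        raise ValueError("average precision is undefined without positives")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each positive hit
    return total / positives


def macro_average_precision(y_true: Sequence[Sequence[int]],
                            y_prob: Sequence[Sequence[float]]) -> float:
    """Mean of per-finding average precision across all label columns."""
    n_labels = len(y_true[0])
    per_label: List[float] = []
    for j in range(n_labels):
        labels = [row[j] for row in y_true]
        scores = [row[j] for row in y_prob]
        per_label.append(average_precision(labels, scores))
    return sum(per_label) / n_labels


# Hypothetical toy data: 5 reports, 3 findings (not from the paper).
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 0]]
in_domain = [[0.9, 0.1, 0.2], [0.2, 0.8, 0.1], [0.7, 0.9, 0.3],
             [0.1, 0.2, 0.6], [0.1, 0.2, 0.1]]
# Simulated miscalibration: every score halved, ranking left intact.
shifted = [[p * 0.5 for p in row] for row in in_domain]

em_in = exact_match_accuracy(y_true, in_domain)
em_shift = exact_match_accuracy(y_true, shifted)
ap_in = macro_average_precision(y_true, in_domain)
ap_shift = macro_average_precision(y_true, shifted)
print(f"exact match: {em_in:.2f} -> {em_shift:.2f}")
print(f"macro AP:    {ap_in:.2f} -> {ap_shift:.2f}")
```

In this toy setting, exact match accuracy falls because no halved score clears the fixed threshold, whereas average precision is unaffected because the ordering of reports within each finding is the same. Real domain shift is messier than a uniform rescaling, so this is only a simplified picture of why the two metrics can diverge.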

Topics

Journal Article
