
Is deep learning ready for abdominal organ-at-risk segmentation in the foundation model era: A comprehensive study of challenging clinical cases.

April 8, 2026 (PubMed)

Authors

Fu J, Li H, Luo Z, He Y, Zhang S, Zou X, Wu Y, Wang G, Liao W

Affiliations (8)

  • Department of Radiation Oncology, Precision Radiation in Oncology Key Laboratory of Sichuan Province, Sichuan Clinical Research Center for Cancer, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, University of Electronic Science and Technology of China, Chengdu, 610041, China; School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
  • School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
  • Department of Radiation Oncology, Anhui Provincial Hospital, University of Science and Technology of China, Hefei, 230061, China.
  • Department of Radiation Oncology, Precision Radiation in Oncology Key Laboratory of Sichuan Province, Sichuan Clinical Research Center for Cancer, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, University of Electronic Science and Technology of China, Chengdu, 610041, China.
  • Clinical Medical Research Center, The First People's Hospital of Kashi (Kashgar) Prefecture, Kashi, 844000, China; Xinjiang Key Laboratory of Artificial Intelligence Assisted Imaging Diagnosis, Kashi (Kashgar), 844000, China.
  • Xinjiang Key Laboratory of Artificial Intelligence Assisted Imaging Diagnosis, Kashi (Kashgar), 844000, China; Department of Hepatobiliary Surgery, The First People's Hospital of Kashi (Kashgar) Prefecture, Kashi, 844000, China.
  • School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China; Shanghai AI Laboratory, Shanghai, 200030, China.
  • Department of Radiation Oncology, Precision Radiation in Oncology Key Laboratory of Sichuan Province, Sichuan Clinical Research Center for Cancer, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, University of Electronic Science and Technology of China, Chengdu, 610041, China. Electronic address: [email protected].

Abstract

The application of deep learning (DL)-based methods for accurate organ-at-risk (OAR) segmentation in challenging clinical scenarios remains unexplored. This study aims to evaluate state-of-the-art fully supervised learning (FSL) methods and foundation model (FM)-based methods across challenging clinical scenarios, and to propose an effective solution to improve model robustness and reduce organ hallucination. We retrospectively collected computed tomography (CT) scans from 413 patients across two institutions, divided into three cohorts based on treatment strategy. Seven FSL and six FM methods were comprehensively evaluated on an internal testing cohort (n=67, without surgery), external testing cohort 2 (n=22, partial organ resection surgery), and external testing cohort 3 (n=74, whole organ resection surgery), as well as on three public datasets. We further introduced an organ erasure augmentation (OEA) strategy to improve generalization and address hallucinations in missing organs. Quantitative metrics included Dice similarity coefficient (DSC), normalized surface Dice (NSD), and hallucination ratio. Two of three fine-tuned FM methods failed to produce any segmentation outputs for 5 and 6 out of 19 organs, respectively. Prompt-based FM methods using tight bounding box prompts demonstrated stable performance but struggled with complex anatomy such as the intestine. Our proposed OEA method outperformed existing FM-based and FSL methods, achieving mean DSC and mean NSD of 87.29% and 87.84% on the internal testing cohort, 85.15% and 85.01% on external testing cohort 2, and 82.29% and 81.81% on external testing cohort 3, respectively. Compared with the best-performing existing method (nnUNet), our method reduced the mean hallucination ratio from 0.571 to 0.516 and demonstrated superior cross-dataset generalization with less performance degradation.
Current FM-based and FSL methods remain insufficient for clinical use in cases involving irregular anatomy or significant distribution shifts. The proposed OEA strategy reduces hallucination and enhances segmentation robustness, offering a promising step toward reliable clinical application.
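
The abstract names the metrics and the augmentation but does not define them, so the following is only an illustrative sketch and not the authors' implementation. It shows the standard Dice similarity coefficient on label maps, plus two pieces that are assumptions: `is_hallucinated` treats a hallucination as an organ that is predicted although it is absent from the ground truth, and `erase_organ` is a hypothetical form of organ erasure that overwrites an organ's voxels with a fill value and relabels them as background. All function names and the `fill_value` parameter are made up for illustration. NSD is omitted because it needs surface distances and a tolerance that the abstract does not specify.

```python
import numpy as np


def dice(pred, gt, label):
    """Dice similarity coefficient for one organ label (standard definition)."""
    p = pred == label
    g = gt == label
    denom = p.sum() + g.sum()
    if denom == 0:
        # Organ absent from both prediction and ground truth: DSC is undefined.
        return float("nan")
    return 2.0 * np.logical_and(p, g).sum() / denom


def is_hallucinated(pred, gt, label):
    """ASSUMED definition: organ is predicted although absent from ground truth."""
    return bool((pred == label).any() and not (gt == label).any())


def erase_organ(image, mask, label, fill_value, background=0):
    """HYPOTHETICAL organ erasure: overwrite the organ's voxels in the image
    with fill_value and relabel them as background. Inputs are not modified."""
    image = image.copy()
    mask = mask.copy()
    region = mask == label
    image[region] = fill_value
    mask[region] = background
    return image, mask


# Toy 2D example with labels 1 and 2 in the ground truth.
gt = np.array([[0, 1, 1],
               [0, 2, 2]])
pred = np.array([[0, 1, 0],
                 [3, 2, 2]])
image = np.arange(6, dtype=float).reshape(2, 3)

print(dice(pred, gt, 1))             # overlap of 1 voxel, sizes 1 and 2
print(is_hallucinated(pred, gt, 3))  # label 3 is predicted but not in gt
aug_image, aug_mask = erase_organ(image, gt, label=2, fill_value=-100.0)
```

Under these assumptions, training on pairs such as `aug_image` and `aug_mask` would expose a model to scans in which an organ is missing, which is the situation the resection cohorts represent.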

Topics

Journal Article
